Quick Reference : Base64 and UUEncode

Base64 and uuencode allow us to send arbitrary binary data through systems that only allow plain ASCII (e.g. email as specified by RFC 822).

 

Base64

 

It takes 3 bytes and converts them into 4 printable, human-readable ASCII characters. It does that by first splitting the 3 bytes (24 bits) into 4 groups of 6 bits each and then using an encoding table to convert each value to a character. Why 6 bits and not the full 7 bits of ASCII? Because Base64 only uses printable, human-readable ASCII characters for encoding, and there are fewer than 128 of those.

 

Since Base64 uses only 6 bits per character, the total number of possible values is 2^6 = 64, hence the name Base64.

 

E.g.

Suppose we need to convert the following 5 bytes 220,230,210,255,240

 

1. Convert to Binary

 

220 -> 11011100

 

230 -> 11100110

 

210 -> 11010010

 

255 -> 11111111

 

240 -> 11110000

 

Thus the sequence of bits looks like

 

1101110011100110110100101111111111110000

 

 

2. Group into groups of 6 bits each

 

110111

001110

011011

010010

111111

111111

0000?? <- Not right

 

So we run into a problem: we are trying to encode a sequence of 5 bytes, which is not a multiple of 3, so the bits do not divide evenly into groups of 6. Base64 solves this by padding the input with zero bytes until its length is a multiple of 3.

 

Thus in our case our original sequence will now look like this

 

 220, 230, 210, 255, 240, 0

 

And our sequence of bits is going to look like this

 

110111001110011011010010111111111111000000000000

 

And our grouping now correctly becomes

 

110111 -> 55

001110 -> 14

011011 -> 27

010010 -> 18

111111 -> 63

111111 -> 63

000000 -> 0

000000 -> 0
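
Steps 1 and 2 can be sketched in a few lines of code. Python is used here purely for illustration, and the function name six_bit_groups is made up for this example; it zero-pads the input to a multiple of 3 bytes and then peels four 6-bit values off each 24-bit chunk.

```python
def six_bit_groups(data: bytes) -> list:
    """Split data into 6-bit values, zero-padding to a multiple of 3 bytes."""
    # Number of zero bytes needed to reach a multiple of 3 (0, 1 or 2).
    padded = data + b"\x00" * (-len(data) % 3)
    groups = []
    for i in range(0, len(padded), 3):
        # Pack three bytes into one 24-bit integer.
        chunk = int.from_bytes(padded[i:i + 3], "big")
        # Take four 6-bit values, most significant first.
        groups.extend((chunk >> shift) & 0x3F for shift in (18, 12, 6, 0))
    return groups


print(six_bit_groups(bytes([220, 230, 210, 255, 240])))
# [55, 14, 27, 18, 63, 63, 0, 0]
```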

 

 

3. Convert the values to characters

 

For that we need to look at the Base64 encoding table

 

Value  Encoding    Value  Encoding    Value  Encoding    Value  Encoding
0      A           16     Q           32     g           48     w
1      B           17     R           33     h           49     x
2      C           18     S           34     i           50     y
3      D           19     T           35     j           51     z
4      E           20     U           36     k           52     0
5      F           21     V           37     l           53     1
6      G           22     W           38     m           54     2
7      H           23     X           39     n           55     3
8      I           24     Y           40     o           56     4
9      J           25     Z           41     p           57     5
10     K           26     a           42     q           58     6
11     L           27     b           43     r           59     7
12     M           28     c           44     s           60     8
13     N           29     d           45     t           61     9
14     O           30     e           46     u           62     +
15     P           31     f           47     v           63     /

 

 

From this we get

 

55 -> 3

14 -> O

27 -> b

18 -> S

63 -> /

63 -> /

0  -> A

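0  -> =  (padding)

 

Note: '=' is not part of the 64-character table. It replaces a 6-bit group that consists entirely of padding bits. The seventh group is also 0, but it still carries the last four bits of the byte 240, so it is encoded normally as 'A'.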

 

Thus our initial sequence of 5 bytes is now Base64 encoded as

 

3ObS//A=
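
Putting all three steps together, a minimal encoder might look like the sketch below. It is written in Python for illustration only; b64_encode and ALPHABET are names invented for this example, and the result is compared against Python's standard base64 module as a sanity check.

```python
import base64
import string

# The 64-character alphabet from the table: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def b64_encode(data: bytes) -> str:
    pad = -len(data) % 3                  # zero bytes to append: 0, 1 or 2
    padded = data + b"\x00" * pad
    out = []
    for i in range(0, len(padded), 3):
        chunk = int.from_bytes(padded[i:i + 3], "big")
        out.extend(ALPHABET[(chunk >> shift) & 0x3F] for shift in (18, 12, 6, 0))
    # Groups made up only of padding bits are written as '=' instead of 'A'.
    if pad:
        out[-pad:] = "=" * pad
    return "".join(out)


sample = bytes([220, 230, 210, 255, 240])
print(b64_encode(sample))                                       # 3ObS//A=
print(b64_encode(sample) == base64.b64encode(sample).decode())  # True
```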

 

The .NET Framework exposes this functionality primarily via the

 

Convert.ToBase64String() and

Convert.ToBase64CharArray() methods

 

So this line of code

 

System.Console.WriteLine(System.Convert.ToBase64String(new byte[] { 220, 230, 210, 255, 240 }));

 

would print

 

3ObS//A=

 

 

UUEncode

 

Uuencode stands for Unix-to-Unix encoding. It was the predominant binary-to-text encoding before Base64 and MIME. I searched Google for the disadvantages of uuencode, and it seems that it depended on the code page of the current locale to encode the data. If data was transferred between systems with identical code pages it worked fine, but it broke when the two systems used different code pages. The assumption was that the gateways doing the content transfer would take care of the conversion, but that was patchy at best. (I am not too sure of this, since it was before my time.)

 

Uuencode encoding is quite similar to Base64. You first convert 3 bytes into 4 by splitting them into groups of 6 bits each and turning each 6-bit group into a byte by adding 2 zero bits to the front. Next you add 32 to bring each byte into the printable range of 32 to 95, and then you map each byte to a character using the standard ASCII table. The padding byte used in this example is 1 (0x01); other implementations pad with zero bits instead (Python's binascii module, for one).

 

E.g.

 

Suppose we want to encode a file named test.dat, which happens to contain only one byte 254

 

1. Convert to Binary

 

254 = 11111110

 

2. Group into groups of 6-bits each

 

Add 2 bytes of padding to round off input to multiple of 3 (Padding is 0x01)

 

Thus input bit stream becomes

 

11111110 00000001 00000001

 

Thus our grouping becomes

 

111111

100000

000100

000001

 

3. Add two zero bits to get full byte

 

111111 -> 00111111 = 63

100000 -> 00100000 = 32

000100 -> 00000100 = 4

000001 -> 00000001 = 1

 

4. Add 32

 

Thus our bytes now become

 

95,64,36,33

 

5. Encode using standard ASCII table

 

 

Value  Encoding    Value  Encoding    Value  Encoding    Value  Encoding
32     (space)     48     0           64     @           80     P
33     !           49     1           65     A           81     Q
34     "           50     2           66     B           82     R
35     #           51     3           67     C           83     S
36     $           52     4           68     D           84     T
37     %           53     5           69     E           85     U
38     &           54     6           70     F           86     V
39     '           55     7           71     G           87     W
40     (           56     8           72     H           88     X
41     )           57     9           73     I           89     Y
42     *           58     :           74     J           90     Z
43     +           59     ;           75     K           91     [
44     ,           60     <           76     L           92     \
45     -           61     =           77     M           93     ]
46     .           62     >           78     N           94     ^
47     /           63     ?           79     O           95     _

 

 

95 -> _

64 -> @

36 -> $

33 -> !
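
The five steps above can be sketched as a single function. This is an illustration in Python, not a reference implementation: uu_encode_group is a made-up name, it handles one group of 1 to 3 bytes only, and it uses the 0x01 padding byte from this example as its default (an assumption taken from the walkthrough, not a universal rule).

```python
def uu_encode_group(data: bytes, pad_byte: int = 0x01) -> str:
    """Encode one group of 1 to 3 bytes as 4 printable characters."""
    if not 1 <= len(data) <= 3:
        raise ValueError("expected 1 to 3 bytes")
    # Step 2: pad the group to exactly 3 bytes.
    padded = data + bytes([pad_byte]) * (3 - len(data))
    chunk = int.from_bytes(padded, "big")
    # Steps 2 and 3: four 6-bit values, each held in its own byte.
    values = [(chunk >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    # Steps 4 and 5: add 32 and look the result up in ASCII.
    return "".join(chr(v + 32) for v in values)


print(uu_encode_group(bytes([254])))  # _@$!
```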

 

That’s basically how uuencode works. There are a couple of things to note about the output file format:

 

1. The first line will be ‘begin <unix file access mode> <filename>’. Two extra lines mark the end of the file: the second-to-last line contains the single byte 0x20 (a space), and the last line contains ‘end’.

 

2. The first character of each line is a count of the number of bytes encoded in that line. E.g. if a line has the full 60 encoded characters (a limit imposed by early Unix email clients), then 60*3/4 = 45 bytes have been encoded. Add 32 to 45 (to get a printable character) and we get 77 = M, thus you will find that all lines except the last begin with M.
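
The arithmetic behind the length character is small enough to check directly. The snippet below is a sketch in Python; length_char and BYTES_PER_LINE are names chosen for this example only.

```python
# 60 encoded characters carry 60 * 3 / 4 = 45 bytes of original data.
BYTES_PER_LINE = 60 * 3 // 4


def length_char(n_bytes: int) -> str:
    """Return the character that starts a line encoding n_bytes bytes."""
    return chr(32 + n_bytes)


print(BYTES_PER_LINE)               # 45
print(length_char(BYTES_PER_LINE))  # M
print(length_char(1))               # !
```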

 

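Thus our file will be encoded as follows (the leading '!' on the data line is the length character for one byte: 1 + 32 = 33 = '!')

 

begin 664 test.dat

!_@$!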

 

end
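
For completeness, here is a rough sketch of a function that produces the whole file. It is written in Python for illustration, the names uu_encode_lines and uu_encode_file are invented for this example, and it assumes the 0x01 padding byte used in this post. Note that it emits the length character from point 2 at the start of every data line, so the single byte 254 appears as !_@$!.

```python
def uu_encode_lines(data: bytes, pad_byte: int = 0x01) -> list:
    """Return the encoded data lines, each prefixed with its length character."""
    lines = []
    for start in range(0, len(data), 45):     # 45 bytes become 60 characters
        row = data[start:start + 45]
        body = []
        for i in range(0, len(row), 3):
            group = row[i:i + 3]
            group += bytes([pad_byte]) * (3 - len(group))
            chunk = int.from_bytes(group, "big")
            body.extend(chr(((chunk >> s) & 0x3F) + 32) for s in (18, 12, 6, 0))
        lines.append(chr(32 + len(row)) + "".join(body))
    return lines


def uu_encode_file(name: str, mode: str, data: bytes) -> str:
    lines = ["begin %s %s" % (mode, name)]
    lines += uu_encode_lines(data)
    lines += [" ", "end"]     # zero-length line (0x20), then the terminator
    return "\n".join(lines) + "\n"


print(uu_encode_file("test.dat", "664", bytes([254])), end="")
```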

 

 

In the .NET Framework (System.Web.Mail), you can choose to encode your mail attachments using uuencode by specifying the MailEncoding.UUEncode value as the encoding of the MailAttachment object.

 
