## A first stab at BaseN encoding with a focus on general alphabet encoding.

The comments in the code-only article are fairly decent, but I dislike commenting too verbosely because then I can't see my code. Since the comments are sparse, a little explanation of the problem is probably in order. First, what is base N encoding, or alphabet encoding?

Most people assume that encoding into any base equates to mapping a number to the usual digits, plus some additional characters to represent values we don't have digits for. This isn't always the case. An integer encoded as Alphabet{0,1} = 1001 = 9 decimal is identical to Alphabet{+,-} = -++- = 9 decimal. I've just changed the representation, or alphabet, but the base is still the same (aka base 2).
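To make the point concrete, here is a minimal Python sketch (not the code from the code-only article) that reads the same base 2 number under both alphabets. The function name `value_of` is made up for illustration, and the alphabet is passed as a plain string standing in for a character array.

```python
def value_of(text, alphabet):
    """Interpret text as a number written with the given alphabet.

    The position of each character in the alphabet is that character's
    digit value; the base is simply the alphabet's length.
    """
    base = len(alphabet)
    value = 0
    for ch in text:
        # str.index raises ValueError if ch is not in the alphabet.
        value = value * base + alphabet.index(ch)
    return value


# Same base (2), two different alphabets, same integer.
print(value_of("1001", "01"))
print(value_of("-++-", "+-"))
```

Both calls yield the same integer, because only the symbols differ while the digit values and the base stay the same.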

Explaining bases could take a few years of college courses, as you take the concepts and create increasingly abstract versions of them. In fact, bases are strange things in some theoretical maths where concepts of groups, colors, stripes, and other words are used to describe how they work. A very simplistic view of the base is available over on MathWorld. In general though, the concept is that any base has a number of digits equal to the base number b (aka radix), where the digits represent the values 0 through b-1. That is easy enough, and it gives us a very generic method for converting a number to any alphabet and back.
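Written out as standard positional notation, with the symbols $d_i$ (the digit values) and $k$ (the number of digits) introduced here purely for illustration, a non-negative integer $n$ in base $b$ is:

$$
n = \sum_{i=0}^{k-1} d_i \, b^{i}, \qquad 0 \le d_i \le b - 1
$$

For the base 2 example above, the digit values 1, 0, 0, 1 give $1 \cdot 2^3 + 0 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 = 9$, whichever characters are used to write those digits.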

To start, we'll denote an alphabet as a char[] of digits. A digit in this sense is any character, and it represents the array index at which it is placed. The base of the alphabet is the length of the character array. The first element in the array, at index 0, has a value of 0, and for every other index n the digit at n has the value n. The characters need to be distinct, otherwise the mapping can't be reversed. That's all there is to it. Any alphabet of characters can now be translated to and from an integer using this mapping table and the base.
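As a rough sketch of that mapping in both directions, the Python below encodes by repeated division and decodes by looking each character up in the table. It is an illustration under my own assumptions, not the code from the code-only article: the names `encode_base_n` and `decode_base_n` are hypothetical, the alphabet is a string rather than a char[], and only non-negative integers are handled.

```python
import string


def encode_base_n(value, alphabet):
    """Write a non-negative integer using the characters of alphabet."""
    base = len(alphabet)
    if base < 2:
        raise ValueError("alphabet needs at least two characters")
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if value == 0:
        return alphabet[0]
    digits = []
    while value > 0:
        # The remainder is the array index of the next (least significant) digit.
        value, index = divmod(value, base)
        digits.append(alphabet[index])
    return "".join(reversed(digits))


def decode_base_n(text, alphabet):
    """Turn text written in alphabet back into an integer."""
    base = len(alphabet)
    # Reverse mapping table: character -> array index (its digit value).
    lookup = {ch: index for index, ch in enumerate(alphabet)}
    value = 0
    for ch in text:
        value = value * base + lookup[ch]  # KeyError if ch is not a digit
    return value


# A base 36 alphabet: 0-9 followed by A-Z.
BASE36 = string.digits + string.ascii_uppercase

encoded = encode_base_n(2004, BASE36)
print(encoded, decode_base_n(encoded, BASE36))
```

Because the base is taken from the alphabet's length, the same two functions cover base 2 through base 36 (or anything else) just by swapping the alphabet.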

Code-Only: Arbitrary alphabet encoding (aka BaseN encoding) for base2 through base36.

Published Sunday, November 07, 2004 3:45 PM by Justin Rogers
