George V. Reilly's Technical Blog

A Use for Octal

I've been programming in C since 1985 and C++ since 1991, but I've never found a use for octal representation until now, aside from the permissions argument for chmod. I've done a lot of bit twiddling: hex has been indispensable and I've often wished for a built-in notation for binary constants (WinDbg uses 0y to denote binary literals and 0n to denote decimal literals (when the radix is not ten)). Octal, however, has always seemed as vestigial as a human appendix.

The other day, a mathematician friend described to me a problem that he had solved at a previous company. They were designing hardware that emulated some old 36-bit computers. For backward compatibility, the various shift instructions had to accept an arbitrarily large shift count and shift left or right by (count mod 36). Now, divisions are not cheap to implement in hardware, so they needed to come up with an alternate approach to calculate the modulus.

My friend tried to do something with the factors of 36: 4 and 9. Four and nine are relatively prime: they have no common factors. By the Chinese Remainder Theorem therefore, the combination of x mod 4 and x mod 9 is enough to uniquely determine x mod 36. By inspection, this is true for the following table.

9 \ 4 0 1 2 3
0 0 9 18 27
1 28 1 10 19
2 20 29 2 11
3 12 21 30 3
4 4 13 22 31
5 32 5 14 23
6 24 33 6 15
7 16 25 34 7
8 8 17 26 35

Mod 4 is easy: it's the two least-significant bits. Mod 9 is not so obvious.

Casting Out Nines

There's an old trick for checking the results of arithmetic operations, known as casting out nines or dropping nines. Add up the decimal digits of each number, modulo 9. They should be congruent.

For example, 1,234 * 5,678 = 7,006,652.

To check, take the sum of the digits of each number. 1+2+3+4 = 10; 5+6+7+8 = 26; 7+0+0+6+6+5+2 = 26.

Take the first two sums, modulo 9, and multiply them: 10 = 1 (mod 9); 26 = 8 (mod 9); 1 * 8 = 8 (mod 9). Check against the sum of the digits of the product, 26: 26 = 8 (mod 9).

This works because 10N = 1 (mod 9). A number such as 321 can be expressed as 3 * 100 + 2 * 10 + 1, or 3 * (9+1) * (9+1) + 2 * (9 + 1) + 1, which is 3 * (9*9 + 2*9 + 1) + 2 * (9 + 1) + 1. Dropping the nines from each term leaves 3 * 1 + 2 * 1 + 1 = 3 + 2 + 1 = 6.

Congruences have a number of useful properties, such as a * b = ( (a mod m) * (b mod m) ) (mod m).

Let's use 11, instead of 9. Since 10 = 11 -1, then 10N = -1N (mod 11). Modulo, by convention, yields a non-negative remainder, but it's convenient to have a negative one here. 100 = (9 * 11 + 1) = 1 (mod 11). Or, 100 = 10 * 10 = (-1 (mod 11)) * (-1 (mod 11)) = 1 (mod 11).

So, 321 = (3 * 1 + 2 * -1 + 1) (mod 11) = 2 (mod 11).

See Divisibility Tests for more examples.

Mod 9

We turn to base 8, octal. Nine bears the same relationship to eight, as eleven does to ten: base plus one. 910 = 118 or 0n9 = 011.

We can calculate mod 9 in base 8 by alternately adding and subtracting the octal digits. 01234 mod 9 = 4 - 3 + 2 - 1 = 2. This gives the right answer. Here's a simple algorithm:

    unsigned
    Mod9(
        unsigned u)
    {
        int m = 0;
        int sign = +1;

        for (unsigned t = u;  0 != t;  t >>= 3)
        {
            unsigned l = t & 07;
            m += sign * l;
            sign = -sign;
        }

        return (unsigned) m;
    }
What about 0607? 7 - 0 + 6 = 13. 13 mod 9 = 4. And 06070? 0 - 7 + 0 - 6 = -13. -13 mod 9 = 5. If the sum dips below zero, or rises above eight, we have to adjust by nine repeatedly, until the sum is in the range [0,9).

Here's a complete algorithm:

    unsigned
    Mod9(
        unsigned u)
    {
        int m = 0;
        bool negative = false;

        for (unsigned t = u;  0 != t;  t >>= 3)
        {
            unsigned l = t & 07;

            if (negative)
            {
                m -= l;
                if (m < 0)
                    m += 9;
            }
            else
            {
                m += l;
                if (m >= 9)
                    m -= 9;
            }
            ASSERT(0 <= m  &&  m < 9);

            negative = !negative;
        }

        return (unsigned) m;
    }

We now have enough to calculate modulo 36:

    unsigned
    Mod4(
        unsigned u)
    {
        return (u & 3);
    }

    unsigned
    Mod36(
        unsigned u)
    {
        static const unsigned char Residues[9][4] = {
            { 0,  9, 18, 27},
            {28,  1, 10, 19},
            {20, 29,  2, 11},
            {12, 21, 30,  3},
            { 4, 13, 22, 31},
            {32,  5, 14, 23},
            {24, 33,  6, 15},
            {16, 25, 34,  7},
            { 8, 17, 26, 35},
        };
        return Residues[ Mod9(u) ][ Mod4(u) ];
    }

My friend says that he later learned that similar tricks were used in classic 36-bit hardware.

I looked everywhere I could think of to see if I could find this algorithm to calculate modulo 9 described. I found something that hinted at it in my 1981 edition of Knuth's Seminumerical Algorithms, referring to a technique to convert decimal to octal by hand. There was no mention of it in Warren's marvelous Hacker's Delight or in HAKMEM.

I tried to come up with an analytic way to calculate the elements of the 9x4 table. The best that I found is (72 - 8 * (u mod 9) + 9 * (u mod 4)) mod 36! The expression yields a number in the range [0,99], which can be reduced to [0,36) by subtracting 36 at most twice. From Concrete Mathematics, mod 36 can be derived from mod 4 and mod 9 by looking at the [0][1] and [1][0] elements of the table: (9 * (u mod 4) + 28 * (u mod 9)) mod 36. It works, but it's even worse. A table lookup is clearly more efficient.

Most, if not all, of the computer architectures designed in the last quarter century use a word size that is a power of 2. Mod 2N is as simple as masking off the low-order N bits. Useful identities like this are one big reason why non-power-of-2 word sizes have gone out of fashion. The other big reason is the success of C and Unix, which have a bias towards 8-bit bytes. C doesn't require 8-bit bytes, but there's a lot of software which assumes that char has exactly 8 bits. On systems with 9-bit bytes, like the 36-bit computers, octal is useful.

And there you have it: a use for octal notation. It's not exactly an important use, but then 36-bit computers aren't exactly important either.

Comments

Bryce said:

36 bit computers are not important unless you use one all the time :-)
# December 15, 2004 3:43 AM
Leave a Comment

(required) 

(required) 

(optional)

(required)