AltSerialization: binary serialization, the smarter way

You already know you can count on the BinaryFormatter to serialize and deserialize your objects in binary format. This beast will properly serialize an entire object graph for you without any effort on your part. But if you’re dealing with value types or simple reference types, you can do better by coding your own class with specific knowledge of how to serialize such types: it will run faster and produce smaller binary representations than BinaryFormatter. Here’s a preview of some results:

| Type | Size in bytes using BinaryFormatter | Size in bytes using custom class |
|----------|-----|----|
| DateTime | 59 | 9 |
| TimeSpan | 60 | 9 |
| Guid | 110 | 17 |

As you can see, there is a lot of room for savings. Apparently the ASP.NET team thought the same thing and came up with a little class well hidden in the System.Web.Util namespace: the internal AltSerialization class. Ouch! Internal? Yes, *internal*.

What does AltSerialization do?

This really simple class writes and reads values in binary format, with special knowledge of how to handle the following 14 types: Boolean, Byte, Char, DateTime, Decimal, Double, Int16, Int32, Int64, Single, String, UInt16, UInt32, UInt64.

How does it do this?

When writing a value, it first examines the value’s type and, based on that, performs an optimized serialization. Take a DateTime, for instance: at the heart of this value type is the ticks member variable, which holds the number of ticks that have elapsed since midnight, Jan. 1, 0001. Writing only the ticks value is enough to reconstruct the DateTime instance later on, and ticks is an Int64 (8 bytes in length), much less than the 59 bytes spat out by the BinaryFormatter! Of course, we still need some way to record that what we’re storing is a DateTime and not another type, so we can properly deserialize it later. For this reason, every value that AltSerialization knows how to serialize is prefixed with a type code (taken from the nested AltSerialization.TypeID enumeration). This is a byte enumeration, so it adds only one byte to every persisted value (the persisted DateTime ends up taking 9 bytes, exactly 50 bytes less than with the BinaryFormatter).
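To make the "one tag byte plus eight bytes of ticks" idea concrete, here is a minimal sketch in Python (not the actual AltSerialization code, which is internal .NET). The tag value, the function names and the little-endian byte order are all assumptions made for illustration.

```python
import struct
from datetime import datetime, timedelta

# Hypothetical tag value; the real AltSerialization.TypeID numbers are not shown in this post.
TYPE_ID_DATETIME = 4

EPOCH = datetime(1, 1, 1)          # .NET ticks count from midnight, Jan. 1, 0001
MICROSECOND = timedelta(microseconds=1)


def write_datetime(value: datetime) -> bytes:
    """Pack a one-byte type tag followed by the tick count as a signed 64-bit integer."""
    # One tick is 100 ns, so there are 10 ticks per microsecond.
    ticks = ((value - EPOCH) // MICROSECOND) * 10
    return struct.pack("<Bq", TYPE_ID_DATETIME, ticks)


def read_datetime(data: bytes) -> datetime:
    """Check the tag, then rebuild the datetime from the stored tick count."""
    type_id, ticks = struct.unpack("<Bq", data)
    if type_id != TYPE_ID_DATETIME:
        raise ValueError(f"unexpected type tag {type_id}")
    return EPOCH + timedelta(microseconds=ticks // 10)


payload = write_datetime(datetime(2004, 5, 17, 13, 30, 0))
print(len(payload))  # 9
```

The reader checks the tag before touching the payload, which is what lets a single stream mix values of different types.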

In case you’re wondering why the System.TypeCode enumeration wasn’t used (after all, AltSerialization.TypeID almost duplicates it), I believe the answer is space: the former uses an Int32 as its underlying type, while the latter uses a byte.
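The difference is easy to quantify. The sketch below, in Python, packs the same Int64 payload behind a one-byte tag and behind a four-byte tag; the tag value and the tick count are arbitrary numbers chosen for illustration.

```python
import struct

ticks = 632_000_000_000_000_000  # an arbitrary tick count, for illustration only
tag = 4                          # hypothetical type code

with_byte_tag = struct.pack("<Bq", tag, ticks)    # 1-byte tag + 8-byte Int64
with_int32_tag = struct.pack("<iq", tag, ticks)   # 4-byte tag + 8-byte Int64

print(len(with_byte_tag), len(with_int32_tag))  # 9 12
```

Three bytes per value is not much on its own, but it is a third of the size of a persisted DateTime.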

Lastly, if AltSerialization doesn’t know how to serialize a given type it will just pass it along to a BinaryFormatter.
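The overall shape (fast paths for known types, a general-purpose serializer for everything else) can be sketched as follows in Python. Here pickle merely stands in for the BinaryFormatter, only two fast paths are shown, and the tag values and function names are assumptions of this sketch.

```python
import pickle
import struct

# Hypothetical tag values chosen for this sketch.
TYPE_ID_INT32 = 1
TYPE_ID_BOOLEAN = 2
TYPE_ID_OBJECT = 255  # "unknown type, handled by the general-purpose serializer"


def write_value(value) -> bytes:
    # bool is tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return struct.pack("<B?", TYPE_ID_BOOLEAN, value)
    if isinstance(value, int) and -2**31 <= value < 2**31:
        return struct.pack("<Bi", TYPE_ID_INT32, value)
    # No fast path: tag the value and hand it to the general-purpose serializer.
    return bytes([TYPE_ID_OBJECT]) + pickle.dumps(value)


def read_value(data: bytes):
    type_id, body = data[0], data[1:]
    if type_id == TYPE_ID_BOOLEAN:
        return struct.unpack("<?", body)[0]
    if type_id == TYPE_ID_INT32:
        return struct.unpack("<i", body)[0]
    if type_id == TYPE_ID_OBJECT:
        return pickle.loads(body)
    raise ValueError(f"unknown type tag {type_id}")


print(len(write_value(42)))                    # 5
print(read_value(write_value({"a": [1, 2]})))  # {'a': [1, 2]}
```

The fallback costs one extra byte for the tag, but it means the caller never has to know in advance whether a type is supported.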

Getting clever from v1.0 to v1.1

It’s nice to see that AltSerialization got a bit cleverer in version 1.1 of ASP.NET, adding support for five additional types: Guid, IntPtr, SByte, TimeSpan, UIntPtr.
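Two of these account for the 9-byte and 17-byte figures in the results at the top of the post: a TimeSpan is a tick count (8 bytes) and a Guid is 16 bytes, each preceded by a one-byte tag. A rough Python sketch follows; the tag values are hypothetical, and the Guid byte order shown (the one Guid.ToByteArray produces) is an assumption about what the class writes.

```python
import struct
import uuid
from datetime import timedelta

# Hypothetical tag values; the real TypeID numbers are not shown in this post.
TYPE_ID_GUID = 18
TYPE_ID_TIMESPAN = 19


def write_guid(value: uuid.UUID) -> bytes:
    # bytes_le matches the layout of .NET's Guid.ToByteArray (first three fields little-endian).
    return bytes([TYPE_ID_GUID]) + value.bytes_le


def read_guid(data: bytes) -> uuid.UUID:
    if data[0] != TYPE_ID_GUID:
        raise ValueError(f"unexpected type tag {data[0]}")
    return uuid.UUID(bytes_le=data[1:])


def write_timespan(value: timedelta) -> bytes:
    # One tick is 100 ns, so there are 10 ticks per microsecond.
    ticks = (value // timedelta(microseconds=1)) * 10
    return struct.pack("<Bq", TYPE_ID_TIMESPAN, ticks)


def read_timespan(data: bytes) -> timedelta:
    type_id, ticks = struct.unpack("<Bq", data)
    if type_id != TYPE_ID_TIMESPAN:
        raise ValueError(f"unexpected type tag {type_id}")
    return timedelta(microseconds=ticks // 10)


guid = uuid.UUID("12345678-1234-5678-1234-567812345678")
span = timedelta(hours=1, minutes=30)
print(len(write_guid(guid)), len(write_timespan(span)))  # 17 9
```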

Suspicious array usage

While I was looking inside AltSerialization I found something interesting in its private static constructor. In there, a static Type array is initialized to hold the types the class knows how to custom serialize. These are the types listed in the AltSerialization.TypeID enum. The funny part is that the array is allocated with two extra slots and the first item is stored at index 1 (slot zero is not used at all). While I have no idea why slot zero was not used (I can only smell VB.NET here), I think I may have a clue about the two extra slots. My guess is that the developer first counted the members of the enum, which include null and object, and made an array large enough to hold them all; later on, he realized that there is no point in checking whether a value is of type object, and not much point in storing null as one of the supported types. My theory is that, after realizing this, the developer simply forgot to reduce the array’s size. What’s yours?
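For readers who want to picture the layout, here is a rough Python sketch of a lookup table built that way. The list of types is a made-up subset, the names are hypothetical, and placing the second unused slot at the end is only one possible reading of the description above.

```python
# Hypothetical subset of supported types, for illustration only.
SUPPORTED = [str, int, bool, float]

# Two more slots than there are types, with the first type stored at index 1.
type_table = [None] * (len(SUPPORTED) + 2)
for index, supported_type in enumerate(SUPPORTED, start=1):
    type_table[index] = supported_type


def find_type_id(value):
    """Return the slot index holding the value's exact type, or None if unsupported."""
    for index in range(1, len(type_table)):
        if type_table[index] is type(value):
            return index
    return None


print(type_table.count(None))  # 2
print(find_type_id("hello"))   # 1
print(find_type_id([1, 2]))    # None
```

The empty slots are harmless because a lookup never matches them; they just take up space.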

Ringing bells

I’ve checked a couple of released projects and they don’t seem to be using any AltSerialization-like approach. I don’t know why. Maybe I should start pinging them about this.
