Which one is the better XML serializer?

There is quite a bit of confusion when developers new to .NET try to serialize objects to XML. There are two serializers available to do the job, but neither one offers a universal solution. The first one is the XmlSerializer in the System.Xml.Serialization namespace. The XmlSerializer should probably have been named something like XmlObjectBinder because its primary intent is to bind strongly structured XML data to .NET objects, not to serialize objects for persistent storage.

The XmlSerializer serves its purpose in life reasonably well, as long as the XML is strongly structured (i.e. no mixed content), its format can be described by an XML Schema, and the serialized classes meet the rather stringent criteria of the XmlSerializer. These compatibility criteria make it completely unsuited to serializing arbitrary object graphs. The XmlSerializer should primarily be used to serialize classes that were specifically designed for that very task.
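As a rough illustration of what "designed for that very task" can look like, here is a minimal sketch. The Order class, its members, and the XmlBindingExample helper are hypothetical names made up for this example; the attributes and the XmlSerializer calls are the standard ones from System.Xml.Serialization. The class is public, has an implicit public parameterless constructor, and exposes only public read/write members, which is what the XmlSerializer expects.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type shaped for the XmlSerializer: public class, public
// parameterless constructor (implicit here), public read/write properties.
[XmlRoot("order")]
public class Order
{
    [XmlAttribute("id")]
    public int Id { get; set; }

    [XmlElement("customer")]
    public string Customer { get; set; }

    [XmlArray("items")]
    [XmlArrayItem("item")]
    public string[] Items { get; set; }
}

public static class XmlBindingExample
{
    // Binds an Order instance to its XML representation.
    public static string ToXml(Order order)
    {
        var serializer = new XmlSerializer(typeof(Order));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, order);
            return writer.ToString();
        }
    }

    // Binds XML in the same format back to an Order instance.
    public static Order FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(Order));
        using (var reader = new StringReader(xml))
        {
            return (Order)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var order = new Order
        {
            Id = 42,
            Customer = "Contoso",
            Items = new[] { "widget", "gadget" }
        };

        string xml = ToXml(order);
        Console.WriteLine(xml);

        Order copy = FromXml(xml);
        Console.WriteLine(copy.Customer);
    }
}
```

The attributes control element and attribute names, which is the kind of control over the XML format that the XmlSerializer is built around.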

The second serializer is the SoapFormatter in the System.Runtime.Serialization.Formatters.Soap namespace, part of the System.Runtime.Serialization framework. It is used by the remoting infrastructure to transmit arbitrary object graphs embedded in SOAP messages. Unlike the XmlSerializer, it does not focus on the generated XML format at all. The SoapFormatter will (almost) always produce a format according to Section 5 of the SOAP 1.1 specification. Furthermore, the SoapFormatter will always produce entire SOAP messages, complete with <SOAP-ENV:Envelope> and <SOAP-ENV:Body> tags, and provides only very limited options to customize the generated format.
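To make the contrast concrete, here is a small sketch of the SoapFormatter handling an object graph that contains a cycle. The Node class and the SoapExample helper are hypothetical names invented for this example. The sketch assumes the .NET Framework and a project reference to System.Runtime.Serialization.Formatters.Soap.dll, since the SoapFormatter is not available on newer .NET runtimes.

```csharp
using System;
using System.IO;
// Assumption: .NET Framework with a reference to
// System.Runtime.Serialization.Formatters.Soap.dll.
using System.Runtime.Serialization.Formatters.Soap;

// Hypothetical type. It only needs [Serializable]; no special shape,
// attributes, or public parameterless constructor are required.
[Serializable]
public class Node
{
    public string Name;
    public Node Next;
}

public static class SoapExample
{
    // Serializes the graph into a stream as a complete SOAP message.
    public static MemoryStream ToSoap(object graph)
    {
        var formatter = new SoapFormatter();
        var stream = new MemoryStream();
        formatter.Serialize(stream, graph);
        stream.Position = 0;
        return stream;
    }

    // Reads the SOAP message back into an object graph.
    public static object FromSoap(Stream stream)
    {
        var formatter = new SoapFormatter();
        return formatter.Deserialize(stream);
    }

    public static void Main()
    {
        // Two nodes that point at each other: a cyclic graph.
        var first = new Node { Name = "first" };
        first.Next = new Node { Name = "second", Next = first };

        using (MemoryStream stream = ToSoap(first))
        {
            // Print the generated SOAP message.
            var reader = new StreamReader(stream);
            Console.WriteLine(reader.ReadToEnd());

            // Rewind and deserialize the same message.
            stream.Position = 0;
            var copy = (Node)FromSoap(stream);
            Console.WriteLine(copy.Next.Name);
        }
    }
}
```

Note that the code has no say in element names or layout; the formatter decides the format on its own.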


Customization of the generated XML format is not so much an issue because the intended use for the SoapFormatter assumes that both serialization and deserialization of an object are performed by the SoapFormatter. This approach limits interoperability considerably.


Another aspect where the two solutions differ is performance. Microsoft designed the XmlSerializer for high performance and moved all the expensive computing operations into the constructor. The constructor of the XmlSerializer expects you to pass it information about all the types that you will hand it at runtime to serialize. It then analyzes the types up front and internally compiles classes to serialize and deserialize them. As a consequence, Serialize() and Deserialize() are very fast operations because the serializer executes pre-compiled code. There is no need to repeatedly analyze the structure of the serialized types. The initial construction of an XmlSerializer instance, on the other hand, carries a lot of overhead.
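The practical consequence is to construct the serializer once and reuse it. The following sketch shows the idea under a few assumptions: Shape, Circle, Drawing, and ReuseExample are hypothetical names, and the timings it prints depend entirely on the machine and runtime, so no particular numbers are implied. The derived type Circle is passed to the constructor as an extra type because the serializer has to know about it before Serialize() is called.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Xml.Serialization;

// Hypothetical types for illustration only.
public class Shape { public string Name { get; set; } }
public class Circle : Shape { public double Radius { get; set; } }
public class Drawing { public Shape[] Shapes { get; set; } }

public static class ReuseExample
{
    // The expensive step: all types are analyzed and the serialization
    // code is generated here. Circle is declared up front as an extra type.
    public static XmlSerializer CreateSerializer()
    {
        return new XmlSerializer(typeof(Drawing), new[] { typeof(Circle) });
    }

    // The cheap step: runs the code generated by the constructor.
    public static string Serialize(XmlSerializer serializer, Drawing drawing)
    {
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, drawing);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        var drawing = new Drawing
        {
            Shapes = new Shape[] { new Circle { Name = "c1", Radius = 1.5 } }
        };

        // Time the one-off construction.
        var clock = Stopwatch.StartNew();
        XmlSerializer serializer = CreateSerializer();
        Console.WriteLine("construction: {0} ms", clock.ElapsedMilliseconds);

        // Time many calls against the same, already constructed instance.
        clock.Restart();
        for (int i = 0; i < 1000; i++)
        {
            Serialize(serializer, drawing);
        }
        Console.WriteLine("1000 Serialize calls: {0} ms", clock.ElapsedMilliseconds);
    }
}
```

Holding on to the instance matters even more with this constructor overload, because the framework does not cache the generated code for it; each new instance pays the construction cost again.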


The SoapFormatter behaves the opposite way; the constructor does not require any information about the types it will process. One SoapFormatter instance can format any given type in your program. In return it has to analyze the structure of each object as it serializes it. Compared side-by-side serializing identical objects, the XmlSerializer is up to 6 times faster than SoapFormatter for very simple classes, as long as you discount the initial overhead of creating and compiling the serialization helper classes for the XmlSerializer. For more complex classes the performance gap between the two serializers grows wider. The performance advantage makes the XmlSerializer the solution of choice for high throughput applications, like ASP.NET WebServices and the XmlMessageFormatter for message queues.


The SoapFormatter, on the other hand, is the better choice if you just want to persist arbitrary objects in a human-readable format. Keep in mind, though, that you should not require any specific formatting of the serialized objects if you serialize with the SoapFormatter. You may also have to strip the SOAP-related markup if it would interfere with your application.

