Configuration with XmlSerializer + codegen (i.e. xsd.exe): how to "fix" PascalCase to camelCase for the XML
Most people (including myself, as you may have noticed from my previous post on XmlSerializer-based configuration) believe an automated XSD->Classes->XmlSerializer config section handler is the optimum approach for dealing with configuration in .NET. You basically start by defining your configuration in an XML Schema, then use xsd.exe or another, even cooler approach, and finally get XmlSerializer-ready classes for it. Loading configuration afterwards is just a matter of deserializing it. Cool, right?
Well, no matter how hard I try, the following sample config looks just awful to me (and believe me, I've created many like this one):
<SampleConfig>
  <MySetting Type="TheType, TheAssembly">
    <Enabled>True</Enabled>
    <MakeItFast>True</MakeItFast>
  </MySetting>
</SampleConfig>
And you know XAML goes the same route, using .NET naming conventions for elements and attributes...
Well, not only does this look pretty ugly (IMO) to any XML geek, but it also sort of reveals that you're using XmlSerializer for loading it, and for some guys, this may seem like a less professional approach to configuration (don't ask me why). Learning yet another codegen tool just to overcome this "minor" casing issue sounded like overkill to me. On the other hand, I didn't want to write all my configuration loading by hand anymore, especially once I got used to the incredibly fast refactoring that is enabled by the combination of XSD->CodeGen->XmlSerializer: all refactorings in the schema are reflected in the classes, and you get immediate compile-time errors in any code that depends on old element names, or on elements that no longer exist or have been relocated. This is crucial for evolution.
Another aspect the XmlSerializer-based approach enables is a full programmatic typed configuration API you can use to write changes back to the file. And you have to write almost no code! Summarizing, the benefits are: straightforward loading and storing back. MAIN drawback: you have to use the XML-ugly .NET casing. Don't you?
Well, I decided I wanted the best of both worlds, as usual. I wanted camelCase-style elements and attributes, but XmlSerializer-based configuration loading and saving. And I definitely didn't want to write/use yet another codegen extension/tool to do the dirty job of adding all those XML serialization attributes to account for this "little" issue of casing.
So the answer is pretty simple once you know it: write a custom reader that turns the first letter into uppercase as it reads, and a custom writer that turns it back to lowercase upon saving. What's more, the reader should be based on the XmlTextReader so I can still layer the XmlValidatingReader on top once the "case normalization" has happened, so that the on-the-fly PascalCased document is valid against the schema.
With the two new classes,
XmlFirstUpperReader and
XmlFirstLowerWriter, you read and write back
XML files with the camelCase style, while having the XSD
schema and the generated classes believe you have the proper
PascalCase source/destination. So you can turn the previous
config file into the nicer format:
<sampleConfig>
  <mySetting type="TheType, TheAssembly">
    <enabled>True</enabled>
    <makeItFast>True</makeItFast>
  </mySetting>
</sampleConfig>
Without changing your classes or schema. The trick is done when you pass the source for deserialization:
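The original listing is not reproduced here. As a minimal sketch of what the loading side could look like, assuming hypothetical `SampleConfig` and `MySetting` classes in place of the xsd.exe-generated ones, and a simplified stand-in reader (`FirstUpperReaderSketch`, not the real XmlFirstUpperReader) that only capitalizes LocalName:

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-ins for the classes xsd.exe would generate.
public class MySetting
{
    [XmlAttribute] public string Type;
    public bool Enabled;
    public bool MakeItFast;
}

public class SampleConfig
{
    public MySetting MySetting;
}

// Simplified stand-in for XmlFirstUpperReader (an assumption, not the
// Mvp.Xml class): it capitalizes only the local names of elements and
// attributes, which is enough for XmlSerializer on this small document.
public class FirstUpperReaderSketch : XmlTextReader
{
    private const string XmlnsNamespace = "http://www.w3.org/2000/xmlns/";

    public FirstUpperReaderSketch(TextReader input) : base(input) { }

    public override string LocalName
    {
        get
        {
            string name = base.LocalName;
            bool isNamed = base.NodeType == XmlNodeType.Element
                || base.NodeType == XmlNodeType.EndElement
                || base.NodeType == XmlNodeType.Attribute;
            // Leave other node types and namespace declarations untouched.
            if (!isNamed || name.Length == 0 || base.NamespaceURI == XmlnsNamespace)
                return name;
            // Atomize through the reader's own NameTable so consumers that
            // compare names by reference still see a match.
            return NameTable.Add(char.ToUpperInvariant(name[0]) + name.Substring(1));
        }
    }
}

public static class ConfigLoader
{
    // Deserializes a camelCase document into the PascalCase classes.
    public static SampleConfig Load(TextReader source)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(SampleConfig));
        using (XmlReader reader = new FirstUpperReaderSketch(source))
        {
            return (SampleConfig)serializer.Deserialize(reader);
        }
    }
}
```

Calling `ConfigLoader.Load(new StringReader(xml))` on a camelCase document should then populate the PascalCase fields. Note that the `bool` fields are part of the assumption: with them, values have to be valid xsd:boolean literals such as `true`.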
After modifying the settings class, you can save it back into the file with the proper camelCase by using the custom writer:
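The saving listing is missing as well. A hedged sketch of the idea follows, again with hypothetical `SampleConfig` and `MySetting` classes (repeated so the block stands alone) and a stand-in writer named `FirstLowerWriterSketch`, which overrides the three XmlTextWriter methods this post names:

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-ins for the classes xsd.exe would generate.
public class MySetting
{
    [XmlAttribute] public string Type;
    public bool Enabled;
    public bool MakeItFast;
}

public class SampleConfig
{
    public MySetting MySetting;
}

// Stand-in for XmlFirstLowerWriter (an assumption, not the Mvp.Xml class).
public class FirstLowerWriterSketch : XmlTextWriter
{
    public FirstLowerWriterSketch(TextWriter output) : base(output) { }

    // Lowercases the first letter, leaving the string alone if not needed.
    private static string MakeFirstLower(string name)
    {
        if (string.IsNullOrEmpty(name) || char.IsLower(name[0]))
            return name;
        return char.ToLowerInvariant(name[0]) + name.Substring(1);
    }

    public override void WriteStartElement(string prefix, string localName, string ns)
    {
        base.WriteStartElement(prefix, MakeFirstLower(localName), ns);
    }

    public override void WriteStartAttribute(string prefix, string localName, string ns)
    {
        base.WriteStartAttribute(prefix, MakeFirstLower(localName), ns);
    }

    public override void WriteQualifiedName(string localName, string ns)
    {
        base.WriteQualifiedName(MakeFirstLower(localName), ns);
    }
}

public static class ConfigSaver
{
    // Serializes the PascalCase classes as camelCase XML text.
    public static string Save(SampleConfig config)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(SampleConfig));
        StringWriter output = new StringWriter();
        using (XmlWriter writer = new FirstLowerWriterSketch(output))
        {
            serializer.Serialize(writer, config);
        }
        return output.ToString();
    }
}
```

To write to a file instead of a string, the same writer could wrap a StreamWriter; the serializer itself never learns that the names are being rewritten on the way out.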
Pretty straightforward, right?
Implementation
This time, I didn't "cheat" the XmlTextReader, but used it directly by inheriting from it like any decent OO citizen, and overriding the members I needed to change. It turned out to be a little more work than I thought because I had to use the base reader's XmlNameTable everywhere, or the XSD validation would stop working and values would be lost somehow. The XmlFirstUpperReader has a private method that turns the first letter of a name into uppercase, and adds the result to the NameTable before returning it. This method uses the optimal "algorithm" discussed in my previous post:
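The listing itself did not survive here, so what follows is only a sketch of the kind of helper described, not necessarily the algorithm from the previous post. The class name `FirstUpperSketch` and the choice to pass the name table in as a parameter (rather than using a private instance method) are assumptions made to keep the block self-contained:

```csharp
using System.Xml;

public static class FirstUpperSketch
{
    // Returns the name with its first letter in uppercase, atomized in the
    // given name table. The string is left alone when no change is needed.
    public static string MakeFirstUpper(string name, XmlNameTable nameTable)
    {
        if (name.Length == 0 || char.IsUpper(name[0]))
            return name;

        char[] letters = name.ToCharArray();
        letters[0] = char.ToUpperInvariant(letters[0]);

        // Add(char[], int, int) atomizes straight from the buffer, so no
        // intermediate string is allocated when the name is already known.
        return nameTable.Add(letters, 0, letters.Length);
    }
}
```

Going through the name table matters because consumers such as the validating reader may compare names by object reference rather than by value.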
The XmlFirstLowerWriter implements the opposite
operation:
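This listing is missing too. A minimal sketch of the opposite helper might look like the following, with the containing class name `FirstLowerSketch` being an assumption (in the post, the method lives on XmlFirstLowerWriter and is internal rather than public):

```csharp
public static class FirstLowerSketch
{
    // Returns the name with its first letter in lowercase. No name table is
    // involved, so the method can be static.
    public static string MakeFirstLower(string name)
    {
        // Only build a new string if it's actually necessary.
        if (name.Length == 0 || char.IsLower(name[0]))
            return name;

        char[] letters = name.ToCharArray();
        letters[0] = char.ToLowerInvariant(letters[0]);
        return new string(letters);
    }
}
```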
This last method is made internal and static because it
doesn't depend on an XmlNameTable instance, and
because it must be used by the reader in some places, as
explained next. Note that both only modify the string if
it's actually necessary.
The XmlTextReader properties that need to be
overridden are:
- this (indexer): this property is used to access attributes of the current element by name. Client code as well as the validating reader (based on the schema) will expect to access attributes by their PascalCase name. So, I need to ensure the name is camelCased before the access occurs:

    public override string this[string name, string namespaceURI]
    {
        get
        {
            return base[
                NameTable.Add(XmlFirstLowerWriter.MakeFirstLower(name)),
                namespaceURI];
        }
    }

- LocalName: this property needs to be properly capitalized only for elements and attributes (you don't want to get the <?xml ?> declaration changed!), and only if they are not namespace declarations:

    public override string LocalName
    {
        get
        {
            // Capitalize elements and attributes.
            if (base.NodeType == XmlNodeType.Element ||
                base.NodeType == XmlNodeType.EndElement ||
                base.NodeType == XmlNodeType.Attribute)
            {
                return base.NamespaceURI == XmlNamespaces.XmlNs ?
                    // Except if the attribute is a namespace declaration.
                    base.LocalName : MakeFirstUpper(base.LocalName);
            }
            return base.LocalName;
        }
    }

- Name: this property returns the qualified name of the element. That is, it can contain a prefix plus the name. The former needs to remain the same, while the latter is capitalized. In addition, for namespace declarations, it returns the prefix, which we don't want to capitalize:

    public override string Name
    {
        get
        {
            // Again, if this is a NS declaration, pass as-is.
            if (base.NamespaceURI == XmlNamespaces.XmlNs)
                return base.Name;

            // If there's no prefix, capitalize it directly.
            if (base.Name.IndexOf(":") == -1)
                return MakeFirstUpper(base.Name);
            else
            {
                // Turn local name into upper, not the prefix.
                string name = base.Name.Substring(0, base.Name.IndexOf(":") + 1);
                name += MakeFirstUpper(base.Name.Substring(base.Name.IndexOf(":") + 1));
                return NameTable.Add(name);
            }
        }
    }
On the methods side, there's only one to override:
- MoveToAttribute(): movement must be done using the lowercased name, as the underlying XML uses this "version":

    public override bool MoveToAttribute(string name, string ns)
    {
        return base.MoveToAttribute(
            NameTable.Add(XmlFirstLowerWriter.MakeFirstLower(name)), ns);
    }
The XmlFirstLowerWriter implementation, in turn, is based on the XmlTextWriter one, and overrides just WriteQualifiedName, WriteStartAttribute and WriteStartElement. All three methods take a localName parameter, which is turned into camelCase by a call to the MakeFirstLower method shown above.
The full Mvp.Xml project source code can be downloaded from SourceForge.
Enjoy and please give us feedback on the project!