XmlTextWriter + StringWriter = Headache
I've come to the conclusion that .NET doesn't really make coding easier (yet), because most Framework classes are incomplete, and use Inheritance as an excuse to leave them that way. Case in point: XmlTextWriter.
I'm changing GenX.NET's XML formatter to use the XmlTextWriter instead of building XML manually. It's a bit cleaner this way, and I can use formatting to overcome this really weird issue I've been having with the StringBuilder.ToString method putting in breaks every 1024 characters. More on that later. Anyway, to write into a StringBuilder, the XmlTextWriter has to be handed an instance of the StringWriter class, which is where the problems begin. The XmlTextWriter's constructors look like this:
Sub New(Stream, Encoding)
Sub New(Filename, Encoding)
Sub New(System.IO.TextWriter)
The lameness begins. You can't set the encoding of the XML document if you're writing to a StringBuilder, because the overload that takes a TextWriter has no Encoding parameter. Sucks to be me. So I whip open the Object Browser, navigate to the XmlTextWriter, and I get the following pearl of wisdom:
Public Sub New(ByVal w As System.IO.TextWriter)
Member of: System.Xml.XmlTextWriter
Summary:
Creates an instance of the XmlTextWriter class using the specified System.IO.TextWriter.
Parameters:
w: The TextWriter to write to. It is assumed that the TextWriter is already set to the correct encoding.
Well, this would be a fabulous assumption to make, save for one thing... The TextWriter's encoding property is READ ONLY. La-de-frickin-da. Time to add bloat to my codebase again.
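To make the problem concrete, here is a minimal sketch of what happens with a plain StringWriter. The module and function names are made up for illustration; the only assumption is that StringWriter reports UTF-16 as its Encoding, which is what the XmlTextWriter then copies into the XML declaration.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module DefaultEncodingDemo
    ' Builds a tiny document through a plain StringWriter and returns the XML text.
    Function BuildWithPlainStringWriter() As String
        Dim sb As New StringBuilder()
        Dim writer As New XmlTextWriter(New StringWriter(sb))
        writer.WriteStartDocument()
        writer.WriteStartElement("document")
        writer.WriteEndElement()
        writer.WriteEndDocument()
        writer.Close()
        Return sb.ToString()
    End Function

    Sub Main()
        ' StringWriter.Encoding has no setter and always reports UTF-16.
        Console.WriteLine(New StringWriter().Encoding.WebName)
        ' The declaration therefore carries encoding="utf-16".
        Console.WriteLine(BuildWithPlainStringWriter())
    End Sub
End Module
```

So the document ends up declaring utf-16 no matter what you intended to serve it as.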
So I do a Google search on “XmlTextWriter StringBuilder Encoding”, and I get Roy Osherove talking about the subject. The dude knows XML & .NET, so I'm thinking “Great”.... but no dice. The examples in the comments don't work. The 2nd sample freaks out IE because the IE XSLT parser can't hack it if there are spaces at the end of the file. For some reason, converting a MemoryStream's buffer to a string kicks out extra data at the end. This is very bad. 45 minutes wasted.
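One likely explanation for the trailing junk, assuming the sample decoded the result of MemoryStream.GetBuffer (I can't confirm that it did): GetBuffer hands back the whole internal array, unused capacity included, so the decoded string picks up padding characters after the real content. The sketch below uses made-up function names and a throwaway input to show the difference.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module MemoryStreamTailDemo
    ' Decodes the entire internal buffer, including bytes that were never written.
    Function DecodeWholeBuffer(ByVal text As String) As String
        Dim ms As New MemoryStream()
        Dim bytes As Byte() = Encoding.UTF8.GetBytes(text)
        ms.Write(bytes, 0, bytes.Length)
        Return Encoding.UTF8.GetString(ms.GetBuffer())
    End Function

    ' Decodes only the bytes that were actually written to the stream.
    Function DecodeWrittenBytes(ByVal text As String) As String
        Dim ms As New MemoryStream()
        Dim bytes As Byte() = Encoding.UTF8.GetBytes(text)
        ms.Write(bytes, 0, bytes.Length)
        Return Encoding.UTF8.GetString(ms.GetBuffer(), 0, CInt(ms.Length))
    End Function

    Sub Main()
        ' The first length is larger than 4 because of the unused tail of the buffer.
        Console.WriteLine(DecodeWholeBuffer("<a/>").Length)
        Console.WriteLine(DecodeWrittenBytes("<a/>").Length)
    End Sub
End Module
```

MemoryStream.ToArray would work as well, since it copies only the written bytes.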
The 1st example doesn't quite work either, because it doesn't allow for a StringBuilder to be passed in. This one is simple enough to correct; I just hate adding unnecessary code to my object model. The solution looks like this:
StringWriterWithEncoding Class:

Imports System.IO
Imports System.Text

Friend Class StringWriterWithEncoding
    Inherits StringWriter

    Private m_encoding As Encoding

    Public Sub New(ByVal sb As StringBuilder, ByVal encoding As Encoding)
        MyBase.New(sb)
        m_encoding = encoding
    End Sub

    ' Report the encoding passed to the constructor instead of StringWriter's fixed UTF-16.
    Public Overrides ReadOnly Property Encoding() As Encoding
        Get
            Return m_encoding
        End Get
    End Property
End Class

XML Parser Class:

' Needs Imports System.Xml and Imports System.Text at the top of the file.
Protected Friend Overridable Function DataReader(ByRef FromDataReader As SheetBuilder.FromDataReader) As String Implements IFormatProvider.DataReader
    Dim i As Integer
    Dim sb As New StringBuilder
    Dim writer As New XmlTextWriter(New StringWriterWithEncoding(sb, Encoding.UTF8))

    writer.Formatting = Formatting.Indented
    writer.WriteStartDocument()
    writer.WriteStartElement("document")
    ' (dr and the loop over i are omitted from this excerpt)
    writer.WriteElementString(dr.GetName(i), HtmlEncode(dr.GetValue(i).ToString))
    writer.WriteEndElement()

    ' Flush and close the writer before handing back the text.
    writer.Flush()
    writer.Close()
    Return sb.ToString
End Function
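A quick way to check that the subclass does its job is to look at the declaration it produces. The following is a self-contained sketch, not the GenX.NET code: the module name and the BuildDeclared function are invented for the example, and the class is repeated only so the snippet compiles on its own.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Friend Class StringWriterWithEncoding
    Inherits StringWriter

    Private ReadOnly m_encoding As System.Text.Encoding

    Public Sub New(ByVal sb As StringBuilder, ByVal enc As System.Text.Encoding)
        MyBase.New(sb)
        m_encoding = enc
    End Sub

    ' Report the caller's encoding instead of the built-in UTF-16.
    Public Overrides ReadOnly Property Encoding() As System.Text.Encoding
        Get
            Return m_encoding
        End Get
    End Property
End Class

Module Utf8DeclarationDemo
    ' Writes an empty <document/> and returns the text, declaration included.
    Function BuildDeclared(ByVal enc As System.Text.Encoding) As String
        Dim sb As New StringBuilder()
        Dim writer As New XmlTextWriter(New StringWriterWithEncoding(sb, enc))
        writer.WriteStartDocument()
        writer.WriteStartElement("document")
        writer.WriteEndElement()
        writer.WriteEndDocument()
        writer.Close()
        Return sb.ToString()
    End Function

    Sub Main()
        ' The declaration now carries encoding="utf-8".
        Console.WriteLine(BuildDeclared(Encoding.UTF8))
    End Sub
End Module
```

Keep in mind that the override only changes what the declaration says. The string in memory is still a .NET string, so whatever writes it to a file or a response has to encode the bytes as UTF-8 for the declaration to be true.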
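There you have it. Now you can pass in whatever encoding you want, and the StringWriter will report it, so the XML declaration comes out with the right encoding attribute. Notice also that I HtmlEncode every value before writing it. I decided I'd take the burden off of the end user and sacrifice a little performance, rather than risk a document breaking and having to deal with a support issue. One caveat: WriteElementString already escapes ampersands (&) and angle brackets in element text, so a pre-encoded value comes out double-escaped (& ends up as &amp;amp;). If you need the raw text to round-trip, drop the HtmlEncode call.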
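Here is a small sketch for checking how WriteElementString treats an ampersand, with and without encoding the value first. The names are made up, and WebUtility.HtmlEncode stands in for whichever HtmlEncode the formatter actually calls, which is an assumption on my part.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Text
Imports System.Xml

Module EscapingDemo
    ' Writes a single <name> element containing the given text and returns the XML.
    Function WriteValue(ByVal value As String) As String
        Dim sb As New StringBuilder()
        Dim writer As New XmlTextWriter(New StringWriter(sb))
        writer.WriteElementString("name", value)
        writer.Close()
        Return sb.ToString()
    End Function

    Sub Main()
        ' Raw text: the writer escapes the ampersand once.
        Console.WriteLine(WriteValue("A & B"))                        ' <name>A &amp; B</name>
        ' Pre-encoded text: the ampersand of the entity is escaped again.
        Console.WriteLine(WriteValue(WebUtility.HtmlEncode("A & B"))) ' <name>A &amp;amp; B</name>
    End Sub
End Module
```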
Hopefully, MS will fix that stupid ReadOnly property and make it a two-way street, like they did with the SelectedValue property in the DropDownList. For now I'll have to use mine.