Thom Robbins is a great guy, but unfortunately for him he has bumped into one of my major pet peeves, Viral Coding Examples, with his Introducing the XmlTextReader post. It really isn't his fault, since the code he uses is very similar to the code example in the XmlTextReader.Read() documentation, and I complained about that code to the System.Xml team at the MVP Summit. I promised then to write something up on it, and months later Thom's post finally got me to do it.
The problem is in the structure of the code:
Dim xmlFileStream As New FileStream("cust.xml", FileMode.Open)
Dim xmlRead As New XmlTextReader(xmlFileStream)
While xmlRead.Read
    xmlRead.MoveToContent()
    If xmlRead.HasValue Then
        MsgBox(xmlRead.Value)
    End If
End While
xmlRead.Close()
xmlFileStream.Close()
At first glance the code looks perfectly fine. But knowing full well that some developer new to System.Xml will use this code as a template for bigger things, we have an obligation to make it easy for them to adapt it without causing "strange" errors.
Problem #1
No explicit setting of the WhitespaceHandling option. Unless the developer is already familiar with XmlTextReader (which they shouldn't be expected to be, since this is an introduction), they would not know that the default is WhitespaceHandling.All, which causes the reader to return all Whitespace and SignificantWhitespace nodes (and that will definitely confuse the developer). So after the declaration of the xmlRead variable you should set the WhitespaceHandling property.
xmlRead.WhitespaceHandling = WhitespaceHandling.None
At least now the developer realizes that there is a property for WhitespaceHandling, and will/can change it as needed.
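To see what the setting changes, here is a minimal sketch that counts the nodes the reader reports under each option. The module name, the CountNodes helper, and the sample XML are made up for illustration; only XmlTextReader, StringReader, and the WhitespaceHandling enumeration come from the framework.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module WhitespaceDemo
    ' Hypothetical helper: counts every node Read() reports for a given setting.
    Function CountNodes(ByVal xml As String, ByVal handling As WhitespaceHandling) As Integer
        Dim reader As New XmlTextReader(New StringReader(xml))
        reader.WhitespaceHandling = handling
        Dim count As Integer = 0
        While reader.Read()
            count += 1
        End While
        reader.Close()
        Return count
    End Function

    Sub Main()
        ' Sample document with line breaks and indentation between the elements.
        Dim xml As String = "<ROOT>" & Environment.NewLine & _
                            "  <ITEM>text</ITEM>" & Environment.NewLine & _
                            "</ROOT>"
        Console.WriteLine("All:  {0}", CountNodes(xml, WhitespaceHandling.All))
        Console.WriteLine("None: {0}", CountNodes(xml, WhitespaceHandling.None))
    End Sub
End Module
```

The count for All should come out higher than the count for None, because the line breaks and indentation between elements are reported as nodes of their own.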
Problem #2
Implicit Control of Reads in While Loops. My biggest problem with the code examples used to introduce XmlTextReader has to do with the While xmlRead.Read loops. Although it looks harmless, a while loop that executes a Read at the beginning (or the end) of the looping structure will let bugs creep into the code, because of the way other methods on the XmlTextReader move the cursor that points to the current node. If all you do is execute Reads via the while loop, you are perfectly fine. But once you add code that moves the cursor from within the while loop, you run the risk of accidentally skipping nodes.
Here’s a great example. You have an XML stream that looks like this:
<ROOT>
    <LEVEL1>
        <LEVEL2>1st Level2 text node</LEVEL2>
        <LEVEL2>2nd Level2 text node </LEVEL2>
    </LEVEL1>
</ROOT>
And you want to print out the contents of the LEVEL2 elements, so you modify the standard code example to look like this:
While xmlRead.Read
    xmlRead.MoveToContent()
    If xmlRead.IsStartElement() Then
        ' XML names are case-sensitive, so compare against "LEVEL2" exactly.
        If xmlRead.Name = "LEVEL2" Then
            MsgBox(xmlRead.ReadInnerXml())
        End If
    End If
End While
And you know what, it works fine. But say the XML stream does not have all that pretty whitespace, or that the developer took my advice and set the WhitespaceHandling property (in this case to None). Now the code doesn't work, because it had a bug all along and you didn't know it. What? How is that? Well, the ReadInnerXml method leaves the cursor on the first node past the EndElement. With the nicely formatted XML stream (and WhitespaceHandling set to All), that node is a whitespace node, so when the while loop fires the Read method all is well: the cursor moves to the next node, which is the next StartElement. (If it were not, the MoveToContent method would move you to the next content node, which is any non-whitespace text, CDATA, Element, EndElement, EntityReference, or EndEntity node.) But without the whitespace nodes to stop it, ReadInnerXml leaves the cursor on the next non-whitespace node, which in this case is the StartElement for the second LEVEL2. Then the while loop fires the Read method, so by the time we are back inside the loop we have already read past that StartElement, the If condition is not met, and we skip the whole element.
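The skipped element is easy to reproduce. The following is a minimal console sketch, assuming the same document as above with the whitespace stripped out and shortened text; the module name and the use of Console.WriteLine instead of MsgBox are my own choices for illustration.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module SkippedNodeDemo
    Sub Main()
        ' Same structure as the sample document, but with no whitespace between elements.
        Dim xml As String = _
            "<ROOT><LEVEL1><LEVEL2>1st</LEVEL2><LEVEL2>2nd</LEVEL2></LEVEL1></ROOT>"
        Dim reader As New XmlTextReader(New StringReader(xml))

        ' Implicit Read at the top of the loop, as in the standard example.
        While reader.Read()
            If reader.IsStartElement() AndAlso reader.Name = "LEVEL2" Then
                ' ReadInnerXml leaves the cursor on the node after </LEVEL2>,
                ' which here is the second <LEVEL2> start tag. The next Read()
                ' then steps past that start tag before it is ever tested.
                Console.WriteLine(reader.ReadInnerXml())
            End If
        End While
        reader.Close()
    End Sub
End Module
```

Run against this input, the loop should report only the first LEVEL2 element. Put the line breaks and indentation back into the string and both elements should show up again, which is exactly why the bug tends to go unnoticed.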
So, what is a better example for an introduction to the XmlTextReader? Explicitly control when a Read is executed.
xmlRead.WhitespaceHandling = WhitespaceHandling.None
' Prime the loop with the first Read; False means there is nothing to read.
Dim keepReading As Boolean = xmlRead.Read()
While keepReading
    If xmlRead.IsStartElement() Then
        If xmlRead.Name = "LEVEL2" Then
            ' ReadInnerXml already moves the cursor past the end tag, so no Read here.
            MsgBox(xmlRead.ReadInnerXml())
        Else
            keepReading = xmlRead.Read()
        End If
    Else
        keepReading = xmlRead.Read()
    End If
End While
Now we have explicit control over when a Read is executed, and in the case of rogue methods that leave your cursor on the next node (one you haven't tested yet), you can skip the Read for that pass through the loop.
If you want, you can download a fully functional example with 5 different test cases.
The preceding blog entry has been syndicated from the DonXML Demsak’s All Things Techie Blog. Please post all comments on the original post.
I've been trying (with help from ScottW and the local MS DEs) to start up a new user group in NJ, but I've run into a couple of problems, the most important of which is the lack of a standard meeting date that does not intrude on the pre-existing user groups. The problem is a good one to have, since it means that the developer community is large enough to support more focused user groups (rather than the typical general-purpose groups). But focusing a user group on one topic also limits its potential audience, so we need to make it available to a larger group of developers.
What I was thinking of doing is creating 3 new user groups (or one big one with 3 different tracks) which all meet at the same location, just in separate rooms. The 3 groups would be an ASP.NET group, a SQL Development DBA group (for folks who write sprocs and DTS packages, OLAP and such, geared towards the new Yukon dev stuff) and a traditional SQL Server Support DBA group (for the traditional backup/recovery/performance stuff). Then, instead of meeting every month on a weeknight, we would meet once a quarter on a Saturday morning and have 3 presentations per group (or track), which would mean 9 sessions in total. The idea is that most folks can't travel more than 20 miles on a weeknight, because that is about an hour's commute (thanks to traffic), so a Saturday morning (with no traffic) would mean that people could travel further. But who wants to give up 1 Saturday morning a month? So if each group has 3 sessions per meeting, we could meet once a quarter, still cover the same amount of material, and only have to give up 1 Saturday a quarter.
I've tried to find out what other areas are doing to solve this problem, but I haven't found others that have run into it (I would think that only areas with a high density of developers, like Silicon Valley, would have hit this yet). Anyone have feedback for me on this? I guess keeping a community active in a group that meets only once a quarter may be an issue, but with a proper community site, and support from the monthly user groups, I'm inclined to believe that this shouldn't be a problem.
Also, does anyone know of a User Group that is a member of both PASS and INETA? I would think that with Yukon coming next year, more groups will be registered with both user communities.
I've been extremely busy since coming back from Vegas, and even caught a little flak for not posting more quality stuff (and I agree, the quality isn't there at the moment, but just wait until you see what I've been working on).
I ran across this little tidbit a couple of weeks ago, and I wasn't going to post it because I didn't want to start another VB.Net versus C# thread, but I think it shows some of the things done in the name of backward compatibility with VB6 which are helping kill a perfectly good language (VB.Net). I've got a bunch of other examples of things VS.Net does to "help" the VB.Net programmer that only succeed in making it harder for them to produce enterprise-ready code, but this "flaw" is in the compiler, not the IDE.
My current client has requested that the code be done in VB.Net, so I'm living in a world of trying to make VB.Net adhere to the same coding styles as C# (no VB-only functions, using namespaces, no BAS files, good OO and Domain Driven Design (well, sort of)) and fighting the IDE the whole way. I was disassembling one of our libraries and noticed a reference to Microsoft.VisualBasic, even though I had specifically removed the default import of that namespace. I was curious as to why that was happening, and noticed that it was only in the Try Catch statements. I thought that maybe it was something I was doing, so I created 2 projects, one in C# and one in VB.Net, each with one class and a simple Try Catch:
C#
using System;
public class Class1
{
    public Class1()
    {
        try
        {
            Array a;
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
            throw (ex);
        }
    }
}
VB.Net
Imports System
Public Class Class1
    Public Sub New()
        Try
            Dim a As Array
        Catch ex As Exception
            Console.WriteLine(ex.Message)
            Throw (ex)
        End Try
    End Sub
End Class
You would think that both sets of code would compile down to the same IL, but they don’t.
C# IL
.method public hidebysig specialname rtspecialname
        instance void .ctor() cil managed
{
  // Code size 23 (0x17)
  .maxstack 2
  .locals init ([0] class [mscorlib]System.Array V_0,
           [1] class [mscorlib]System.Exception ex)
  IL_0000: ldarg.0
  IL_0001: call instance void [mscorlib]System.Object::.ctor()
  .try
  {
    IL_0006: leave.s IL_0016
  } // end .try
  catch [mscorlib]System.Exception
  {
    IL_0008: stloc.1
    IL_0009: ldloc.1
    IL_000a: callvirt instance string [mscorlib]System.Exception::get_Message()
    IL_000f: call void [mscorlib]System.Console::WriteLine(string)
    IL_0014: ldloc.1
    IL_0015: throw
  } // end handler
  IL_0016: ret
} // end of method Class1::.ctor
VB.Net IL
.method public specialname rtspecialname
        instance void .ctor() cil managed
{
  // Code size 29 (0x1d)
  .maxstack 2
  .locals init (class [mscorlib]System.Array V_0,
           class [mscorlib]System.Exception V_1)
  IL_0000: ldarg.0
  IL_0001: call instance void [mscorlib]System.Object::.ctor()
  .try
  {
    IL_0006: leave.s IL_001c
  } // end .try
  catch [mscorlib]System.Exception
  {
    IL_0008: dup
    IL_0009: call void [Microsoft.VisualBasic]Microsoft.VisualBasic.CompilerServices.ProjectData::SetProjectError(class [mscorlib]System.Exception)
    IL_000e: stloc.1
    IL_000f: ldloc.1
    IL_0010: callvirt instance string [mscorlib]System.Exception::get_Message()
    IL_0015: call void [mscorlib]System.Console::WriteLine(string)
    IL_001a: ldloc.1
    IL_001b: throw
  } // end handler
  IL_001c: ret
} // end of method Class1::.ctor
The VB.Net IL has one distinct addition: within the Catch block there is a call into the VisualBasic dll, SetProjectError. Why would the VB Team add this call to their compiler? Backward compatibility with VB6. As per Niklas (from the VB Compiler team):
“The extra two calls are there to support the "On Error" language feature that was retained to make it easier to upgrade from VB6 to VB.NET. … they only cost you time (and very little) if an exception actually happens. The time for the two calls is minor compared to the overhead of propagating exceptions.”
My problem with this is that you get this even if you are not using the old On Error syntax. There is no reason why this can't be a compiler option, or, even better, why the compiler can't figure out whether On Error is used and act accordingly. Because C# is not (currently) hindered by backward compatibility, it can avoid such issues (for now). I know this really isn't that big of a deal in terms of performance; it is a VB mindset issue that helps promote the idea that VB.Net is a second-class language (an idea reinforced by little things like this).
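For anyone who has not seen it, this is roughly what the old-style syntax that the quote refers to looks like. It is a minimal sketch with a made-up module name; the Err object it reads lives in the Microsoft.VisualBasic assembly, which is presumably the state the extra call in the Catch block exists to keep up to date.

```vb
Imports System
Imports Microsoft.VisualBasic

Module OnErrorDemo
    Sub Main()
        ' VB6-style handling: execution simply continues after a failing statement.
        On Error Resume Next

        Dim zero As Integer = 0
        Dim result As Integer = 10 \ zero   ' integer division by zero raises an error

        ' The failure is inspected afterwards through the global Err object.
        If Err.Number <> 0 Then
            Console.WriteLine("Error {0}: {1}", Err.Number, Err.Description)
            Err.Clear()
        End If
    End Sub
End Module
```

Code written this way never contains a Try Catch of its own, which is why it seems reasonable to ask the compiler to emit the bookkeeping only when it is actually needed.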
[Corrected the VB IL code, since it was compiled with the debug option]