February 2004 - Posts
Note: this entry has moved.
Yesterday I received the following
important notice:
OASIS Emergency Management Technical Committee have approved a Committee Draft specification for the "Common Alerting Protocol Version 1.0"
I guess sooner or later we'll have another long awaited spec:
OASIS Bathroom Contention Technical Committee have approved a Committee Draft specification for the "Common Avid Bathreader Synchronization Protocol Version 1.0"
:S
Note: this entry has moved.
Most of the APIs in .NET have a layered design, most notably the IO classes.
The System.Xml namespace builds on this layered design, but falls short, IMO.
Basically, you have the following layers in a typical XML parsing activity:
- Input: a System.IO.Stream implementation, such as a FileStream,
  BufferedStream, NetworkStream and so on, or directly a string passed to the
  next layer.
- Basic reader: most probably a System.IO.StreamReader, or a StringReader
  if the previous layer is skipped.
- XmlReader: the actual parser implementation. In .NET, the XML parser is the
  XmlTextReader class.
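To make the layering concrete, here is a minimal sketch that stacks the three layers by hand. The LayeredParsing class and its CountElements method are names made up for this illustration; the Stream, StreamReader and XmlTextReader types are the ones discussed above.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class LayeredParsing
{
    // Counts element nodes by stacking the three layers on top of each other.
    public static int CountElements(byte[] xmlBytes)
    {
        // Layer 1: the raw input (a FileStream or NetworkStream would do as well).
        using (Stream input = new MemoryStream(xmlBytes))
        // Layer 2: the basic reader, which decodes bytes into characters.
        using (TextReader text = new StreamReader(input, Encoding.UTF8))
        {
            // Layer 3: the XML parser, built on top of the basic reader.
            XmlTextReader xml = new XmlTextReader(text);
            int count = 0;
            while (xml.Read())
            {
                if (xml.NodeType == XmlNodeType.Element) count++;
            }
            return count;
        }
    }

    public static void Main()
    {
        byte[] bytes = Encoding.UTF8.GetBytes("<root><a/><b>text</b></root>");
        Console.WriteLine(CountElements(bytes)); // prints 3
    }
}
```

Each layer only knows about the one directly below it, which is what makes swapping the input (file, network, string) so easy.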
Maybe it's just me, but isn't there a layer missing there? The "Lexical
Analyzer" or "Scanner"? Well, it turns out that it's missing only to the public:
the XmlTextReader of course uses one, its XmlScanner. Wouldn't
it be cool if this layer was exposed explicitly, so that you could tell the
parser which scanner to use? Imagine a scanner that could
present as XML tokens some binary stuff coming from the basic reader layer...
I know all the discussions about binary XML, I'm just thinking about the clever
solution for SVG: SVGZ, or "zipped SVG". I don't have to tell you how good the
zip algorithm is in general, but with highly redundant data such as XML (i.e.
all the repeated tag names) the size reduction is really awesome.
Back to the topic, however: the XmlTextReader violates this
separation with its internal XmlScanner class. Namely, the scanner
BUFFERS its reads, instead of delegating this responsibility to the appropriate
layer, which already implements such functionality in the BufferedStream
class. One consequence of this violation is that the stream position is no
longer meaningful, as you will never know how far ahead the internal scanner has
read. Have you ever dreamed of a "ResetableReader"? You can kiss that dream
goodbye for now.
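The read-ahead is easy to observe. The following is a small sketch, not a statement about the exact buffer size: BufferingDemo and PositionAfterFirstElement are hypothetical names, and the only thing the example relies on is that the parser pulls more bytes from the stream than it has logically consumed.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class BufferingDemo
{
    // Returns the stream position observed right after the parser
    // has positioned itself on the first element.
    public static long PositionAfterFirstElement(Stream input)
    {
        XmlTextReader xml = new XmlTextReader(input);
        xml.MoveToContent(); // stops on <root>, the first element
        return input.Position;
    }

    public static void Main()
    {
        string doc = "<root>" + new string('x', 200) + "<child/></root>";
        Stream input = new MemoryStream(Encoding.UTF8.GetBytes(doc));
        long position = PositionAfterFirstElement(input);
        // Logically the parser is only a handful of bytes in (just past "<root>"),
        // yet the stream has been consumed well beyond that point.
        Console.WriteLine("Stream position: {0} of {1}", position, input.Length);
    }
}
```

With a small in-memory document like this one, the reported position will typically be the end of the stream, which tells you nothing about where the parser actually is.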
If the scanner didn't violate the separation, we could implement such a reader
as follows:
- Read until some arbitrary point.
- Store current stream position.
- Create a new reader to read starting from current position (one that stops
  reading when it finds elements "outside" its scope), and use it internally
  instead of advancing the "real" one.
- Upon a call to a Reset() method, discard the "inner" reader and
  reposition the stream.
So, we could confidently hand such a reader to some arbitrary component to do
whatever it has to do with the data, without risking our own position in the
reader. This is typical in XML processing pipelines: you don't want an
earlier pipeline stage to mess with the "real" reader and break processing in
later ones. Similarly, if you configure components to handle processing of
certain elements (for example, with the handler registration mechanism
allowed by Xml Streaming Events), you don't want one handler to screw up the
reader and prevent other handlers from doing their work. You could also have
the following syntactic sugar:
ResetableReader rr; // Initialize somehow
// Do some reading
// We're about to hand the reader to some other component
using (rr.CreateResetPoint())
{
    Process(rr);
}
// Now we're exactly where we left off before entering the "using"
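The "using" trick itself is simple to build once the position is trustworthy. Here is a minimal sketch over a plain in-memory string instead of XML: ResetableTextReader, its ResetPoint helper and ResetPointDemo are all hypothetical names for this illustration, not an existing API. CreateResetPoint() hands back an IDisposable that restores the saved position when the "using" block ends.

```csharp
using System;

// Hypothetical: a character reader over an in-memory string that supports reset points.
public sealed class ResetableTextReader
{
    private readonly string _text;
    private int _position;

    public ResetableTextReader(string text) { _text = text; }

    public int Position { get { return _position; } }

    // Returns the next character, or -1 at the end of the input.
    public int Read()
    {
        return _position < _text.Length ? _text[_position++] : -1;
    }

    // Remembers the current position; disposing the returned object restores it.
    public IDisposable CreateResetPoint()
    {
        return new ResetPoint(this, _position);
    }

    private sealed class ResetPoint : IDisposable
    {
        private readonly ResetableTextReader _owner;
        private readonly int _saved;

        public ResetPoint(ResetableTextReader owner, int saved)
        {
            _owner = owner;
            _saved = saved;
        }

        public void Dispose() { _owner._position = _saved; }
    }
}

public static class ResetPointDemo
{
    // Stands in for an arbitrary component that consumes the reader.
    public static void Process(ResetableTextReader reader)
    {
        while (reader.Read() != -1) { }
    }

    public static void Main()
    {
        ResetableTextReader rr = new ResetableTextReader("<a><b/><c/></a>");
        rr.Read(); rr.Read(); rr.Read(); // do some reading: consumes "<a>"
        using (rr.CreateResetPoint())
        {
            Process(rr); // reads all the way to the end
        }
        Console.WriteLine(rr.Position); // prints 3: back where we were
    }
}
```

This only works because the sketch owns its position. An XML version would need the parser to expose (or not hide) how far it has really read.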
But as the scanner is buffering (something that should be left to the lower
layer, as stated), the only way to get "what's left" in the stream without
losing what has already been buffered is to use the
XmlTextReader.GetRemainder() method. Guess what: after calling that
method, you have effectively screwed up your "main" reader. And as the
XmlTextReader doesn't support ICloneable either, you can't even
store/clone/keep its internal state before screwing it up. I heard someone
suggest that one *hack* would be to store the element qname and depth,
construct a new reader and read again until that element is met again. This is
clearly an unacceptable hack: we would be parsing the same thing multiple times,
wasting processing time by reading useless nodes, etc.
The moral of the story: cleaner separation allows for novel uses not
foreseen originally. Violations lead in the best case to ugly hacks, and in the
worst case (as with the XmlTextReader) to plain impossibility. Let's keep
dreaming about the ResetableReader (or thinking about alternative XML parsers
for .NET...).
Note: this entry has moved.
Longhorn doesn't work in the current VMware Workstation product (4.0). However, you can use the latest beta (4.5 RC1), which
adds (experimental) support for Longhorn under a Windows 2003 Server VM. I also downloaded it to fix a corruption I got on my VM disks after running Partition Magic, and it worked flawlessly.
Download it from this location using the following registration information:
UserName: workstation
Password: experimental
(information taken from here - not that I know Japanese, but I figured it out ;))
Your current VMware licence key will work. Enjoy!
Note: this entry has moved.
All my programming life was tied to VB. I started with VB3, and finally became a master in VB6, where I was able to do ANYTHING the language would let me. Of course, there were MANY things that I couldn't do, and OOP and design patterns were soooo cool that I really needed to get my hands dirty by doing real programming based on them, not just "bathreading". So I made inroads into Delphi and Java for some time.
Then came .NET, and MS gave me a new toy to spend my days (and many nights too) with. VB.NET and C# both provide extensive support for OO programming. Even though I still code and write books (see Amazon) in both languages, I prefer C#, because I find it cleaner and less convoluted. I believe that over time, the mix of old VB keywords/syntax and new .NET constructs such as generics is turning VB.NET into one of the ugliest languages EVER.
For example, I see (from the excellent article on MSDN) the VB format to construct generic types. It simply sucks:
Dim stack As Stack(Of Integer)
I assume if the constructor has parameters, those will go after the type specifier?! Compare that with the elegance of C# 2.0:
Stack<int> stack = new Stack<int>();
Constructor parameters go where you expect them to go, the type specifier is separated from the constructor call. It is simply perfect. For the VB version, I'd like it to be:
Dim stack As Stack<int>
stack = new Stack<int>();

'Or
Dim stack as New Stack<int>()
'maybe
Dim stack as New Stack[int]()
Maybe the VB.NET team should find an Anders Hejlsberg for their design process...
Update: from the discussion with one of the VB language designers, where he praises YAVBK (Yet Another VB Keyword), I can only say "WTF?!". They're adding an IsNot operator?!?! From the example justifying it:
(instead of this):
If Not x Is Nothing Then Console.WriteLine("Has a value.")
(you will be able to write this):
If x IsNot Nothing Then Console.WriteLine("Has a value.")
I wonder why on earth VBers write code like that?! Look at the following equivalent (more readable) code:
If x <> Nothing Then Console.WriteLine("Has a value.")
'Or
If x = Nothing Then Console.WriteLine("Doesn't have a value")
It's FAR more readable and understandable than using that awful Is/IsNot test. It boils down to whether you want to teach VBers how to write good/maintainable/readable code or just give them new keywords to keep doing otherwise, but with less code.
Note: this entry has moved.
I can't really believe
this is true. Was it REALLY a joke in the beginning?
Note: this entry has moved.
In a previous post I showed and discussed the similarities between the W3C XML Schema type system and the CLR one. Dare
commented on it by mentioning a number of already known (at least by me) issues with WXS->CLR mappings, especially the fact that the latter supports only a subset of the former.
Given the overwhelming response in favor of similarities over differences (13 to 0 so far), I can only say that Dare is probably ignoring that most developers are
.NET DEVELOPERS, NOT XML theorists and WXS fans. Therefore, most of them are completely unaware of, or plainly don't care about, the intricacies of WXS he's talking about. My question was about the features developers really use from WXS, and the answers I got speak for themselves.
So, there's no tautological question as he argues. I can rephrase my question as follows: “If you ignore the parts that are irrelevant/impractical (such as no support from XmlSerializer)/overly-complex-to-be-of-any-use/only-for-WXS-fans/Ph.D-only-material, do the CLR and XSD type systems fit well together?”. If I asked people to vote again, I'm willing to bet whatever I have that I'd get the same answer.
That's why my weblog is titled "IXml* - Welcome to the real world". Not only because I'm a big fan of
The Matrix, but also because I care about what happens in the daily work with XML.
Note: this entry has moved.
I've had some discussions with co-workers and colleagues about the WXS (W3C
XML Schema) type system and its relation with the CLR one. We all
agree that many concepts in WXS don't map to anything
existing in OO languages, such as derivation by restriction,
content-ordering (i.e. sequence vs choice), etc. However, in the light of the
tools the .NET Framework makes available to map XML to objects,
we usually have to analyze WXS (used to define the structure of that
very XML instance to be mapped) and its relation with our classes.
When you use the XmlSerializer to get a CLR object filled with the data in
the XML, you're actually mapping it to the CLR type system. Moreover, when you
use the xsd.exe /classes tool, you're effectively translating WXS types to CLR
ones. You get class members of type System.String corresponding to xs:string,
and the like. Dare explains this in his article on MSDN. The .NET Framework
documentation about the XmlSerializer class explicitly states:
To transfer data between
objects and XML requires a mapping from the programming language constructs to
XML Schema and vice versa. The XmlSerializer, and related tools like Xsd.exe,
provide the bridge between these two technologies at both design time and run
time.
Even XmlValidatingReader.ReadTypedValue performs this mapping transparently for
simple types, which is thoroughly documented in the product documentation under
the title
Data Type Support between XML Schema (XSD) Types and .NET Framework Types.
At the PDC, new and even more comprehensive mapping tools/approaches were shown.
But let's go beyond the simpleType (almost) natural mapping
between WXS and the CLR. We can have an abstract complexType in WXS named
Person, and derive by extension Employee and Customer ones. Our root
element, which will be a list of the contacts we know about, can be a choice of
any of them, like so:
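A schema matching this description might look like the sketch below. It is reconstructed from the description above and from the classes xsd.exe generates for it, so details such as element forms and occurrence limits are assumptions. The ContactsSchema wrapper is likewise a hypothetical helper; it uses XmlSchemaSet (available from .NET 2.0 on) to check an instance document against the schema.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ContactsSchema
{
    public const string Namespace = "http://aspnet2.com/xsdvsclr";

    // Assumed shape of the schema: abstract Person, extended by Employee and Customer.
    public const string Xsd = @"
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
           xmlns='http://aspnet2.com/xsdvsclr'
           targetNamespace='http://aspnet2.com/xsdvsclr'
           elementFormDefault='qualified'>
  <xs:element name='Contacts'>
    <xs:complexType>
      <xs:choice maxOccurs='unbounded'>
        <xs:element name='Employee' type='Employee'/>
        <xs:element name='Customer' type='Customer'/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
  <xs:complexType name='Person' abstract='true'>
    <xs:sequence>
      <xs:element name='FirstName' type='xs:string'/>
      <xs:element name='LastName' type='xs:string'/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name='Employee'>
    <xs:complexContent>
      <xs:extension base='Person'>
        <xs:sequence>
          <xs:element name='EmployeeID' type='xs:string'/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:complexType name='Customer'>
    <xs:complexContent>
      <xs:extension base='Person'>
        <xs:sequence>
          <xs:element name='CustomerID' type='xs:string'/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>";

    // Returns true when the instance document validates against the schema above.
    public static bool IsValid(string instanceXml)
    {
        XmlSchemaSet schemas = new XmlSchemaSet();
        schemas.Add(Namespace, XmlReader.Create(new StringReader(Xsd)));

        bool valid = true;
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = schemas;
        // Record validation errors instead of letting them throw.
        settings.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs e) { valid = false; };

        using (XmlReader reader = XmlReader.Create(new StringReader(instanceXml), settings))
        {
            while (reader.Read()) { }
        }
        return valid;
    }
}
```

For example, a Contacts document containing Employee and Customer children (each with FirstName and LastName followed by its own ID element) should pass IsValid, while one containing a bare Person element should not.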
Now, what do you think such types would look like in the .NET world? Well, I
don't think it takes an expert in WXS to realize that these types map nicely
to an abstract Person class, and Employee and Customer derived types. We can
confirm that by running xsd.exe /classes with this schema, and we
will get the following .NET classes:
/// <remarks/>
[System.Xml.Serialization.XmlTypeAttribute(Namespace="http://aspnet2.com/xsdvsclr")]
[System.Xml.Serialization.XmlRootAttribute(Namespace="http://aspnet2.com/xsdvsclr", IsNullable=false)]
public class Contacts
{
    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("Employee", typeof(Employee))]
    [System.Xml.Serialization.XmlElementAttribute("Customer", typeof(Customer))]
    public Person[] Items;
}

/// <remarks/>
[System.Xml.Serialization.XmlTypeAttribute(Namespace="http://aspnet2.com/xsdvsclr")]
[System.Xml.Serialization.XmlIncludeAttribute(typeof(Customer))]
[System.Xml.Serialization.XmlIncludeAttribute(typeof(Employee))]
public abstract class Person
{
    /// <remarks/>
    public string FirstName;
    /// <remarks/>
    public string LastName;
}

/// <remarks/>
[System.Xml.Serialization.XmlTypeAttribute(Namespace="http://aspnet2.com/xsdvsclr")]
public class Employee : Person
{
    /// <remarks/>
    public string EmployeeID;
}

/// <remarks/>
[System.Xml.Serialization.XmlTypeAttribute(Namespace="http://aspnet2.com/xsdvsclr")]
public class Customer : Person
{
    /// <remarks/>
    public string CustomerID;
}
Note that the XSD tool was even smart enough to realize that, as both
expected elements in the WXS choice for the Contacts element inherit
from the same Person type, they can actually be part of the same
array, which is defined as Person[] Items. That looks like a pretty
nice fit.
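To see the fit from the consuming side, here is a small sketch that feeds an instance document to the XmlSerializer and gets typed objects back. The classes are a condensed copy of the generated ones above; ContactsLoader and the sample data are made up for the illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlType(Namespace = "http://aspnet2.com/xsdvsclr")]
[XmlRoot(Namespace = "http://aspnet2.com/xsdvsclr", IsNullable = false)]
public class Contacts
{
    [XmlElement("Employee", typeof(Employee))]
    [XmlElement("Customer", typeof(Customer))]
    public Person[] Items;
}

[XmlType(Namespace = "http://aspnet2.com/xsdvsclr")]
[XmlInclude(typeof(Customer))]
[XmlInclude(typeof(Employee))]
public abstract class Person
{
    public string FirstName;
    public string LastName;
}

[XmlType(Namespace = "http://aspnet2.com/xsdvsclr")]
public class Employee : Person { public string EmployeeID; }

[XmlType(Namespace = "http://aspnet2.com/xsdvsclr")]
public class Customer : Person { public string CustomerID; }

public static class ContactsLoader
{
    // Deserializes a Contacts document into the classes above.
    public static Contacts Load(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Contacts));
        using (StringReader reader = new StringReader(xml))
        {
            return (Contacts)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        string xml =
            "<Contacts xmlns='http://aspnet2.com/xsdvsclr'>" +
            "<Employee><FirstName>Ann</FirstName><LastName>Lee</LastName>" +
            "<EmployeeID>E1</EmployeeID></Employee>" +
            "<Customer><FirstName>Bob</FirstName><LastName>Ray</LastName>" +
            "<CustomerID>C7</CustomerID></Customer>" +
            "</Contacts>";

        Contacts contacts = Load(xml);
        foreach (Person p in contacts.Items)
        {
            // The element name picks the concrete CLR type for each array item.
            Console.WriteLine("{0}: {1} {2}", p.GetType().Name, p.FirstName, p.LastName);
        }
    }
}
```

The choice in the schema surfaces as plain polymorphism: one Person[] array whose items are Employee or Customer instances depending on the element name.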
In this light, I'm conducting a survey about developers' views on the
relation between the XSD type system and the .NET one. Ignoring some of the
more advanced (I could add cumbersome and confusing) features of WXS, would you
say that both type systems fit nicely with each other?
Valid votes (through comments, will be summed up in this post description) are:
YES (they fit nicely) and NO (they don't).
The latter sort of implies that you think MS is pushing the similarities too far,
and that it's not good. I look forward to your comments and votes!
Current tally: YES=13, NO=0
Note: this entry has moved.
After learning that a Google search for 'miserable failure' returns
Bush's bio (this was even an article in the BBC news) I tried my (invented) term
'bathreader' (an activity I believe should be legalized), and guess what: I'm the only one returned :D (together with
VGA, who commented on my new alias :|).
So my new weblog subtitle is:
Daniel Cazzulino (a.k.a. "kzu" and "avid bathreader") 's .NET and XML digress. Self google-bombed :o).
Note: this entry has moved.
More really unfortunate news coming from MS. Patents for accessing XML documents generated by Word?! Come on!!!!
Note: this entry has moved.
Finally, Hernan de Lahitte started blogging. He will touch on the architecture and in-depth details of Shadowfax, as he is one of its main architects and a Senior Developer. For those that don't know what Shadowfax is, it's (IMO, exclusively) the Indigo for .NET v.1.x (v2 too?). He's a security-paranoid guy, I must say... i.e. he has the firewall enabled inside the corp. LAN!! (no way to get any music shared from him :o))