What do I want? OWL for the Masses, XQuery and a Diet!
Since Mike Champion asks what we want to see in future versions of XML, I thought I'd give him my list. It's my opinion that XML's greatest strength -- amazing adaptability -- is also its biggest weakness -- too few rules. Today, we're getting more benefit from XML's minimalism than it's costing us to cope with, but I think we're already starting to see XML run into some pretty hard-to-solve limitations because of that minimalism.
Examples? Okay, how about Tim Bray's assessment of XML as an impediment in many ways to the development of rich semantic systems, like Knowledge Portals? In a simple human-to-human conversation, it's reasonably easy and common to have a shared, state-based understanding of what a term (like an element name) means relative to the problem domain. Although schema by example serves a great role in the format description and validation of stateless conversations between non-humans, it is just not getting the job done as a way for applications to share problem domain definitions and semantics blindly. Sure, technologies like OWL try to close that gap using XML and XML Schema, but that leads me to another problem...
In order to really do anything with XML, I have to learn more than just XML. For as minimal as XML is, these tooling technologies are frequently maximal approaches to a problem, seemingly to compensate. Consider what should be a simple task like converting XML in one schema to XML in another schema. In a procedural language, that might be complex, but it is not hard to figure out how to do efficiently and effectively. Instead what we got is XSLT, a declarative, template-driven language that's hard to learn but produces wonderfully simple, very efficient and darned effective solutions once you climb its learning curve. XSD is another example IMHO. Really powerful stuff, but mind-numbingly complex until you really grok its Tao.
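To make the "complex but not hard to figure out" procedural route concrete, here is a minimal sketch in Python using the standard library's ElementTree. Both vocabularies (`people`/`person` and `contacts`/`contact`) are made up for illustration, and this is only one plausible way to write such a mapping, not a recommendation over XSLT.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; the element names are invented for this sketch.
SOURCE = """<people>
  <person id="1"><name>Ada</name><email>ada@example.org</email></person>
  <person id="2"><name>Bob</name><email>bob@example.org</email></person>
</people>"""


def convert(source_xml: str) -> str:
    """Map <people>/<person> onto a hypothetical <contacts>/<contact> vocabulary."""
    src_root = ET.fromstring(source_xml)
    dst_root = ET.Element("contacts")
    for person in src_root.findall("person"):
        # The id attribute of the source becomes the ref attribute of the target.
        contact = ET.SubElement(dst_root, "contact", {"ref": person.get("id", "")})
        ET.SubElement(contact, "fullName").text = person.findtext("name", default="")
        ET.SubElement(contact, "mail").text = person.findtext("email", default="")
    return ET.tostring(dst_root, encoding="unicode")


result = convert(SOURCE)
print(result)
```

Every rule of the mapping is spelled out by hand here, which is easy to follow for two elements and gets tedious fast as the schemas grow.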
So what do I really want here? OWL for the Masses! Or maybe just OWL Lite Lite. Make it easier to share both definition and semantics, but make it as easy to learn as XML (and just XML) is. And make it a formal part of XML, not just something else to learn. Accomplish that and I think you create the potential for a quantum level of improvement in XML.
Secondly, get XQuery done and move on to standardizing change/update/delete semantics and methods with it. We've got two standard (DOM, streaming push) and one "better than standard" (streaming pull) APIs to read XML and expose it for processing. We've also got a great way to navigate it with XPath over loaded DOMs and streams, but modification today -- in a standards-only API approach -- means DOM bloat. Bad! XQuery (at least as some folks are talking about version 2.0 of it) at least holds the promise of changing and streaming. We need that. Get it done.
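For anyone who hasn't used all three reading styles, here is a rough side-by-side sketch using Python's standard library as a stand-in: `xml.dom.minidom` for DOM, `xml.sax` for streaming push, and `xml.dom.pulldom` for streaming pull. The `orders` document and the function names are invented for illustration; other platforms spell these APIs differently.

```python
import xml.sax
from xml.dom import minidom, pulldom

# Hypothetical document, invented for this sketch.
DOC = "<orders><order id='a'/><order id='b'/><order id='c'/></orders>"


def ids_dom(text):
    """DOM: load the whole tree into memory, then query it."""
    dom = minidom.parseString(text)
    return [e.getAttribute("id") for e in dom.getElementsByTagName("order")]


class OrderHandler(xml.sax.ContentHandler):
    """Streaming push: the parser calls us back for each event."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def startElement(self, name, attrs):
        if name == "order":
            self.ids.append(attrs.getValue("id"))


def ids_push(text):
    handler = OrderHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.ids


def ids_pull(text):
    """Streaming pull: we ask the parser for the next event when we want it."""
    ids = []
    for event, node in pulldom.parseString(text):
        if event == pulldom.START_ELEMENT and node.tagName == "order":
            ids.append(node.getAttribute("id"))
    return ids


print(ids_dom(DOC), ids_push(DOC), ids_pull(DOC))
```

All three are read paths. Only the DOM version leaves you holding something you can modify, which is exactly the bloat problem when the document is large.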
Finally, we need some sort of standards-based and enabled way to put XML on a diet. Optional tokenization/compression seems like a pragmatic way to deal with the verbose nature of XML over the wire. Pick some set of open standard methods that essentially any parser worth its salt could do and recommend their implementation. I don't think there's much to be gained from compressing the prolog, but there's much to be had from doing it from there on down.
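As a rough feel for what is on the table, here is a small sketch that runs a repetitive XML document through gzip from Python's standard library. gzip is only a stand-in for "some open standard method" (an assumption on my part, not a proposal for which method to bless), and the `measurements` document is made up.

```python
import gzip

# Hypothetical, deliberately repetitive document: the same tags over and over.
rows = "".join(
    f"<measurement><sensor>s{i % 4}</sensor><value>{i}</value></measurement>"
    for i in range(500)
)
document = (
    f'<?xml version="1.0" encoding="UTF-8"?><measurements>{rows}</measurements>'
).encode("utf-8")

packed = gzip.compress(document)      # what would travel over the wire
restored = gzip.decompress(packed)    # what the receiving parser would see

print(f"raw: {len(document)} bytes, gzipped: {len(packed)} bytes")
```

Repeated element names are precisely the kind of redundancy a general-purpose compressor eats for breakfast, which is why the body of the document is where the savings are.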