I was updating some slides last night on XPath and XSLT. There were a few things that didn't go as smoothly as I would have liked the last time I used them. But that's just the way of the world.
Unfortunately, each of the XML specifications I've sat down and read comes across as a rat's nest of intertwined references and formalized definitions that generally impede readability, understanding and the general health of forests everywhere. (I should note that when I was reading the first edition of Michael Kay's XSLT book from Wrox, I looked wistfully over at K&R, asking why there wasn't a good, thin, concise reference for XSLT. Perhaps that problem has been solved by now...)
Today, I started reading the latest salvo from the W3C: about 700 pages of new working drafts on XSLT 2.0, XPath 2.0 and the related interference from XML Query. XSLT 1.0 was difficult to fathom within the first five reads, and XSLT 2.0 looks worse. I'd like to know exactly why there needs to be an entire section that very verbosely states "namespace processing works as expected in the source and result trees". Catching up on the XML Query discussion on xml-dev today, it sounds like it is more bureaucratic and hopeless than XML Schema. I honestly didn't think that was possible....
It got me thinking. Perhaps it's time to take back XML, starting with a refactoring of the core specifications until they are a coherent whole. Here's a rough cut:
From here, CML, SVG and XSL-FO are simply vocabularies to learn. The above list describes the basic semantic behaviors and processing expectations for XML documents. The key goal here is that we start with a solid, simple foundation and build upon it clearly. When namespaces come up in XPath or XSLT, those specs point back directly to the namespace discussion, rather than rehashing it formally and verbosely.
There might be a reason to start with a level zero, introducing the need and driving factors for doing all of this work....
Re:Corp. Inc.
ziggy on 2002-01-04T19:20:35
"1. Dump DTDs altogether. They are a nasty remnant of SGML and deserve to die."
Not quite yet. They're only 99.44% bad juju. But they're still the only way you can do this:
<!DOCTYPE foo [
<!ENTITY somedoc SYSTEM "somedoc.xml">
<!ATTLIST foo id ID #IMPLIED>
]>
<foo id="1">
&somedoc;
</foo>
Until there's a core replacement for those two features (external parsed entities and ID-typed attributes), DTDs have a modicum of utility. No, I don't believe in XLink.
Reversing the Char production is a given. The way I laid out Basic/Namespaces/DTDs+Valid was intentional; until such time as XML Processors are required to support namespaces, they're technically optional. However, the place to add namespaces is after the basic grammar, not after the basic grammar + DTDs.
My expectation would be that subsequent rewritten specs (XPath, XSLT) would simply subsume the Basic+Namespaces as a single entity, not XML 2ed with a liberal sprinkling of Namespace Sugar. (Also, the namespaces spec should be extended to address QNames as attribute values, unfortunately)
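To make the QName point concrete, here is a minimal Python sketch. It assumes a made-up document and a hypothetical attribute named kind whose value is a QName, and it uses only the standard library's xml.etree.ElementTree. The parser hands the value back as a plain string, so the application has to resolve the prefix itself. The helper resolve_qname_attr is an illustration, not part of any spec or library, and it ignores namespace scoping (it never pops a binding).

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical document: the "kind" attribute carries a QName as its value.
DOC = '<v xmlns:xs="http://www.w3.org/2001/XMLSchema" kind="xs:string">hi</v>'


def resolve_qname_attr(source, attr):
    """Return (raw value, expanded name) for the first element carrying attr.

    Simplification: prefix bindings are collected as they are declared and
    never removed, so proper in-scope handling is not modelled here.
    """
    bindings = []  # (prefix, uri) pairs seen so far
    stream = io.BytesIO(source.encode("utf-8"))
    for event, payload in ET.iterparse(stream, events=("start-ns", "start")):
        if event == "start-ns":
            bindings.append(payload)
            continue
        raw = payload.get(attr)
        if raw is not None:
            prefix, _, local = raw.rpartition(":")
            uri = dict(bindings).get(prefix)
            return raw, "{%s}%s" % (uri, local)
    return None


raw, expanded = resolve_qname_attr(DOC, "kind")
print(raw)       # the parser leaves the prefix untouched
print(expanded)  # the application resolved it by hand
```

The element and attribute names get namespace treatment from the parser for free; the attribute value does not, which is the gap being complained about.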
Re:Corp. Inc.
Matts on 2002-01-05T08:51:04
XInclude covers external parsed entity inclusion. ID values would have to be covered by some schema system. I'd be in favour of using something like a top level <?xml-scheme type="..." href="..."?> for that, similar to the xml-stylesheet processing instruction used for XSLT.
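As a rough illustration of the XInclude half of that, here is a small Python sketch using the standard library's xml.etree.ElementInclude. The file name somedoc.xml is borrowed from the DTD example earlier in the thread; its contents, the in-memory FILES table and the loader function are assumptions made up so the sketch runs without touching disk.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Stand-in for the filesystem; the content of somedoc.xml is invented.
FILES = {"somedoc.xml": "<bar>included text</bar>"}

HOST = (
    '<foo xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="somedoc.xml"/>'
    "</foo>"
)


def loader(href, parse, encoding=None):
    """Resolve an include target from FILES instead of the disk."""
    if parse == "xml":
        return ET.fromstring(FILES[href])
    return FILES[href]  # parse="text": hand back the raw string


def expand(source):
    """Parse source and replace xi:include elements in place."""
    root = ET.fromstring(source)
    ElementInclude.include(root, loader=loader)
    return root


root = expand(HOST)
print(ET.tostring(root, encoding="unicode"))
```

No DOCTYPE is involved: the inclusion is expressed with an ordinary namespaced element, which is why it can stand in for the external entity but not for the ID declaration.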
In addition to which, I think we can clearly separate the schema layer using 1) a way to access the tree in a validation-friendly fashion (this is probably 98% there already), 2) a standard way to link a doc to a schema (either with a specific XLink or, if it really has to be done that way, with a PI) and 3) an API to decorate trees (mostly for typing) so that schema languages that can do typing can use any old DOM/SAX supporting that extension.
So what's the next step? I don't know for sure, but I guess it could be about listening to the remaining parts of the W3C that still make sense (there are some here and there) and turning more to OASIS, which seems to make a hell of a lot more sense these days (especially with RelaxNG, which imho beats the hell out of XSD any day). I guess we'll see.