XML 2.0

ziggy on 2001-11-01T01:28:20

This week, it's back in Harrisburg to teach the second half of the XML class I started two weeks ago.

XML is beginning to feel like that lawn mower in the garage. Still quite useful, but it requires a lot of effort to use. Powerful in some ways, but still quirky.

For example, I'm in the middle of teaching a module about XML Schema at the moment. This is after going in depth on DTD syntax. Sure, XML Schema addresses all of the current uses of XML, but it's way overengineered. It's not that the usefulness of XML Schema isn't apparent -- I can look at all of the syntax in XML Schema and see where it is useful and why it is there. But it's TMTOWTDI gone awry.

Then there are the idiosyncrasies between all of the different, independently developed specifications. CSS, XSLT and XML Schema each define an import behavior, but the import behavior is slightly different with each one. XPath is a universal syntax for identifying patterns in an XML document, but it's not used to select elements in CSS (since CSS predates it). Then there are the differences between DTDs and XML Schema; theoretically, both can be used simultaneously to determine if a document is valid.
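To make the XPath/CSS split concrete, here is a minimal sketch in Python, assuming a made-up book document (the element names are hypothetical). It uses the limited XPath subset that the standard library's ElementTree supports, with the equivalent CSS selector shown in a comment.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration.
DOC = """\
<book>
  <chapter><title>Intro</title><para>Hello</para></chapter>
  <chapter><title>Schemas</title><para>World</para></chapter>
</book>
"""

root = ET.fromstring(DOC)

# XPath (ElementTree subset): every title that is a child of a chapter.
# The CSS selector for the same pattern would be:  chapter > title
titles = [t.text for t in root.findall(".//chapter/title")]
print(titles)
```

Same pattern, two unrelated syntaxes, depending on which spec you happen to be standing in.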

Add all this up, and it begins to look like XML 2.0 is an eventuality. Start with the best-of-class standards related to XML, and harmonize them. One import behavior for any specification with importable parts. One syntax for specifying patterns within a document (XPath). One way to validate a document (not DTDs, and hopefully better than XML Schema). One abstract model for describing XML documents (Infoset, not an arbitrary subset of Infoset used in XPath).

I'm all for TMTOWTDI, but with Perl, it works because there's an overarching goal of matching the language to the way people think. With XML, each different way to do something reflects the membership, goals and backgrounds of the committee creating a fragment of the beast we know as XML. Definitely not a good case of TMTOWTDI.


100 % agreed

Matts on 2001-11-02T16:25:01

It's a bit of a nightmare, isn't it? The W3C XML Schema spec is pure evidence of every company in the consortium putting their oar in, and poor old Henry Thompson having to try and sort through the mess (and then subsequently being the scapegoat for all the flak!).

Ah well, at least the Infoset now has a seal of approval, which should see XPath 2.0, DOM and Schemas having a unified view of the XML world.

Of course, like many things, I recommend not using a lot of it. DTDs themselves are a mess (see my journal), so I tend to stick with elements and data and comments in my XML stuff. If I need validation I really like Kip Hampton's XML::Schematron. Also note that Andy Wardley has an XML::Schema module, but he is slooooooow at releasing stuff to CPAN. I had a copy on my laptop but the hard drive died. Looked pretty good though.

Re:100 % agreed

ziggy on 2001-11-03T15:06:01

There are a few things I like about XML Schema. But even with the parts I like, I don't like how the Schema spec goes about them.

For example, it's nice to validate element content, and simpleType is a reasonable way to do that. The facet approach is a practical and elegant way to get the job done. But why the need for a distinction between simpleType and simpleContent?
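For anyone who hasn't run into the distinction, here is a rough sketch. The schema fragment is hypothetical (the ZipCode and ZipWithCountry names are made up), and the Python around it only parses the fragment with the standard library to pull out the two pieces; it does not do any schema validation.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: a facet-restricted simpleType, plus a complexType that
# needs simpleContent only because its element also carries an attribute.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="ZipCode">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{5}"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="ZipWithCountry">
    <xs:simpleContent>
      <xs:extension base="ZipCode">
        <xs:attribute name="country" type="xs:string"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>
"""

root = ET.fromstring(SCHEMA)
ns = {"xs": XS}

# Facets live under simpleType/restriction: (facet name, facet value) pairs.
facets = [(f.tag.split("}")[1], f.get("value"))
          for f in root.findall("xs:simpleType/xs:restriction/*", ns)]

# simpleContent wraps the same simple type so attributes can be added to it.
extension = root.find("xs:complexType/xs:simpleContent/xs:extension", ns)
attributes = [a.get("name") for a in extension.findall("xs:attribute", ns)]

print(facets, extension.get("base"), attributes)
```

The facet does the real work in the simpleType; the simpleContent wrapper exists only so that text content constrained by that type can sit alongside an attribute.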

Henry's a great guy. He's done good work in the past, and will likely continue to do so in the future. But this was an untenable situation, and I don't know if I trust Henry's skills as a language designer. Then again, I don't know if Larry would have been able to do any better given the corporate nature of this spec.