In light of the recent anti-XML discussions here on useperl, here is an observation by Edd Dumbill:
XML's benefit isn't that it makes everything human readable (can you easily understand a W3C XML Schema description or an XSL-FO document merely from visual inspection?). No, the benefit is that it makes it readable at all. The barrier to entry is lowered: you can go further without tool support. But it's a myth to think that you can go the whole way. To consume any XML vocabulary of more than trivial complexity requires some degree of tool support.
Yep. You can do more with XML than you can with, say, SWF, because XML is a text-based format. Worst case, you can munge an XML file with a Perl script and bang it into shape with some regexes. Try and do that with Word documents, PowerPoint presentations or Excel files.
That's the long term benefit that XML has brought us. It's unfortunate that the myth of XML being easy to read still persists.
I'm not opposed to XML. I'm opposed to it being used for configuration files without tool support provided.
Re:Not Anti-XML
ziggy on 2003-08-08T19:28:48
Noted. I overstated the case. Sorry 'bout that.
I've got a few neurons miswired to read complaints about XML abuse as complaints about XML rather than complaints about the idiotic developer[1] who misused XML in the first place.
[1] Yep. I'm guilty as charged.
:-)
That's the long term benefit that XML has brought us
No, using plain text for data exchange has been the Unix philosophy for thirty years. XML didn't bring us that at all.
XML just made that plain text easier for machines to parse and harder for humans to parse, a retrograde step if ever I saw one.
Re:Ignoring our roots...
jordan on 2003-08-08T21:56:26
- XML just made that plain text easier for machines to parse and harder for humans to parse, a retrograde step if ever I saw one.
XML is NOT easier for machines to parse than simple delimited text. Even if you add a quoting convention to allow for the delimiter character to be included in data, XML is much more difficult to parse than simpler text based formats.
A format being plain text is not necessarily easy to parse for a human, either. Plain text formats are nearly as difficult for humans to parse as binary formats when the records get very long.
Also, I don't see how XML isn't, in fact, a subset of plain text formats anyway, so I'm not sure you can say that plain text formats are easier to parse than XML.
What XML provides is a mechanism for defining structured documents. This has the immediate benefit of allowing for the data interchange format to be extended without obsoleting programs that used the old format. This actually extends to the humans who might parse an XML document, also. Additional fields could be added to an XML document format, but this would not obsolete a human's understanding of the document's fields with which they are familiar.
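To make the extension point concrete, here is a minimal sketch in Python. The `<user>` vocabulary, the colon-delimited record, and the added email field are all made up for illustration; they are assumptions, not any real format. A reader that looks fields up by name keeps working when a field is added, while a reader that counts delimited positions does not.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: the "old" format, and a later revision that
# adds an <email> element in the middle.
OLD_DOC = "<user><name>alice</name><shell>/bin/sh</shell></user>"
NEW_DOC = (
    "<user><name>alice</name>"
    "<email>alice@example.org</email>"
    "<shell>/bin/sh</shell></user>"
)

# The same (hypothetical) records as colon-delimited text.
OLD_LINE = "alice:/bin/sh"
NEW_LINE = "alice:alice@example.org:/bin/sh"


def read_shell_xml(doc):
    """Consumer written against the old format: finds the field by name."""
    return ET.fromstring(doc).findtext("shell")


def read_shell_positional(line):
    """Consumer written against the old format: takes the second field."""
    return line.split(":")[1]


xml_results = [read_shell_xml(OLD_DOC), read_shell_xml(NEW_DOC)]
positional_results = [
    read_shell_positional(OLD_LINE),
    read_shell_positional(NEW_LINE),
]

print(xml_results)         # the old reader still finds the shell in both
print(positional_results)  # the old reader now returns the email address
```

To be fair to delimited text, appending the new field at the end of the record would not have broken the positional reader. The sketch only shows that name-based lookup does not care where the new field lands.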
While XML may be somewhat more difficult to parse for a human viewing the raw data, it's generally easier for a human to parse when presented in a structured way, as a typical Web browser does, for example.
XHTML, an XML format, may be difficult for a human to parse raw, but, if done correctly, it is easier to parse than plain text for a human reader when presented by a browser. It's a common task to render some plain text document as (X)HTML to enhance readability.
So, even if your point about this being a retrograde step were true, it would have been overcome long ago when browsers became widely available that presented XML documents in a very readable format.
Re:Ignoring our roots...
ziggy on 2003-08-08T21:58:00
No, using plain text for data exchange has been the Unix philosophy for thirty years. XML didn't bring us that at all.
True. 80% of the benefit from SGML and XML comes from the decades-old Unix philosophy. We all accept SMTP, RFC822 formatting, MIME, all the config files under /etc, and the like. But the remaining 20% isn't without merit. How can you claim the same kinds of advantages for all of the blather that ends up in /var/log? That information starts out as meaningful and structured and often ends up neither. How many programs misinterpreted the content of a log file because of a malformed regex? How many programs failed to capture some important state from some part of a Unix system because of a poorly written "parser"? Don't even get me started on the quirks of config file syntaxes... XML has benefits. It's not nearly as retrograde as you imagine.
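As a rough illustration of the malformed-regex problem, here is a small Python sketch. The log lines and the pattern are invented for the example and are not taken from any real daemon. The pattern quietly assumes every line carries a `[pid]`, so a perfectly valid line without one is dropped without any error.

```python
import re

# Hypothetical syslog-style lines; the second has no [pid] after the program name.
LINES = [
    "Aug  8 21:58:00 host sshd[123]: login user=ziggy",
    "Aug  8 21:58:01 host cron: job started",
]

# A hand-rolled "parser" that assumes timestamp, host, program[pid]: message.
PATTERN = re.compile(r"^(\w{3}\s+\d+ [\d:]+) (\S+) (\w+)\[(\d+)\]: (.*)$")

parsed = [PATTERN.match(line) for line in LINES]
matched = [m is not None for m in parsed]

for line, m in zip(LINES, parsed):
    if m is None:
        # Nothing fails loudly here; the line simply never reaches the consumer.
        print("skipped:", line)
    else:
        print("program=%s message=%s" % (m.group(3), m.group(5)))
```

Nothing in the text format itself says which of those two lines is the malformed one. That judgment lives entirely in whatever regex each consumer happened to write.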