Why Perl programmers haven't embraced XML

pemungkah on 2003-02-05T15:27:19

Because you end up embedding too much information about the structure of your XML data in your program. And of course the XML is subject to change, especially if someone else is generating it.

And Perl programmers hate external dependencies that are stuck in code.


Perl and XML

ziggy on 2003-02-05T15:57:59

Your assertion does not accurately summarize my experiences with Perl and XML.

First, lots of Perl programmers have embraced XML. There was a period of time when the only module for parsing XML was XML::Parser, plus a few half-finished attempts at doing something differently. Today, there are many polished alternatives for processing XML, including the interchangeable PerlSAX framework, which mimics SAX in Java. In fact, some ideas crop up first in Perl (or rather in Barrie Slaymaker's head) before they are proven and reimplemented elsewhere.

Second, there's the burning question: what problem is XML trying to solve? One thing that XML has done is replace a bazillion and one one-off file formats with a single easy-to-parse framework for creating new formats. Perl handily munged one-off file formats before XML (especially text-based ones), so Perl programmers have been and still are less inclined to whip up some random XML to solve a problem.

Third, I said a few years ago that the areas where XML is being heavily adopted are also the areas where Perl is not heavily used. That has changed somewhat since 1999 or so, but it is still largely true. It's a pain to screen scrape an HTML page with Perl, but it's more of a pain to do it in Java. That's one reason why there's more of a need to adopt XML-RPC and SOAP with Java (where it's easier to generate the stubs and descriptions) than in Perl (where WSDL is more difficult to generate).

Fourth, Perl is less hyped than other languages and environments. People who migrate to Perl are generally less inclined to use something because it is fashionable, and tend to actively choose something because it works and solves a burning problem. A lot of XML vocabularies are really, really bad on so many levels. It's obvious to an XML adept that many vocabularies (like Mac OS X's Property List format) are in XML only to be fashionable; little to no thought was put into how the vocabulary would actually be used. I find myself still coming up with simple one-off text formats in Perl because they work better than some random poorly-designed XML vocabulary du jour.

Re:Perl and XML

barries on 2003-02-05T17:30:50

It's a pain to screen scrape an HTML page with Perl, but it's more of a pain to do it in Java.

Matt Sergeant, AxKit's father, cooked up a neat approach to this: use libxml2 (via XML::LibXML) to parse the HTML in html and recover modes, then apply normal XML tools to it. I've not tried it, but I'd like for you to be able to do that and use XML::Filter::Dispatcher to pluck out strings from the resulting XML stream using rules like:

    'string( foo/p )' => sub { print "foo/p contains '", xvalue, "'\n" },

Anyone that wants to try this, I'll help; it's a neat use case.
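
A minimal sketch of the first half of that idea, assuming a reasonably recent XML::LibXML (one that provides load_html). The HTML snippet and the div id "foo" are made up for illustration, and plain XPath calls stand in for XML::Filter::Dispatcher rules, so this is not the rule syntax shown above:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical tag soup: the <p> elements are never closed.
my $html = <<'HTML';
<html><body>
<div id="foo">
<p>First paragraph
<p>Second paragraph
</div>
</body></html>
HTML

# recover => 2 asks libxml2 to repair what it can and stay quiet about it.
my $doc = XML::LibXML->load_html(string => $html, recover => 2);

# From here on it is an ordinary DOM, so ordinary XPath applies.
my @paragraphs;
for my $p ($doc->findnodes('//div[@id="foo"]/p')) {
    my $text = $p->textContent;
    $text =~ s/\s+/ /g;       # collapse runs of whitespace
    $text =~ s/^ | $//g;      # trim both ends
    push @paragraphs, $text;
}

print "foo/p contains '$_'\n" for @paragraphs;
```

The parser closes each dangling <p> itself, which is the whole attraction: the scraping code only ever sees a well-formed tree.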

I agree wholeheartedly that XML is being badly applied to many things (as in your bad grammars comment), and that it's also being misapplied to things where there are more appropriate technologies. I'm no fan of BXXP/BEEP or SOAP, for instance. (I may yet change my mind on BEEP, if the toolset supporting it makes it less impenetrable.)

- Barrie

P.S. <blush/>. In reality, most of the ideas that crop up in my head have been disproven loooong ago. I rediscover the obvious, daily. It's like having intellectual Alzheimer's: I meet the same concepts anew each day.

P.P.S. Anyone interested in Perl+XML should definitely check out Kip Hampton's Perl and XML articles on xml.com. They range from the sublime to the sophisticated.

Re:Perl and XML

ziggy on 2003-02-05T23:59:10

Matt Sergeant, AxKit's father, cooked up a neat approach to this: use libxml2 (via XML::LibXML) to parse the HTML in html and recover modes, then apply normal XML tools to it.

Matt's mentioned this on more than one occasion. I always thought that libxslt/xsltproc was "broken" in its support for parsing HTML. I don't know how I came to that conclusion, but it must have been based on an early release of libxslt.

Anyway, later that day, on Matt's urging, I wrote a quick little XSLT stylesheet to grep out the important bits of a document and massaged it with xsltproc. Sure enough, it worked exactly like it was supposed to, exactly how it was documented. (I can't believe I held off on that for so very long...)

I forget what project that was, or where the code is, or what exactly I was munging at the time. I do remember that I iteratively developed the stylesheet to emit a simple text format (a bunch of lines or something). The last step was embedding the stylesheet in the __DATA__ section of a Perl script and gluing/automating the process with some Perly bits.
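
The original script is lost, so what follows is only a rough reconstruction of the shape of that approach, assuming XML::LibXML and XML::LibXSLT in place of shelling out to xsltproc. The sample HTML and the stylesheet (which emits one "href, tab, link text" line per link) are invented for illustration:

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

# Hypothetical input page; note the unclosed <li> elements.
my $html = <<'HTML';
<html><body>
<ul>
<li><a href="/one">First story</a>
<li><a href="/two">Second story</a>
</ul>
</body></html>
HTML

# Parse the HTML leniently, as libxml2's HTML parser allows.
my $source = XML::LibXML->load_html(string => $html, recover => 2);

# Slurp the stylesheet from the __DATA__ section at the end of this file.
my $style_text = do { local $/; <DATA> };
my $style_doc  = XML::LibXML->load_xml(string => $style_text);
my $stylesheet = XML::LibXSLT->new->parse_stylesheet($style_doc);

# Apply the stylesheet and print the plain-text result.
my $result = $stylesheet->transform($source);
print $stylesheet->output_string($result);

__DATA__
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="//a">
      <xsl:value-of select="@href"/>
      <xsl:text>&#9;</xsl:text>
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Keeping the stylesheet after __DATA__ means it can still be developed iteratively as a standalone file with xsltproc and pasted in once it behaves.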

In a bizarre kind of way, it was sort of fun!

Re:Perl and XML

pemungkah on 2003-02-12T20:42:25

I agree with your points in general. Maybe it's just me (chorus: IT'S JUST YOU!) but I find XML painful to work with, generally. Most of the XML "solutions" don't feel, well, Perl-y to me.

I have come up with something that does - it's reasonably fast, rapid to code, and it works, plus it keeps most of the dependencies on what the XML is like outside the program itself: Text::Template to generate XML I need to give someone else, and XML::XPath to extract data from XML they've given to me. I was thinking about proposing an OSCON session on "outside-the-box XML programming in Perl" or something like that; dunno if anyone would be interested, though.
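
A small sketch of how that pairing might look, not the actual code behind the post. The <order> vocabulary, the field names, and the XPath table are all made up; in practice the template and the expressions would presumably live in files outside the program. Note that Text::Template does no XML escaping, so real data would need escaping before it is filled in:

```perl
use strict;
use warnings;
use Text::Template;
use XML::XPath;

# Outgoing XML: the structure lives in a template, not in the code.
my $template_text = <<'TEMPLATE';
<order id="{$id}">
  <customer>{$customer}</customer>
{ join '', map { "  <item>$_</item>\n" } @items }</order>
TEMPLATE

my $template = Text::Template->new(TYPE => 'STRING', SOURCE => $template_text)
    or die "Couldn't build template: $Text::Template::ERROR";

my $xml = $template->fill_in(HASH => {
    id       => 42,
    customer => 'Example Corp',
    items    => [ 'widget', 'gadget' ],   # becomes @items in the template
}) or die "Couldn't fill in template: $Text::Template::ERROR";

# Incoming XML: knowledge of its layout is confined to this table.
my %want = (
    order_id => '/order/@id',
    customer => '/order/customer',
);

my $xp  = XML::XPath->new(xml => $xml);
my %got = map { $_ => '' . $xp->findvalue($want{$_}) } keys %want;
my @items = map { $_->string_value } $xp->findnodes('/order/item');

print "order $got{order_id} for $got{customer}: @items\n";
```

If the other party reshuffles their XML, only the XPath strings change; the code that consumes %got stays put.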