XML::SAX Bundling

darobin on 2001-11-18T19:04:00

XML::SAX, as released recently by Matt Sergeant, is faced with a rather serious problem, causing no end of discussion between the people involved in its creation.

The problem is this: XML::SAX (as available on CPAN) contains several modules:

  • XML::SAX::Base - this is the base class (by Kip Hampton) on which all SAX2 Drivers and Filters must be built. Well, you could create one without it, but honestly you'd be rather stupid.
  • XML::SAX::Exception - this is the root of the SAX exception classes. It doesn't do much, but all things SAX2 should normally throw exceptions that are subclasses of this one.
  • XML::SAX::ParserFactory - this is the part that acts as "DBI for SAX", in other words it's JAXP/SAX in Perl. You ask for a SAX parser (optionally that supports some given features) and it gives you the one it thinks is most appropriate (or blows up if it can't).
  • XML::SAX::PurePerl - this is an XML parser written in pure Perl. It is the fallback parser in case no faster parser can be found.
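
To make the roles of these modules concrete, here is a minimal sketch of how they fit together. The ElementCounter package and the sample document are made up for illustration; the sketch assumes XML::SAX is installed and that the factory finds at least one parser (XML::SAX::PurePerl if nothing faster is registered).

```perl
use strict;
use warnings;

# Hypothetical handler: counts elements by name.
package ElementCounter;
use base 'XML::SAX::Base';

sub start_element {
    my ($self, $el) = @_;
    $self->{count}{ $el->{Name} }++;
    # Pass the event on, in case another handler sits downstream.
    $self->SUPER::start_element($el);
}

package main;
use XML::SAX::ParserFactory;

my $handler = ElementCounter->new;

# Ask the factory for whichever parser it considers most appropriate.
my $parser = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_string('<doc><item/><item/></doc>');

print "$_: $handler->{count}{$_}\n" for sort keys %{ $handler->{count} };
```

The calling code never names a concrete parser, which is the "DBI for SAX" idea. To insist on a particular feature, you would create a factory object with new() and call require_feature() on it before asking for the parser.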

The problem stems from the fact that all these are shipped together, so if the build for one of them fails, the build for all of them fails (from a newbie point of view at least, and it drives us all nuts to have to "force install" those huge packages, doesn't it?).

We could very well ship Base + Exception separately, as they form the backbone of everything SAX related. The other two are useful only for XML parsing (i.e. if you create a SAX driver that reads CSV and fires XML events, they'll be of no use whatsoever to you). Perhaps ParserFactory could be made useful outside of pure XML, but it seems unlikely.
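
As a rough illustration of the CSV case, the sketch below builds a driver on XML::SAX::Base alone, with no ParserFactory and no PurePerl involved. CSVDriver, EventLog and the table/row/cell element names are all invented for the example, and the comma split is deliberately naive (no quoting or escaping). It relies on XML::SAX::Base handing the string given to parse_string to the driver's _parse_string method.

```perl
use strict;
use warnings;

# Hypothetical driver: reads CSV text and fires SAX events.
package CSVDriver;
use base 'XML::SAX::Base';

# Build the element structure that SAX2 handlers expect.
sub _el {
    my ($name) = @_;
    return { Name => $name, LocalName => $name,
             NamespaceURI => '', Prefix => '', Attributes => {} };
}

# XML::SAX::Base's parse_string() ends up calling this method.
sub _parse_string {
    my ($self, $csv) = @_;
    $self->start_document({});
    $self->start_element(_el('table'));
    for my $line (split /\n/, $csv) {
        $self->start_element(_el('row'));
        for my $field (split /,/, $line) {    # naive: no quoted fields
            $self->start_element(_el('cell'));
            $self->characters({ Data => $field });
            $self->end_element(_el('cell'));
        }
        $self->end_element(_el('row'));
    }
    $self->end_element(_el('table'));
    $self->end_document({});
}

# Hypothetical handler: records the events it receives.
package EventLog;
use base 'XML::SAX::Base';

sub start_element { my ($self, $el) = @_; push @{ $self->{log} }, "<$el->{Name}>" }
sub characters    { my ($self, $c)  = @_; push @{ $self->{log} }, $c->{Data} }

package main;

my $log    = EventLog->new;
my $driver = CSVDriver->new(Handler => $log);
$driver->parse_string("a,b\nc,d");

print join(' ', @{ $log->{log} }), "\n";
```

Nothing here touches an XML parser, which is why shipping Base and Exception on their own would be enough for this kind of module.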

But part of the community is tired of people using really bad approaches to XML (e.g. parsing directly with regexes, using raw XML::Parser, etc.) and would like to make the SAX modules so easy to install that people just wouldn't be able to resist. SAX itself is easy, but it needs all the PR it can get. And those people don't like having to install two modules...

So what's the solution that would allow us to make things easy for inexperienced users and pleasant for experienced ones? Bundles won't work for the former, and even have issues for the latter. The SDK stuff? How is that moving ahead? Any suggestions here are definitely welcome.