I spent the better part of the afternoon optimising XML::SAX::Expat and XML::NamespaceSupport, with the help of Barrie and Matt. It's fun! XML::SAX::Expat is now 60% slower than XML::LibXML::SAX::Parser, instead of 80%, which means the gap has shrunk by a quarter (roughly 11% off the run time).
This was triggered by a Lincoln Stein post about the slowness of parsers (he has a typo: when he writes XML::Parse::Expat he means XML::SAX::Expat). Of course, his benchmark is flawed in a few ways, notably that HTML::Parser doesn't parse XML (well it does, but it won't fail on ill-formed documents) and that neither it nor XML::Parser was configured to report namespaces.
Nevertheless, the difference would only be decreased slightly, not annihilated. So I decided to see where I could shave off some time. In his given example, XML::SAX::Expat would now run in 128s instead of 144s. XML::SAX::PurePerl would probably run faster as well because it also uses XML::NamespaceSupport, which has had a serious performance boost.
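The figures quoted so far can be cross-checked with a little arithmetic. This is only a sketch: it assumes that "N% slower" means "takes (1 + N/100) times as long", and the helper names are made up for illustration.

```perl
use strict;
use warnings;

# Timings quoted in the post, in seconds.
my ($before, $after) = (144, 128);

# Assumption: "N% slower" means "takes (1 + N/100) times as long".
sub baseline {
    my ($time, $pct_slower) = @_;
    return $time / (1 + $pct_slower / 100);
}

# Percentage of run time saved going from $old to $new.
sub pct_less_time {
    my ($old, $new) = @_;
    return 100 * ($old - $new) / $old;
}

# Percentage by which the "N% slower" gap itself has narrowed.
sub pct_gap_closed {
    my ($old_pct, $new_pct) = @_;
    return 100 * ($old_pct - $new_pct) / $old_pct;
}

printf "implied baseline: %.0fs before, %.0fs after\n",
    baseline($before, 80), baseline($after, 60);
printf "run time cut by %.1f%%\n", pct_less_time($before, $after);
printf "gap narrowed by %.0f%%\n", pct_gap_closed(80, 60);
```

Under that reading both timings imply the same baseline for XML::LibXML::SAX::Parser on this example, so the 144s/128s pair and the 80%/60% pair agree with each other.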
I'll be releasing these modules when I've tested that they're stable under these new conditions, and integrated some patches from Grant in XNSS.
On another front, Matt has started to work on an XS version of XML::SAX::Expat (it is currently layered over XML::Parser, which is a huge waste). I'm glad he's taken that over because while I've understood how XS works and how to get args, munge them into a hash, and call a method (which is rather trivial), I find that the XS processor is really braindead and keeps complaining about tons of things without really being very informative about it. It took me all of two hours to understand that it was bitching about a simple comment that wasn't where it wanted to see it... I pretty much left it there, frustrated with so much braindeadness.
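For anyone wondering what "get args, munge them into a hash, call a method" amounts to, here is a pure-Perl sketch of the idea. It is not the actual XML::SAX::Expat code: `My::Handler` and `expat_start_to_sax` are hypothetical names, and the element hash is a simplified version of what a SAX2 handler would receive (no namespace processing).

```perl
use strict;
use warnings;

# Hypothetical handler standing in for any SAX2 consumer.
package My::Handler;
sub new { return bless { seen => [] }, shift }
sub start_element {
    my ($self, $el) = @_;
    push @{ $self->{seen} }, $el;    # just record the event
}

package main;

# Expat-style callbacks get the element name followed by a flat
# name/value list. Turn that into a SAX2-style hash and call the handler.
sub expat_start_to_sax {
    my ($handler, $name, @flat_attrs) = @_;
    my %attributes;
    while (my ($attr_name, $value) = splice @flat_attrs, 0, 2) {
        # No namespaces here, so the namespace part of the key is empty.
        $attributes{"{}$attr_name"} = { Name => $attr_name, Value => $value };
    }
    $handler->start_element({ Name => $name, Attributes => \%attributes });
}

my $handler = My::Handler->new;
expat_start_to_sax($handler, 'para', id => 'p1', class => 'intro');
my $el = $handler->{seen}[0];
print "$el->{Name} has ", scalar(keys %{ $el->{Attributes} }), " attributes\n";
```

An XS version would do the same three steps in C, which is where the saving over stacking one Perl layer on top of another would come from.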
We've also been considering XS versions of XML::NamespaceSupport and XML::SAX::Base, that would be built if the user wishes to have them (the Perl versions have to stay). If someone wishes to volunteer, it oughtn't be hard and it'd help lots. Otherwise, it'll prolly happen at some point.
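One common way to ship an optional XS version next to a mandatory Perl one is to try loading the compiled backend and quietly fall back. The sketch below shows the pattern only: `My::Prefixes`, `My::Prefixes::XS` and `split_qname` are hypothetical names, not anything that exists in XML::NamespaceSupport or XML::SAX::Base.

```perl
package My::Prefixes;    # hypothetical module name
use strict;
use warnings;

# Try the optional compiled backend; fall back to pure Perl if it is missing.
our $BACKEND = eval { require My::Prefixes::XS; 1 } ? 'XS' : 'PP';

# Pure-Perl implementation, always shipped.
sub _split_qname_pp {
    my ($qname) = @_;
    my ($prefix, $local) = $qname =~ /^(?:([^:]+):)?(.+)$/;
    return ($prefix // '', $local);
}

# Bind the public name to whichever backend loaded.
*split_qname = $BACKEND eq 'XS'
    ? \&My::Prefixes::XS::split_qname
    : \&_split_qname_pp;

package main;

my ($prefix, $local) = My::Prefixes::split_qname('xsl:template');
print "$My::Prefixes::BACKEND backend: prefix=$prefix local=$local\n";
```

Callers only ever see `split_qname`, so the Perl version stays the reference implementation and the XS one is a drop-in speedup for those who can build it.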
Re:XS
darobin on 2002-02-08T00:06:30
Well, you know the song as well as I do: "You can't find your baudster with a comment counter" [1][1']. I do not claim that XS is generally b0rkn, simply that its total lack of intelligence and dwimmery regarding comments is awfully unperlish and frustrating. It almost reminds me of Java strings.
[1] extract from The AxKit Has Been Drinking, based on Tom Waits' The Piano Has Been Drinking. The original verse reads You can't find the waitress with a geiger counter.
[1'] for those not of the Higher Circle, Matt is known as baud on #axkit(-dahut).
Re:XS
Matts on 2002-02-08T00:09:36
XS rocks at comments!
Put in // - it figures it out.
Put in /* ... */ - it figures it out.
Even put in #... - it figures it out.
OK, well almost. Put in # if we do this we're screwed, well you get what you pay for ;-)
The answer? Always start your comments with some hash.
Re:XS
darobin on 2002-02-08T00:17:13
If you were as hard on hash(es) as you pretend to be, Expat.xs would be flying already. I know that, just like about everyone else, you hate my "banner" comment style, but that fuxored XS really bad. And in fact, if it had told me "Error: your comments suck" I would have unhappily obliged, but instead it chose to display VeryCrypticErrors...
So, even admitting that XS rocks at comments (which it does.... NOT!), it certainly could use a hand for error messages...