Comparing XML::Twig and XML::Filter::Dispatcher

toma on 2003-01-13T05:12:25

Comparing Twig and Dispatcher
I rewrote my XML::Twig program to use XML::Filter::Dispatcher in order to compare the approaches. I compared the simplicity of the code necessary to do the job, and the speed of execution.

The result was that XML::Twig ran 17 times faster, which surprised me.
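For readers who want to run this kind of timing comparison themselves, here is a minimal sketch using the core Benchmark module. The two subs are hypothetical stand-ins, not the actual test programs; in a real comparison each would parse the same XML test file with its module.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-ins for the two implementations under test.
# Replace the bodies with the real Twig and Dispatcher parsing code.
sub run_twig       { my $n = 0; $n += $_ for 1 .. 1_000; return $n }
sub run_dispatcher { my $n = 0; $n += $_ for 1 .. 5_000; return $n }

# A negative count means "run each candidate for at least that many
# CPU seconds". cmpthese then prints a table of rates and the
# percentage difference between the candidates.
cmpthese(-1, {
    twig       => \&run_twig,
    dispatcher => \&run_dispatcher,
});
```

Running both candidates against identical input in the same process keeps machine load and startup cost from skewing the ratio.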

The Dispatcher code was cleaner than the Twig code. This is because I was able to remove the code I wrote to get my Twig return values to come out in the correct order. The order of the data from Dispatcher worked the way that I had originally hoped that Twig would work.
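The original program is not shown here, so the following is only a guess at where the ordering work might come from. XML::Twig calls a handler once the element's closing tag has been parsed, which means handlers for child elements run before the handler for their parent. The element names and the inline document below are made up for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

my @order;    # records the sequence in which handlers fire

my $twig = XML::Twig->new(
    twig_handlers => {
        # Each handler receives the twig and the element just completed.
        parent => sub { push @order, 'parent' },
        child  => sub { push @order, 'child:' . $_[1]->text },
    },
);

# Hypothetical sample document.
$twig->parse('<doc><parent><child>a</child><child>b</child></parent></doc>');

print join(', ', @order), "\n";    # prints "child:a, child:b, parent"
```

If the output needs the parent's data first, the child results have to be buffered until the parent handler fires, which is the sort of bookkeeping that adds code.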

The speed is a big deal for me, because the Twig code is actually already slower than I would like it to be. The Dispatcher code is probably not fast enough for my application. I'm tempted to write the code again and use a format other than XML to see how fast it runs.

It would be nice if I had a program that would automatically measure the complexity of a perl program. I would like to be able to compare the complexity of the implementations with a numerical technique.
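As a rough illustration of what such a numerical technique could look like, the sketch below counts branching keywords and short-circuit operators, in the spirit of cyclomatic complexity. It is a crude approximation under loud assumptions: it does not parse Perl, so keywords inside strings, regexes, or POD are miscounted, and the function name `rough_complexity` is made up for this example.

```perl
use strict;
use warnings;

# Crude complexity estimate: 1 plus the number of decision points found
# by pattern matching. This is not a real parser, only a rough heuristic.
sub rough_complexity {
    my ($source) = @_;

    # Drop full-line comments so commented-out code is not counted.
    $source =~ s/^\s*#.*$//mg;

    # Count branching keywords and the && and || operators.
    my $count = () = $source =~ /\b(?:if|elsif|unless|while|until|for|foreach)\b|&&|\|\|/g;

    return 1 + $count;
}

my $sample = 'if ($x) { foo() } elsif ($y && $z) { bar() }';
print rough_complexity($sample), "\n";    # prints 4
```

Applying the same function to the source of both implementations gives two numbers to compare, with the caveat that the metric says nothing about readability.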

If anyone wants to see the two approaches and the test data, let me know and I'll post it on tomacorp (We're not a corporation).

New Module Testing
I installed and tried PerlBean, which looks useful for automating the generation of perl objects. Before I use it in a real project, I need to understand if there is a way to use it so that the classes can be redesigned without losing work. The straightforward way looks like you would have to edit the class by hand after the initial run of the module, and if you want to run it again you would have to cut and paste the custom methods in again.

Perhaps there is a way around this. PerlBean would make a good core for a perl IDE, I think.
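One general way around the regeneration problem, independent of anything PerlBean itself offers, is to keep generated code and handwritten code in separate classes: the generator owns a base class, and custom methods live in a subclass that is never overwritten. The sketch below is an assumption about how that could be laid out; the package names are hypothetical, and the "generated" part is written by hand here rather than produced by PerlBean.

```perl
use strict;
use warnings;

# --- Generated file (hypothetical): safe to overwrite on every run ---
package My::Shape::Base;

sub new {
    my ($class, %args) = @_;
    return bless { width => $args{width}, height => $args{height} }, $class;
}

sub width  { return $_[0]{width} }
sub height { return $_[0]{height} }

# --- Handwritten file: the generator never touches this class ---
package My::Shape;
our @ISA = ('My::Shape::Base');

# Custom method that survives regeneration of the base class.
sub area {
    my ($self) = @_;
    return $self->width * $self->height;
}

package main;

my $shape = My::Shape->new(width => 3, height => 4);
print $shape->area, "\n";    # prints 12
```

With this split, redesigning the attributes only means regenerating the base class; the custom methods need changes only if an accessor they rely on goes away.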

I sent a bug report to the author of PerlBean. It looks like the tutorial didn't get an update after an API change.

The Need for Speed

barries on 2003-01-13T18:47:39

XML::Filter::Dispatcher is definitely slower than TWIG. It's still young and it will probably never be as optimized as TWIG is for TWIG's purpose.

That being said, you may want to look at the struct() and hash() extension functions that return Perl data structures something like TWIG's. <plug>My shiny, tiny new BFD module might help you see them:

'foo' => [ 'hash()' => sub { use BFD; d xvalue } ],

;)</plug>.

Along with optimization, I'd like to enable X::F::D to generate single-purpose handler or filter classes from rulelists and, optionally, save them to .pm files. In other words, you should be able to ask X::F::D to generate MyTwigger.pm (or MyFooFilter.pm, say). This would be a complete SAX handler (or filter) class that would run lots faster than the interpreter in XML::Filter::Dispatcher.

X::F::D isn't meant to compete with TWIG, it's meant to allow for more intricate SAXual relations between XML and Perl than TWIG is. So it's likely that TWIG will always be faster than X::F::D if you're doing purely TWIGy things. I'm using it when (a) programmer convenience matters more (which is often) and/or (b) when TWIG doesn't do what I want (which is also often, because I'm picky). YMMV.

Once X::F::D is more stable (it's still beta, having only recently graduated from alpha in my mind), I anticipate some significant optimizations, especially in cases where it's processing lots of rules. And even more performance should be possible when it allows you to generate those one-off classes and Perl modules from rule lists.

The generation of Perl modules (as opposed to just generating classes in memory) will mean specifying actions as strings instead of CODE refs ("foo()" instead of "sub { foo() }"). Those should rock.
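To illustrate why string actions matter for writing modules to disk, here is a minimal sketch in plain Perl. It is not X::F::D's implementation or API; `compile_action` and `foo` are hypothetical names. The point is that a string can be both compiled in memory and written verbatim into a .pm file, whereas a CODE ref cannot easily be turned back into source.

```perl
use strict;
use warnings;

# Accept either a CODE ref or a string of Perl source and return
# something callable. Hypothetical helper, not part of any module.
sub compile_action {
    my ($action) = @_;
    return $action if ref $action eq 'CODE';

    # Wrap the source string in an anonymous sub and compile it.
    my $code = eval "sub { $action }";
    die "Bad action '$action': $@" unless $code;
    return $code;
}

sub foo { return 'foo called' }

my $from_ref    = compile_action(sub { foo() });
my $from_string = compile_action('foo()');

print $from_ref->(),    "\n";    # prints "foo called"
print $from_string->(), "\n";    # prints "foo called"
```

Both forms behave the same once compiled, but only the string form still carries source text that a generator could paste into a module file.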

- Barrie

Please do

bart on 2003-01-15T11:16:20

If anyone wants to see the two approaches and the test data, let me know and I'll post it

Yes, please. You see, I've written my own, (as yet) unreleased XML parsing module which, from the little I know of XML::Twig, and judging by the name of the other module you mention, sounds like it's pretty much in the same realm. Before I even contemplate releasing it, I'd like to port your test script in order to use my own module, and compare code and speed to the two modules you've used.

Benchmark programs are available

toma on 2003-01-16T08:36:31
Performance Comparison Between SAX XML::Filter::Dispatcher and XML::Twig with test data is available.
I'll write more about the testing, including more detail about the results.

Comments

mir on 2003-03-26T11:17:49

I have posted some comments on the article and a faster version of the XML::Twig code on the XML::Twig page.