I'm exhausted.
I've spent the last few days throwing myself headlong into porting AxKit's coolio presentation tool AxPoint to a SAX handler. Those who read my journal regularly know why - all part of the SAX world domination project.
But not just that. Prior to now the only way I had of doing AxPoint slides was to load them into an AxKit server, usually at home, and download them. This is a pain when you just need something done fast. I did originally have an offline version of AxPoint, but I started adding features to AxPoint only in the AxKit version, so I had presentations that used features not available in my offline version.
Not only that, but the offline version used XML::XPath. This makes certain things really easy. For example, to loop through all the slides I just do: for ($root->findnodes('slide')) { process_slide($_) }. Simple. But it also encouraged me to do the same for slide bullet points. Not a problem, you'd think, but it makes it really hard to add support for bold, italics, and so on inside the bullet points. With SAX that sort of thing is easier, because you just see a stream of tags, so when I see <b> I can just push the current font onto a stack, turn on "bold", and continue processing. Then I pop the font stack when I see </b>.
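Here's a minimal sketch of that font-stack idea, not the actual AxPoint code. The handler methods follow the usual Perl SAX shape (start_element, end_element, characters), but the package name FontStackSketch, the feed() driver that stands in for a real parser, and the "runs" list that stands in for PDF output are all made up for illustration.

```perl
package FontStackSketch;
use strict;
use warnings;

sub new {
    my $class = shift;
    return bless {
        font  => { bold => 0, italic => 0 },  # current font state
        stack => [],                          # saved font states
        runs  => [],                          # [text, style] pairs, standing in for PDF output
    }, $class;
}

sub start_element {
    my ($self, $el) = @_;
    my %flag = (b => 'bold', i => 'italic');
    my $flag = $flag{ $el->{Name} } or return;
    push @{ $self->{stack} }, { %{ $self->{font} } };  # save a copy of the current font
    $self->{font}{$flag} = 1;                          # then switch the style on
}

sub end_element {
    my ($self, $el) = @_;
    return unless $el->{Name} eq 'b' || $el->{Name} eq 'i';
    $self->{font} = pop @{ $self->{stack} };           # restore whatever was active before
}

sub characters {
    my ($self, $chars) = @_;
    my $style = join('+', grep { $self->{font}{$_} } qw(bold italic)) || 'plain';
    push @{ $self->{runs} }, [ $chars->{Data}, $style ];
}

# Hypothetical driver: turns '<b>', '</b>' and plain strings into SAX-style calls.
sub feed {
    my ($self, @events) = @_;
    for my $ev (@events) {
        if    ($ev =~ m{^</(\w+)>$}) { $self->end_element({ Name => $1 }) }
        elsif ($ev =~ m{^<(\w+)>$})  { $self->start_element({ Name => $1 }) }
        else                         { $self->characters({ Data => $ev }) }
    }
    return $self;
}

package main;

my $fonts = FontStackSketch->new->feed(
    'plain ', '<b>', 'bold ', '<i>', 'both', '</i>', '</b>', ' plain'
);
print "$_->[1]: '$_->[0]'\n" for @{ $fonts->{runs} };
```

Because the whole font state is copied onto the stack, nesting works in either order without any special cases.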
Well, I thought it would be easier than it was. For a start, I had to write the text-wrapping code by hand. This actually was the easy part. The hard part with SAX is making sure you balance everything at the right point. For example, I had to create the bookmarks (see the link above for examples - but you need Acrobat Reader not xpdf to see the bookmarks) when I see the </title> tag, because only then do I have the full title text for the bookmark. But the bookmarks require a stack, and you have to pop that stack on </slideset> or </slide>. And if you get it wrong your bookmarks look screwy.
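To make the balancing problem concrete, here's a rough sketch of the bookmark stack. It assumes every slideset and slide has exactly one title (a slide with no title would pop the wrong entry, which is exactly the "screwy" failure mode). The package name, the feed() driver, and the indented "outline" list that stands in for real PDF bookmarks are all hypothetical.

```perl
package BookmarkSketch;
use strict;
use warnings;

sub new {
    my $class = shift;
    return bless { stack => [], text => '', outline => [] }, $class;
}

sub start_element {
    my ($self, $el) = @_;
    $self->{text} = '' if $el->{Name} eq 'title';   # start collecting a fresh title
}

sub characters {
    my ($self, $chars) = @_;
    $self->{text} .= $chars->{Data};
}

sub end_element {
    my ($self, $el) = @_;
    my $name = $el->{Name};
    if ($name eq 'title') {
        # Only now is the full title text known, so create the bookmark here.
        my $depth = @{ $self->{stack} };
        push @{ $self->{outline} }, ('  ' x $depth) . $self->{text};
        push @{ $self->{stack} }, $self->{text};    # becomes the parent of what follows
    }
    elsif ($name eq 'slide' || $name eq 'slideset') {
        pop @{ $self->{stack} };                    # balance the push made at </title>
    }
}

# Hypothetical driver standing in for a real SAX parser.
sub feed {
    my ($self, @events) = @_;
    for my $ev (@events) {
        if    ($ev =~ m{^</(\w+)>$}) { $self->end_element({ Name => $1 }) }
        elsif ($ev =~ m{^<(\w+)>$})  { $self->start_element({ Name => $1 }) }
        else                         { $self->characters({ Data => $ev }) }
    }
    return $self;
}

package main;

my $marks = BookmarkSketch->new->feed(
    '<slideshow>', '<title>', 'Talk', '</title>',
    '<slideset>',  '<title>', 'Intro', '</title>',
    '<slide>', '<title>', 'Hello', '</title>', '</slide>',
    '<slide>', '<title>', 'World', '</title>', '</slide>',
    '</slideset>',
    '<slide>', '<title>', 'Bye', '</title>', '</slide>',
    '</slideshow>',
);
print "$_\n" for @{ $marks->{outline} };
```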
Not only that, but AxPoint has a neatly hacked bullet point transition system. It basically works like this: if the parser sees a bullet point with a transition on, it recreates the entire page up to that point and including that point, with the transition on the actual page flip. So if you have a 20 slide presentation you may actually end up with 200 pages in your PDF. The nice thing is the bookmarks remain right, so you don't notice this unless you actually look at the page count.
That sort of "looping" thing is easy to do in a DOM processor like XML::XPath, but hard to do in SAX. What I ended up doing was caching a list of events for an entire slide, and then replaying them. Some judicious use of local $self->{cache}, and it worked beautifully (well, there were certainly a lot of moments when I was terribly confused, but it worked eventually).
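A stripped-down sketch of the cache-and-replay trick looks something like this. It is a guess at the general shape rather than the real AxPoint code: the package name ReplaySketch is invented, the attribute hash is a simplified stand-in for the real Perl SAX attribute structure, and "pages" are just lists of bullet text instead of PDF pages.

```perl
package ReplaySketch;
use strict;
use warnings;

sub new {
    my $class = shift;
    return bless { cache => undef, page => undef, pages => [] }, $class;
}

sub record {
    my ($self, @event) = @_;
    push @{ $self->{cache} }, [@event];
    return;
}

sub start_element {
    my ($self, $el) = @_;
    if ($el->{Name} eq 'slide') {
        $self->{cache} = [];    # start recording this slide's events
        return;
    }
    return $self->record(start_element => $el) if $self->{cache};
    # Render mode: nothing to draw for an opening tag in this sketch.
}

sub characters {
    my ($self, $chars) = @_;
    return $self->record(characters => $chars) if $self->{cache};
    push @{ $self->{page} }, $chars->{Data};    # "draw" the bullet text
}

sub end_element {
    my ($self, $el) = @_;
    if ($el->{Name} ne 'slide') {
        $self->record(end_element => $el) if $self->{cache};
        return;
    }
    my @events = @{ $self->{cache} };

    # Find where each transition point closes; each one gets its own page.
    my ($in_transition, @cuts);
    for my $i (0 .. $#events) {
        my ($method, $data) = @{ $events[$i] };
        next unless $data->{Name} && $data->{Name} eq 'point';
        if ($method eq 'start_element') {
            $in_transition = $data->{Attributes}{transition};
        }
        elsif ($in_transition) {
            push @cuts, $i;
        }
    }
    push @cuts, $#events unless @cuts && $cuts[-1] == $#events;  # the complete slide

    {
        local $self->{cache};   # undef while replaying, so the handlers render
        $self->replay(@events[0 .. $_]) for @cuts;
    }
    $self->{cache} = undef;     # slide finished, stop recording
}

sub replay {
    my ($self, @events) = @_;
    $self->{page} = [];
    for my $ev (@events) {
        my ($method, $data) = @$ev;
        $self->$method($data);
    }
    push @{ $self->{pages} }, $self->{page};
}

package main;

my $deck = ReplaySketch->new;
$deck->start_element({ Name => 'slide', Attributes => {} });
for my $p ([ 'A', '' ], [ 'B', 'wipe' ], [ 'C', 'wipe' ], [ 'D', '' ]) {
    my ($text, $transition) = @$p;
    $deck->start_element({ Name => 'point', Attributes => { transition => $transition } });
    $deck->characters({ Data => $text });
    $deck->end_element({ Name => 'point' });
}
$deck->end_element({ Name => 'slide' });
print join(' ', @$_), "\n" for @{ $deck->{pages} };
```

The same handler methods do double duty: they record while a cache exists and render when it doesn't, which is why localising the cache is enough to switch modes.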
The other tough thing was figuring out how to do centre alignment. As I said in an earlier journal, I was using PDFLib's string_width function. The problem is you have to split the text into words, work out the width of each one, and add that to the width of the current line. If you exceed the max_width, you have to move to the next line. However, it's not quite that simple. It's easy enough to do the calculation, but you also need to output the stuff, and you can't do that until you know how wide the whole line is. This is where closures come in. For each string whose width you've checked, you push a closure onto a stack that outputs that string. Then when you finish processing, or have to move onto the next line, you divide the width of the line you've just finished by two, move to the middle minus that value, set the text position, then run your closures. Magically it all outputs just right. Perl++. This would be a lot more painful without closures - I'd have to store *way* more information in the stack.
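Roughly, the closure trick goes like this. This is a sketch under some loud assumptions: string_width() here is a fake that pretends every character is 6 units wide (it is not the PDFLib call), centre_text() is a made-up name, and instead of writing to a PDF the closures just record [x, y, text] triples.

```perl
use strict;
use warnings;

# Stand-in for a real font metric call: every character is 6 units wide.
sub string_width { return 6 * length $_[0] }

sub centre_text {
    my ($text, $max_width, $top_y, $leading) = @_;
    my @out;                       # [x, y, text] triples instead of PDF output
    my ($x, $y) = (0, $top_y);
    my $line_width = 0;
    my @pending;                   # closures waiting for the line to be finished

    my $flush = sub {
        return unless @pending;
        $x = $max_width / 2 - $line_width / 2;   # the middle, minus half the line
        $_->() for @pending;                     # now actually "output" the line
        @pending    = ();
        $line_width = 0;
        $y -= $leading;                          # move down to the next line
    };

    for my $word (split ' ', $text) {
        my $chunk = @pending ? " $word" : $word;
        my $w     = string_width($chunk);
        if (@pending && $line_width + $w > $max_width) {
            $flush->();                          # line is full: centre and emit it
            $chunk = $word;
            $w     = string_width($chunk);
        }
        $line_width += $w;
        # The closure remembers its own $chunk and $w, but shares $x and $y.
        push @pending, sub { push @out, [ $x, $y, $chunk ]; $x += $w };
    }
    $flush->();
    return @out;
}

my @placed = centre_text('aa bb cc', 36, 100, 10);
printf "%5.1f %5.1f '%s'\n", @$_ for @placed;
```

Each closure carries only the string and its width, and the starting x position is filled in at flush time, which is what saves storing all that state on the stack by hand.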
Anyway, I finally got the colour, bold, and italics stuff working too, and it's uploaded to CPAN with a nonsense example file in testfiles/ that people can look at and play with. Next time around I'll probably add some sort of command line processor to it.
If you get to try this, I'd love to get some feedback on it.