CAM::PDF v1.50: Better late than never

ChrisDolan on 2008-09-24T02:50:40

Back in PDF v1.5 (which corresponds to Acrobat 6, in 2003), Adobe added a feature that lets nearly all of a document's metadata be serialized in compressed blocks. It was the first completely incompatible feature Adobe had added to the document format since PDF v1.0, so adoption was slow even though it can shave about 20-30% off the document size.

Despite reading large swaths of the PDF v1.5 spec and fielding questions from about a hundred CAM::PDF users over the years, I had never heard of this feature. I overlooked it in the 952-page spec and never came across such a PDF in the wild...

...Until a month ago, that is. Suddenly, people were emailing me left and right about support for this feature. I'm not sure what changed. Something important (maybe a recent Acrobat release?) must have changed a default so that new docs use the compressed syntax.

Now CAM::PDF v1.50 supports reading compressed streams. It still only supports writing the older PDF v1.4 style streams, so as a side effect it's a useful tool for downgrading your PDFs for broader compatibility. Along the way I fixed a serious bug in the PNG decompressor in my code. Wow, I can't believe nobody hit that one before.
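As a rough illustration of the downgrade trick, here is a minimal sketch. It assumes CAM::PDF v1.50 or later is installed; downgrade_pdf is a hypothetical helper name and the file names are placeholders. It only uses the module's documented new() and cleanoutput() calls, and relies on the fact that the writer emits only the older-style syntax.

```perl
use strict;
use warnings;
use CAM::PDF;

# Hypothetical helper, not part of CAM::PDF: read a PDF (possibly one that
# uses the PDF v1.5 compressed syntax) and write it back out with CAM::PDF.
sub downgrade_pdf {
    my ($infile, $outfile) = @_;

    # new() returns undef on failure and leaves a message in $CAM::PDF::errstr
    my $pdf = CAM::PDF->new($infile)
        or die "Cannot read '$infile': $CAM::PDF::errstr";

    # cleanoutput() tidies the document and writes it to $outfile
    $pdf->cleanoutput($outfile);
    return $outfile;
}

# Usage: perl downgrade.pl in.pdf out.pdf
downgrade_pdf(@ARGV) if @ARGV == 2;
```

Writing to a separate output file rather than overwriting the input is deliberate: if a particular document does not survive the round trip, the original is still around.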

It works very well (pretty good unit tests) but just, uh, don't look too closely at the source code. I took some complex, 2002-era, barely-object-oriented code and added another layer of complexity on top of it. Man, if I had the time to refactor this, I would try to merge CAM::PDF's rich low-level feature set and speed with PDF::API2's saner API, and the Perl PDF world would be much happier. Maybe for Rakudo 1.0...