Mature optimization

TorgoX on 2006-06-23T07:26:15

Dear Log,

At times, I say no.

I take some little document, a little bit of text, and I want to make a PDF. I print to a PostScript file and then, before I run ps2pdf, I note that the weensy text document has become a massive PostScript document (on the way to being an not-quite-so-appalling PDF). I grit my teeth and ignore the absurd suboptimality of it all.

A while ago, I was fiddling around with the Chimes of the Clock of the Long Now. This is the bell chimes to be rung out by some musical clockwork designed by Danny Hillis and Brian Eno; it has ten bells (or will, once it's all put together), and when the clock chimes, daily, it rings the bells in a different order, in an order that occurs once in the 10,000 years that the clock is supposed to last.

After fiddling around with audio-format renderings of these chimes (as generated by my Perl implementation of the permutator algorithm) and thought, for kicks, that I could try generating sheet music. So I poured a month's worth of the note-values into a nice little music-typesetter program...

And it spits out a fifty KB file.

I twitch. I say THIS CANNOT STAND. The PS file contains 23KB of PostScript routines, that's fine. But just this once, it simply makes me batty that it takes 25KB of code to draw every page, which contains really only about 300 characters worth of information, which is actually the output of an algorithm that can be completely implemented in less than that much code.

I say just this once, I will fix this, I will make it not horrible.

I decided to approach it as an exercise in extreeeeeme code maintenance.

I toss out the never-unused routines in the PostScript prolog. I eliminate redundant parameter-passing. I even find and fix a stack leak! I zig, and zag, and zig again. I get it so that you can typeset a whole page by just passing a list-of-lists of the notes in the measures in each month. And then I implement Danny Hillis's permutator algorithm, and write tests for it, and an assertion test, and have that call the typesetter routine, so you can typeset a whole month's page by just calling year month P and that is not the kinda compressy goodness you'd get from just gzipping your PostScript.

Then typesetting all 10,000 years' worth of the music (well, throwing in an extra millennium for fun) is just a matter of calling:

2000 1 12999 {
   1 1 12     {
               1 index exch P
              } for
             }  for

And the resulting document is 27KB.

Therefore I win the Internet.

But it turns out that some PostScript viewers (GhostView, etc) aren't so thrilled with a tight loop that queues 132,000 pages at once. So I go back in and add some DSC stuff (friendlier for random-access navigation of PS documents), altho that means generating endless lines like this:

%%Page: 04011-01 133
4011 1 P
%%Page: 04011-02 134
4011 2 P
%%Page: 04011-03 135
4011 3 P
%%Page: 04011-04 136
4011 4 P
%%Page: 04011-05 137
4011 5 P
%%Page: 04011-06 138
4011 6 P

But hey, about thirty bytes a page is... acceptable. Especially compared to PDF having an overhead of about a thousand times that.

I am now pleased.

This victorious optimization should hold me for a while.

~ ~ ~

A historical note: my introduction to PostScript, sane, compact good PostScript, was reading the output of Gisle Aas's HTML::FormatPS, and then a bit of looking at the output of tek2ps.

That was well after all the good books about PostScript had been written and fallen out of print, but sadly before they appeared online.