The perils of efficiency

tmtm on 2001-12-09T22:10:57

Today I spent a long time banging my head off Class::DBI, and managed to chop another 200 or so lines out of it, in the never-ending clean-up quest.

Some of it was simple, like discovering that its 'rollback' method looked for the values you'd changed, and reloaded those from the database. So instead I made it just lose all memory of those, and let its normal 'lazy loading' mechanism fetch them the next time you asked for them.
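
To make the idea concrete, here is a minimal sketch in Python rather than Class::DBI's own Perl. The `Record` class, its method names and the `fetch_column` callback are all hypothetical, invented for illustration; they are not Class::DBI's API. It only shows the shape of the change: rollback forgets the changed values, and the ordinary lazy-loading path refetches them on demand.

```python
class Record:
    """Toy object with lazily loaded columns (hypothetical; not Class::DBI's API)."""

    def __init__(self, row_id, fetch_column):
        self.row_id = row_id
        self._fetch_column = fetch_column  # assumed callable: (row_id, column) -> value
        self._values = {}      # columns currently held in memory
        self._changed = set()  # columns modified since they were loaded

    def get(self, column):
        # Lazy loading: only go to the database when the value is missing.
        if column not in self._values:
            self._values[column] = self._fetch_column(self.row_id, column)
        return self._values[column]

    def set(self, column, value):
        self._values[column] = value
        self._changed.add(column)

    def rollback(self):
        # Forget the changed values instead of reloading them eagerly;
        # the next get() fetches only what is actually asked for.
        for column in self._changed:
            self._values.pop(column, None)
        self._changed.clear()
```

Done this way, a rollback costs no database trips at all unless somebody later reads one of the discarded columns.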

Other bits were hairier, but deeply satisfying, like removing its support for pseudo-hashes. I don't think anyone actually ever tried to use this, and if they did they're much too sick. Getting rid of this also allowed me to lose two modules from its multiple-inheritance tree. (Not that it ever actually needed them there, due to their (accidental?) mix-in behaviour.)

But far and away the scariest bit was untangling a twisted maze of parallel class data structures. When you set up a Class::DBI class you tell it about your database table: not just how to connect to it, but what the columns are, and how to group them. Then, when you call a method corresponding to one of the columns, it works out which column groups that column is in and what other columns are in those groups, and fetches them all at the same time. The assumption is that if you've grouped them well, you'll probably want those values soon anyway, so we may as well save a trip to the database.
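
The group-at-a-time fetch described above might look roughly like this. Again this is a Python sketch rather than the module's Perl, and the function name, group names and columns are made up for the example.

```python
def columns_to_fetch(groups, column):
    """Return every column that shares a group with `column`, in first-seen order."""
    wanted = []
    for members in groups.values():
        if column in members:
            for member in members:
                if member not in wanted:
                    wanted.append(member)
    return wanted


# Hypothetical table layout: group name -> columns in that group.
groups = {
    "Primary": ["id"],
    "Essential": ["title", "artist"],
    "Details": ["artist", "year", "notes"],
}
```

Asking for `artist` here would pull in `title`, `year` and `notes` in the same query, because `artist` sits in two groups.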

So, for efficiency, we had two different data structures - one mapping columns to groups and one mapping groups to columns. In class data. With a lot of little support methods to look after all the fancy cases.

But, like most things that get implemented for efficiency's sake, it was fast becoming unmaintainable. So, I ripped it all apart to only store the data once, and just look it up both ways around.
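
As a rough illustration of "store once, look up both ways", here is a Python sketch with hypothetical names (`ColumnGroups`, `columns_in`, `groups_of`). It is not the actual Class::DBI code, just the general technique: keep a single group-to-columns mapping and compute the reverse direction by scanning it.

```python
class ColumnGroups:
    """Single source of truth for column groups (illustrative names only)."""

    def __init__(self):
        self._groups = {}  # group name -> list of columns; the only copy kept

    def add(self, group, columns):
        self._groups[group] = list(columns)

    def columns_in(self, group):
        # Forward lookup: straight out of the stored mapping.
        return list(self._groups.get(group, []))

    def groups_of(self, column):
        # Reverse lookup: derived on demand by scanning the same mapping,
        # so there is no second structure to keep in sync.
        return [name for name, cols in self._groups.items() if column in cols]
```

The scan is linear in the number of groups, but with only a handful of groups per class that cost is tiny, and there is nothing left that can drift out of step.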

The result? Code that's much easier to maintain and extend, and a performance hit that seems to be less than 0.3%.

And having done all that, I think I can see a few ways in which the higher level performance of this can be tweaked, probably resulting in a nice net gain, as well as cleaner code.

Another victory for the Rules of Optimisation.