In Defense of Version Locking

brian_d_foy on 2007-09-18T10:16:00

Alias writes "Reading through Schwern's journal entry on why it's bad to have strict version-specific dependency locking (which I agree with entirely in the case of CPAN dependencies) I thought it might be prudent to mention a couple of cases where the opposite is true, because sometimes version locking is really useful.

Perl modules work as a series of separate resources that are not intrinsically aware of each other.

Any perl instance is going to load a bunch of these modules into memory, and there's really no guarantee that it is going to get the right ones.

So while you probably want loosely coupled minimum-version-or-greater dependencies between distributions, you generally want the opposite within a single distribution.

You WANT to know that the 10 classes in your distribution are all from the same version, and that there aren't accidentally 11 classes loaded (think deprecated plugins or class-renaming).

Loading different bits of multiple versions of the same distribution very often exposes problems, due to the many and varied implicit interdependencies that exist between parts of the same distribution as it evolves over time.

It's partly for these reasons that I've been suggesting for a long time that people always release distributions where the versions for every module match.

Maintenance of the version numbers can be minimised by using the ppi_version utility, which uses PPI to locate and safely modify all the versions in your distribution.

(This script coincidentally also means it is unnecessary to use ugly $Revision: auto-generated versions in your modules, a practice I hate and which ties the code to a single repository).
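As a rough illustration of the "every module matches" rule, here is a minimal sketch of a check that reports any class whose $VERSION differs from the expected one. The My::App packages and the mismatched_versions helper are hypothetical and defined inline only so the example is self-contained; a real distribution would load its modules from disk first.

```perl
use strict;
use warnings;

# Hypothetical distribution with three classes, defined inline for illustration.
{ package My::App;         our $VERSION = '1.00'; }
{ package My::App::Config; our $VERSION = '1.00'; }
{ package My::App::Plugin; our $VERSION = '0.99'; }  # stale copy from an older release

# Return the names of the classes whose $VERSION is missing
# or differs from the expected version string.
sub mismatched_versions {
    my ($expected, @classes) = @_;
    return grep {
        my $v = $_->VERSION;    # UNIVERSAL::VERSION returns the package's $VERSION
        !defined $v || $v ne $expected;
    } @classes;
}

my @bad = mismatched_versions(
    '1.00', qw(My::App My::App::Config My::App::Plugin)
);
print "Version mismatch: @bad\n" if @bad;
```

The comparison is a plain string match on purpose: within a single distribution the goal is "exactly the same release", not "this version or newer".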

Another important use for version-locking is from scripts to modules.

I've had a couple of different occasions, particularly with non-CPAN applications, where a launch script was accidentally running the wrong module versions, typically because of some @INC path issue.

Conversations typically run like this...


Hi Adam

That bug from version 1.234 seems to have reverted, and I'm not sure why.

All the tests pass ok, but when the app was run from cron last night, it failed with the same problem we had back in June, and it looks to be the same cause.

After spending a couple of hours trying to find the problem, it typically turns out that the new version of the launch script was loading the old version of the module, because the client had installed the original at some point but was running the new ones from standalone directories, or had hard-coded PERL5LIB into the cron script or the user profile, etc.

A few other odd situations have also bitten me from time to time, like the script loading the installed version during "make test" instead of the blib version, but this is essentially the same problem.

The application is not verifying its own integrity, and is instead trusting the Perl environment to do that job for it. And like most cases where you trust someone else to deal with your integrity for you, that's a bad idea.

The problem for the developer is not that the error happened (that's the admin/operator/client's fault) but that WHEN it happens, it's entirely non-obvious what the problem is, and that the program ALLOWED itself to load in a broken state.

After the second time this happened to me in a $real project, costing several hundred dollars of wasted time to resolve, I created the only::matching module (based on only.pm), which lets you put a version in your launch script and have an automatic check done that the loaded module matches the version of the caller.

#!/usr/bin/perl
 
use strict;
use vars qw{$VERSION};
BEGIN {
    $VERSION = '1.00';
}
 
# Load our matching module
use only::matching 'Foo';
 
...code...
Since ppi_version will now happily update scripts as well as modules, this version will be kept up to date too, and a ton of weird Perl environment edge cases will be caught immediately and reported with an appropriate error (saving a bunch of time for the person who made the mistake).
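To make the idea concrete, here is a hand-rolled sketch of the same kind of check. It is not taken from only::matching itself; the assert_matching_version helper is hypothetical, and Foo is defined inline as a stand-in for a module that would normally be loaded from @INC. Reporting the path from %INC is my own addition, since the path is usually what reveals an @INC mix-up.

```perl
use strict;
use warnings;

our $VERSION = '1.00';    # version of this launch script

# Stand-in for a real "require Foo"; defined inline so the sketch runs alone.
{ package Foo; our $VERSION = '1.00'; }

# Die with a descriptive message unless the loaded module reports
# exactly the version the calling script expects.
sub assert_matching_version {
    my ($module, $expected) = @_;

    my $loaded = $module->VERSION;
    $loaded = 'undef' unless defined $loaded;

    # %INC maps "Foo/Bar.pm" to the file the module was actually loaded from.
    (my $file = "$module.pm") =~ s{::}{/}g;
    my $path = $INC{$file} || 'unknown location';

    die "$module version $loaded (from $path) does not match "
      . "script version $expected\n"
        unless $loaded eq $expected;

    return 1;
}

assert_matching_version('Foo', $VERSION);
```

With a check like this at the top of the script, a stale installed copy produces an immediate error that names the offending file, rather than a mysterious regression from cron.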

Another really interesting case for version locking, which I'm doing some work on at my current $job, is database schema locking.

Once an application becomes sufficiently large (70+ tables and tons of PL/SQL in this case) it's simply impractical to check the database structure at connect-time, and dangerous to just run the code on any old database you connect to (especially if it holds canonical production data and is more than, say, 10 GB in size).

So I'm adding a variant of the ActiveRecord::Migrate methodology.

The database gets a table that looks something like this...

create table schema_info (
    version integer not null,
    name varchar2(255) null
);
This table contains a single record, with a version that is the revision of the schema itself, and a name that can be used as a database-independent identifier for the schema (this is needed in our case because the same code runs two different sites, with different databases, and we want to be certain that the code on one site isn't accidentally connecting to the opposite database).

In the main database class is a simple $SCHEMA_VERSION variable that contains the schema version that the code is written against, and whenever a new connection is made to the database, the version in that variable is checked against the version in the schema_info table.
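A minimal sketch of what that connect-time check might look like follows. The My::Database class name, the $SCHEMA_NAME variable, the example values and the check_schema method are all assumptions made for illustration; only $SCHEMA_VERSION and the schema_info table come from the description above. The $dbh argument is assumed to be a DBI database handle, and selectrow_array is the standard DBI method.

```perl
package My::Database;    # hypothetical name for the main database class

use strict;
use warnings;

# Schema revision and site name this code was written against
# (illustrative values, not from any real project).
our $SCHEMA_VERSION = 42;
our $SCHEMA_NAME    = 'site_a';

# Die unless the connected database reports the expected schema.
# $dbh is assumed to be a DBI database handle.
sub check_schema {
    my ($class, $dbh) = @_;

    # schema_info holds a single row, so one fetch is enough.
    my ($version, $name) = $dbh->selectrow_array(
        'select version, name from schema_info'
    );

    die "schema_info is empty; refusing to use this database\n"
        unless defined $version;

    $name = '' unless defined $name;
    die "Schema mismatch: code expects $SCHEMA_NAME version $SCHEMA_VERSION, "
      . "database reports '$name' version $version\n"
        unless $version == $SCHEMA_VERSION && $name eq $SCHEMA_NAME;

    return 1;
}

1;
```

The method would be called once, immediately after DBI->connect succeeds and before any other query is issued, so that a mismatch stops the application before it touches any data.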

In this way, we can be sure that the application is always talking to the "right" database, and so if one developer checks in a schema change, the tests for everyone else don't start exploding at next update just because the code is now issuing SQL queries for columns the other developers don't have yet.

And on production, instead of connecting to the wrong database and doing who knows what before a problem brings down the site, the app will simply outright refuse to connect AT ALL unless it is talking to the correct database.

This is a MUCH better failure mode, and it also makes life easier for the operations team, because their error/crash reports will tell them exactly what the root cause of the problem is, which means the downtime is minimised."


Database schema version

ask on 2007-09-18T17:46:17

I do the schema versioning in my projects too.

Example data files:
https://svn.develooper.com/combust/trunk/sql/combust.update
https://svn.develooper.com/projects/ntppool/trunk/sql/ntppool.update

Script to run updates:
https://svn.develooper.com/combust/trunk/bin/database_update

It has a --sql parameter that'll output the SQL instead of running it. Helpful when the production databases are run by a DBA team who don't have anything to do with the servers the code is on.

  - ask

Single versioning

schwern on 2007-09-19T02:15:10

> It's partly for these reasons that I've been suggesting for a long time that
> people always release distributions where the versions for every module match.

Yeah, I'm coming around to that conclusion, too. The possible benefit of knowing specifically which modules in a distribution changed using individual version numbers is outweighed by just knowing what goes with what.

Also it's a pain in the arse to remember to only increment what's changed in the release.