PAR evilness to make remote auto-upgrading work

tsee on 2009-01-25T22:01:42

When I think about telling people about PAR internals, I'm reminded of what a colleague replied when he was asked about an icky detail of his analysis:

"You don't want to know how sausages are made!"

But then I can't resist grossing out people with some details anyway...

Two years ago, I wrote PAR::Repository::Client as an interface for loading PARs and thus arbitrary modules from a remote server. If the client is installed, all you need to do to auto-load missing modules from the server is:

  use PAR { repository => 'https://foo.com/myapp' };
  use Foo; # will be loaded from remote if necessary

But since this can become expensive, and caching the binaries removes only part of that cost, the "install" option has been part of the interface almost from the start:

  use PAR { repository => 'https://foo.com/myapp', install => 1 };
  use Foo; # will be loaded AND INSTALLED if necessary

Back then, I also added most of the code necessary for an "upgrade" option:

  use PAR { repository => 'https://foo.com/myapp', upgrade => 1 };
  use Foo; # will be loaded AND INSTALLED OR UPGRADED if necessary

Unfortunately, it was missing a few critical details until today. The repository client is normally only invoked when all other sources fail. But that's a problem if you're trying to check for upgrades. Thus, repositories in upgrade-mode are now checked early in the module-loading process.
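
To illustrate why the position in @INC matters, here is a minimal sketch. It does not show PAR's actual hook placement, only the general effect: a hook at the end of @INC is asked only when nothing earlier could deliver the file, while a hook at the front is asked first. The module names Local::Thing and Local::Missing and all variable names are made up for the example.

```perl
use strict;
use warnings;

my @log;   # records which hook was asked for which file

# Stand-in for "all other sources": serves the hypothetical Local::Thing
# from an in-memory file handle, the ordinary return value of a hook.
my $provider = sub {
    my ($self, $file) = @_;
    return unless $file eq 'Local/Thing.pm';
    push @log, 'provider';
    my $source = "package Local::Thing;\n1;\n";
    open my $fh, '<', \$source or die "in-memory open failed: $!";
    return $fh;
};

# Fallback-style hook at the end of @INC: declines everything, just logs.
my $late  = sub { push @log, "late:$_[1]";  return };
# Upgrade-style hook at the front of @INC: declines everything, just logs.
my $early = sub { push @log, "early:$_[1]"; return };

push    @INC, $provider, $late;
unshift @INC, $early;

require Local::Thing;               # served by the provider
eval { require Local::Missing };    # nobody can serve this one

print "$_\n" for @log;
```

The early hook is consulted for both files, whereas the late hook only gets to see Local/Missing.pm, the one that no earlier source could provide.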

The real bummer was that in order to check for upgrades, the locally installed version has to be determined. Since this is hard to do reliably without loading the module, that's what PAR has to do. But that means require()ing module X from within an early @INC hook that ran due to a "require X;". There are so many things wrong with that idea, it's not even funny. It seems that creating an infinite recursion in an @INC hook segfaults perl 5.8.9. Regardless, it can be (and was) made to work:

1. Before running the client's upgrade_module method, dynamically override the set of active PAR::Repositories (as registered via PAR.pm) to be empty, so that the nested require cannot end up in a repository client again.
2. Run the current repository client's upgrade_module method, which will attempt to require the module in order to check its version.
3. Afterwards, check %INC to see whether the module was loaded.
   - If not, continue normally. This probably ends in either failing to load the module from anywhere or in loading the freshly installed copy.
   - If the module was loaded, prevent it from being loaded a second time with an evil trick in the @INC hook:

  my $line = 1;
  return \*I_AM_NOT_HERE, sub { $line ? ($_="1;",$line=0,return(1)) : ($_="",return(0)) };

Even disregarding the slight obfuscation, can you figure out how this works?

One obscure feature of @INC and module loading is the return value(s) of a subroutine @INC hook. Normally, the hook simply returns a file handle that the module code is then read from. But if it returns a code ref as its second return value, that code ref is called repeatedly until it returns false. After each invocation, $_ is assumed to contain the next line of the module code. If the first return value was a file handle nonetheless, $_ is initialized to the next line from that file handle before each call to the subroutine.
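
As a minimal sketch of the line-generator form: the hook below returns only a code ref, which the documentation for require in perlfunc describes as a source generator when no file handle precedes it. I am assuming a reasonably recent perl here. The module name My::Generated is hypothetical and nothing by that name exists on disk.

```perl
use strict;
use warnings;

# The "file" we want require to see, one line per element.
my @source = (
    "package My::Generated;\n",
    "sub answer { return 42 }\n",
    "1;\n",
);

push @INC, sub {
    my ($self, $file) = @_;
    return unless $file eq 'My/Generated.pm';   # decline everything else

    # Generator: called once per line until it returns false.
    return sub {
        return 0 unless @source;   # no lines left: end of the "file"
        $_ = shift @source;        # hand the next line to the parser
        return 1;
    };
};

require My::Generated;
print My::Generated::answer(), "\n";
```

The generator never touches a real file. Everything require compiles comes from what the code ref puts into $_.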

The motivation here is mostly that we want to set the file contents to "1;". Unfortunately, passing undef as the file handle resulted in the subroutine not being called. This smells like a bug in perl to me, but I'll have to check that more closely with blead. Furthermore, it's not wise to load any unnecessary modules in PAR.pm, as they would have to be included verbatim in an uncompressed part of PAR::Packer-created executables. Therefore, instead of simply passing an IO::Handle->new(), I'm supplying an arbitrary GLOB ref.

Finally, the subroutine itself simply sets $_ to "1;" in the first invocation and returns zero on the second to stop the evaluation, thus essentially short-circuiting require()'s loop through @INC.
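
The whole dance described above can be sketched in a self-contained way. This is not PAR's code. It is an illustration under a few assumptions: a plain flag ($hook_disabled) stands in for "the set of active repositories is empty", a temporary directory stands in for the local installation, no server is contacted, and the generator is returned on its own rather than behind a GLOB ref. Demo::Mod and all other names are made up.

```perl
use strict;
use warnings;
no warnings 'once';   # the Demo::Mod variables are set in a generated file
use File::Temp qw(tempdir);
use File::Spec;

# Write a throwaway module to disk so the nested require can find it.
my $dir = tempdir(CLEANUP => 1);
mkdir File::Spec->catdir($dir, 'Demo') or die "mkdir failed: $!";
my $path = File::Spec->catfile($dir, 'Demo', 'Mod.pm');
open my $out, '>', $path or die "open failed: $!";
print {$out} "package Demo::Mod;\n",
             "our \$VERSION = '0.01';\n",
             "\$Demo::Mod::body_runs++;\n",   # counts how often the body runs
             "1;\n";
close $out or die "close failed: $!";

our $hook_disabled = 0;   # stand-in for "no active repositories"

sub early_hook {
    my ($self, $file) = @_;
    return if $hook_disabled;                # nested require: stay out of the way
    return unless $file eq 'Demo/Mod.pm';

    {
        local $hook_disabled = 1;            # dynamically scoped, undone on exit
        eval { require $file; 1 };           # load it to learn the local version
        # A real client would now compare $Demo::Mod::VERSION with the server.
    }

    return unless $INC{$file};               # not loaded: let require carry on

    # Already loaded: feed require a one-line file "1;" instead of the real one.
    my $first_call = 1;
    return sub {
        if ($first_call) {
            $first_call = 0;
            $_ = "1;\n";
            return 1;                        # one line delivered
        }
        $_ = '';
        return 0;                            # end of the fake file
    };
}

unshift @INC, \&early_hook;   # consulted before any directory
push    @INC, $dir;           # the "installed" copy lives here

require Demo::Mod;
print "version $Demo::Mod::VERSION, body ran $Demo::Mod::body_runs time(s)\n";
```

The point of the counter is to check that the module body is executed only by the nested require. The outer require just compiles the stub and is satisfied.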

After going through this considerable pain, I got the auto-upgrading feature of PAR::Repository::Client to work. There are probably still bugs, and testing it as part of the test suite is no fun (but still feasible).

Stay tuned for a new release of the involved modules.

Cheers,
Steffen