Monkey patching CPAN distros with CPAN.pm

autarch on 2007-09-03T17:53:47

Florian Ragwitz, a Perl guy and Debian developer, brought to my attention a really cool new feature in recent versions of CPAN.pm, called "Distroprefs". It allows you to configure CPAN.pm to do various custom "things" when installing a distro. One major use case is automating interactive installers with Expect, which is obviously useful.

Another thing you can do is tell CPAN to apply a patch to a module as it installs it. To make things extra cool, it can fetch this patch from CPAN itself, making it easy to create "mini-forks" of a module. Of course, long term, we want to get these patches integrated back into the distro, or create a real fork with a different name, but in the short term this can be quite handy.

The particular module that Florian and I were discussing is Devel::Cycle. This is in turn used by Test::Memory::Cycle. I use the latter module in various distros where I want to make sure that I'm properly weakening references to prevent circular references, which lead to memory leaks.

Devel::Cycle has a cool feature where it will use PadWalker to peek inside closures as part of its cycle detection. This is great, since it's quite easy to create a circular reference with a closure. What's not so great is that Devel::Cycle is pretty broken when it does this. If the closure contains any references that aren't references to scalars, it blows up.
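To make the closure problem concrete, here is a minimal sketch of how a closure can create a circular reference and how weakening breaks it. It uses only the core Scalar::Util module, not Devel::Cycle itself, and the Node class and the make_leaky/make_clean helpers are made-up names for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Counts how many Node objects have been destroyed so far.
my $destroyed = 0;

{
    # Hypothetical class, used only for this illustration.
    package Node;
    sub new     { my ($class) = @_; return bless {}, $class }
    sub DESTROY { $destroyed++ }
}

# Leaky: the closure holds a strong reference to $self,
# and $self holds the closure, so neither is ever freed.
sub make_leaky {
    my $self = Node->new;
    $self->{callback} = sub { return $self };
    return;
}

# Fixed: the closure only captures a weakened copy of the reference.
sub make_clean {
    my $self = Node->new;
    my $weak = $self;
    weaken($weak);
    $self->{callback} = sub { return $weak };
    return;
}

make_leaky();
my $after_leaky = $destroyed;    # still 0: the leaked object is alive

make_clean();
my $after_clean = $destroyed;    # now 1: only the clean object was freed

print "destroyed after leaky: $after_leaky, after clean: $after_clean\n";
```

The leaked object is only cleaned up when the interpreter exits, which is exactly the kind of thing a cycle detector is supposed to catch earlier.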

I ran into this when writing a test for DateTime::Format::Builder, which had circular references that needed removing. I fixed Devel::Cycle, submitted a patch via the RT ticket, and installed the patched version locally. This let me run the tests to prove that the circular ref was fixed, and I uploaded a new version of DateTime::Format::Builder to CPAN. This promptly caused breakage because everyone else's Devel::Cycle is still broken.

So distroprefs to the rescue! I've uploaded my patch for Devel::Cycle to CPAN and you can configure your CPAN.pm to apply it. First start the cpan shell and run o conf init /prefs/ and o conf init /patch/. This lets you tell the cpan shell where to find your distropref files and where to find the programs it needs to apply patches. Make sure to save your prefs with o conf commit if you don't have autocommit enabled.
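As a rough picture, the session looks something like the transcript below. The prompts are illustrative and may look different in your version of CPAN.pm, and the questions each init step asks are left out.

```text
$ cpan
cpan> o conf init /prefs/
cpan> o conf init /patch/
cpan> o conf commit
```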

Then create a file in your distroprefs dir (which is probably ~/.cpan/prefs) named LDS-Devel-Cycle.yml (that's [AUTHOR]-[DISTRONAME].yml). In that file stick this bit of YAML:

---
comment: "Fixes coderef bug"
match:
  distribution: '^LDS/Devel-Cycle-1\.07\.tar\.gz$'
patches:
  - "DROLSKY/patches/Devel-Cycle-1.07-DROLSKY-coderef.patch.gz"

The "comment" bit isn't necessary but it's a nice feature so you know what this is for. Then next time you install Devel::Cycle 1.07, it will apply my patch as well.
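If you want to convince yourself that the match pattern only hits the 1.07 tarball, a quick sketch like the following tries it against a few distribution IDs. This assumes CPAN.pm compares the pattern against an ID of the form AUTHOR/Dist-Version.tar.gz; apart from the real 1.07 ID, the candidate names are invented just to exercise the regex.

```perl
use strict;
use warnings;

# The same pattern as the "distribution" entry in the distropref file.
my $pattern = qr{^LDS/Devel-Cycle-1\.07\.tar\.gz$};

# The second and third IDs are hypothetical, used only for comparison.
my @candidates = (
    'LDS/Devel-Cycle-1.07.tar.gz',
    'LDS/Devel-Cycle-1.08.tar.gz',
    'LDS/Devel-Cycle-1x07.tar.gz',
);

# Record which IDs the pattern accepts.
my %matched;
for my $id (@candidates) {
    $matched{$id} = $id =~ $pattern ? 1 : 0;
}

for my $id (@candidates) {
    printf "%-30s %s\n", $id, $matched{$id} ? 'patched' : 'left alone';
}
```

Only the first ID matches. The third one shows why the dots are escaped: an unescaped dot would match any character there.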

There are still a couple of things missing to make this really useful. First, after you install the patched version, there won't be any indication that your 1.07 is different from the vanilla 1.07. I'm not sure exactly how this should work, though. My first thought was that the patcher could patch the version number too, to something like "1.07-coderef-patch". That doesn't scale beyond one patch, and non-numeric version numbers always make the toolchain cry anyway.

Second, there's no good mechanism for advertising the existence of these patches. Ideally, this would be integrated into search.cpan.org. It could even let you select some patches and generate the necessary YAML bits for you to store.

The CPAN distro comes with a directory full of example distropref files. You can browse it via search.cpan.org. There are a ton of examples, and if you click on a module you're familiar with, you'll get the idea of how it all works really quickly.

I think this is a really interesting idea and I give Andreas major props for writing it. If only he'd publicized it ;)