Hacking on Want

robin on 2005-06-30T23:48:39

It's been a while since I've done any serious hacking on anything Perl-related. Yesterday I woke up to a message from Damian Conway, reporting a subtle bug in my Want module. I haven't been very good at responding to bug reports of late (most of them are EBKAC or known bugs), but I think Damian has earnt the right to be taken seriously.

It took me most of the day to track down the bug and fix it, but it was an interesting journey. What he found is that, if you call want() from (a sub that's called from) within the guard of a loop, it crashes the second time through.

It turns out that this happened because of a subtle design flaw in Want. Perl doesn't really have any proper introspection capabilities, so modules like Want have to be cunning and take advantage of data that's around for other reasons. To decide what context a sub is called in, Want locates the part of the optree where the sub is called, and then trawls it to find the essence of the expression the sub call is in. (For example, foo() + 2 means foo is called in numeric context, whereas foo() && 2 means it's called in boolean context.)

There's no easy way (that I know of) to find the right part of the optree, but there are various bits of information around that give enough of a clue. The activation record for a sub records the last statement that was executed before the sub call, and the address the sub should return to. So I walk the optree, starting at the last statement, until I find the return address; then I know where the sub must have been called from.

The second time through a loop, however, the last statement executed can be one that comes after the return point, so the walk keeps going and going but never finds what it's looking for.

It took me a while to see how to fix it, but in the end I found a way. It so happens that loops, as well as subroutines, leave an activation record on the context stack, so the new code does this: after it's found the activation record for the sub, it keeps looking up the stack to see if there's a loop around the sub call. If there is, the optree walk starts at the beginning of the loop instead. That seems to fix it.
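
To make the walk, the failure, and the fix concrete, here is a toy model in Python. It is only a sketch and not the real Want code: the optree is flattened into a simple linked list in source order, and the names Op, Frame, find_call_site and choose_start are invented for illustration. Where the real module crashed, this model just returns None.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


class Op:
    """A toy op: a name plus a link to the next op in source order."""
    def __init__(self, name: str):
        self.name = name
        self.next: Optional["Op"] = None


def build(names: List[str]) -> List[Op]:
    """Chain the ops together in the order given."""
    ops = [Op(n) for n in names]
    for a, b in zip(ops, ops[1:]):
        a.next = b
    return ops


@dataclass
class Frame:
    """A toy activation record on the context stack (fields are assumptions)."""
    kind: str                        # "sub" or "loop"
    last_stmt: Optional[Op] = None   # sub frames: last statement run before the call
    return_op: Optional[Op] = None   # sub frames: op the sub returns to
    loop_start: Optional[Op] = None  # loop frames: first op of the loop


def find_call_site(start: Op, return_op: Op) -> Optional[Op]:
    """Walk forward from start until the op just before the return address."""
    op = start
    while op is not None:
        if op.next is return_op:
            return op
        op = op.next
    return None  # ran off the end without finding the return address


def choose_start(stack: List[Frame], use_loop_fix: bool) -> Tuple[Op, Op]:
    """Pick where the walk starts. stack[-1] is the innermost frame."""
    idx = next(i for i in range(len(stack) - 1, -1, -1) if stack[i].kind == "sub")
    sub = stack[idx]
    if use_loop_fix:
        # Keep looking up the stack for a loop around the sub call.
        for frame in reversed(stack[:idx]):
            if frame.kind == "loop":
                return frame.loop_start, sub.return_op
            if frame.kind == "sub":
                break  # assumption: a loop in an outer sub is not around this call
    return sub.last_stmt, sub.return_op


def locate(stack: List[Frame], use_loop_fix: bool) -> Optional[Op]:
    start, return_op = choose_start(stack, use_loop_fix)
    return find_call_site(start, return_op)


# Toy optree for:  stmt; while (foo()) { print ... }
ops = build(["nextstate", "enterloop", "entersub", "and",
             "nextstate", "print", "leaveloop"])
before_loop, loop_start, call, ret, body = ops[:5]

loop = Frame("loop", loop_start=loop_start)
first_pass = [loop, Frame("sub", last_stmt=before_loop, return_op=ret)]
second_pass = [loop, Frame("sub", last_stmt=body, return_op=ret)]

print(locate(first_pass, use_loop_fix=False).name)   # entersub
print(locate(second_pass, use_loop_fix=False))       # None
print(locate(second_pass, use_loop_fix=True).name)   # entersub
```

On the second pass the last statement executed is the one in the loop body, which sits after the return point, so the unfixed walk falls off the end. Starting from the loop's first op instead finds the call again.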

I'm just waiting for Damian to give the all clear before I release the new version.