Every so often someone reports weird behavior in #parrot, and someone says "Hey, that looks like a GC bug!"
Most of them aren't. (Most of them lately seem to be that we're changing the way bytecode works, and we don't have all of the dependencies for all of the generated PBC files correct, so you have to run make realclean
and rebuild.)
While adding the vtable override cache the other day, I did create a forehead-slapper of GC bug, but I caught it before I checked in the code. How did I know it was a GC bug? Easy.
The Class PMC itself contains pointers to several other PMCs and GC entities, including the name of the class and its corresponding namespace. I added a pointer to a Hash PMC which maps the names of vtable overrides to Sub PMCs.
I remember thinking at the time "Hey, it's just a cache. I don't have to mark it during the mark phase explicitly. All of the Subs it refers to will stay alive as long as their namespaces live. That's easy."
When I ran the tests, I saw a weird error about not being able to perform a
keyed index PMC lookup on a Key PMC. I set a breakpoint on the
real_exception
function (which reports these kinds of errors) in
the debugger, and the backtrace showed that the cause of the call was my cache
lookup function.
"That's weird," I thought. Then I realized what I had done.
My line of thinking was correct in that I don't have to mark all of the PMCs contained in the cache PMC. They're already reachable from the rootset through the namespace. The GC won't collect them.
The problem is that the cache itself -- the Hash PMC -- is only reachable through the Class PMC. Unless it gets marked as live, the GC will reclaim its header and put it on the free list again.
The Class PMC still has a pointer to that header, but the next PMC allocated from the GC which uses that header will overwrite the PMC's information, effectively morphing my lovely cache into something else. In this case, my Hash PMC turned into a Key PMC.
Usually they're not this obvious, but I've gone through all of the PMCs in Parrot to make sure they mark their contained GCable entities appropriately.