I was getting ready to go see some music last night in preparation for the New Year when I got the call from work. "Hey Fred, there's a problem with the mission critical application that keeps our business running." Ugh.
Logged in, checked out the issue. Class::DBI was failing on a retrieve with id -2147483547. Huh? A negative object id? I'm pretty good with numbers, and that number looked damn close to 2**32/2. That instinct probably comes from my days writing Assembler and C. I got a sinking feeling in my gut that there was a serious problem in our codebase that I wouldn't be able to fix easily. Was this a y2.007k-1 bug?
I dug through the call stack and, after some furious debugging, found that the object id was being changed from 2147483749 to -2147483547 between two components. Oh great, a 32-bit int bug in our application framework. Fun.
After a few more minutes of hacking, I was at a loss. Then I looked at the way the component was being called:
<% 'foo.cmp', obj_id => sprintf("OBJ_%d", $obj_id) %>
Hrm. I'm not that familiar with the guts of perl, but... could it possibly be that sprintf was using a signed int?
phred@pooky ~ $ perl -e 'print sprintf("foo %d", 2147483749);'
foo -2147483547
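The one-liner above only misbehaves on a perl built with 32-bit integers, so it will not reproduce on a 64-bit build. As a rough sketch, the same wraparound can be shown on any build by reinterpreting the id as a signed 32-bit value with pack and unpack. The helper name as_signed_32 is made up for this illustration and is not part of the application.

```perl
use strict;
use warnings;

# Hypothetical helper: reinterpret an unsigned 32-bit value as signed.
# The 'L' and 'l' pack templates are exactly 32 bits wide, so the
# result is the same on 32-bit and 64-bit builds of perl.
sub as_signed_32 {
    my ($n) = @_;
    return unpack('l', pack('L', $n));
}

my $obj_id = 2147483749;

print as_signed_32($obj_id), "\n";   # -2147483547
print $obj_id - 2**32, "\n";         # -2147483547, the same wrap done by hand
```

Any id above 2147483647 (2**31 - 1) comes out negative this way, which matches the value Class::DBI was choking on.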
Phew!! I gave ops the good news and replaced the sprintf with a more sane quoting operator. I guess whoever wrote that sprintf didn't figure we'd ever process over 2 billion objects!
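The post does not show the exact replacement, so take this as a guess at what it might have looked like. Both variants below treat the id as a string instead of pushing it through %d, so no signed integer conversion happens.

```perl
use strict;
use warnings;

my $obj_id = 2147483749;

# Plain string interpolation: the id is stringified, never cast to an int.
my $interpolated = "OBJ_$obj_id";

# Keeping sprintf but formatting with %s has the same effect.
my $formatted = sprintf("OBJ_%s", $obj_id);

print "$interpolated\n";   # OBJ_2147483749
print "$formatted\n";      # OBJ_2147483749
```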
My quest was not done yet. I went into my copy of 5.8.8 source, hunted around a bit, and then found:
int PerlIO_sprintf(char *s, int n, const char *fmt, ...)
/* quick definition of c int type added for the lazy rusty C reader */
int   4 bytes   32 bits   -2,147,483,648 -> +2,147,483,647 (2Gb)
There it is. Is it a bug? Depends on your point of view I guess. I would have to say no - we are using a 32 bit platform, so something like sprintf should be expected to behave that way.
All in all, things ended a lot better than I expected. Looks like I'll be able to go see music tonight :)
Happy New Year!
The problem might be far more serious then than the easily fixed issue is now. This might be a good time to test that, while the problem is still fresh in your mind.
I am guessing it'll take roughly the same amount of time to get to that point as the app has already been in use. So you have a reasonable estimate of how much time you have left to fix it.