A co-worker and I tracked down an incredibly annoying bug today. Our client/server app would just seem to hang after a while. Oh, the process was still running and there was nothing in the error log, but it just wouldn't return anything anymore. At first, we assumed the problem was with the server itself, so there was lots of debugging going on there. To make matters more interesting, this only seemed to happen on Linux.
We finally narrowed it down to a file descriptor issue - the server was sucking them all up. Eventually, it would use them all and then it would hang. More debugging of the server ensued.
We *finally* narrowed it down to sys-proctable, a C extension based on Dan Urist's Proc::ProcessTable. So, somewhere I managed to eff up my C code so that it's not letting go of a filehandle. My attempts to shift the blame to Dan failed - his code does NOT suffer from the same bug (I checked).
The really weird thing is that everything seemed to be OK about 2 weeks ago. Why wasn't it failing before? I have no frikkin clue.
It's that kinda stuff that gets my stomach in knots. Now I feel like a complete knucklehead for stressing over this for the last week or so.
UPDATE: Checking the C source I see that I have 4 calls to open() and 3 calls to close(). That could be a problem...YA THINK? All better now. :)
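For anyone wondering how you end up with more open() calls than close() calls, here is a minimal sketch of one common way it happens: an early return on an error path that skips the close(). To be clear, this is NOT the actual sys-proctable code. The function names read_file_leaky and read_file_fixed are made up for illustration, and the early-return cause is an assumption; all the post above establishes is the 4-vs-3 count.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper, not the real sys-proctable source.
 * Reads up to len-1 bytes of path into buf and NUL-terminates it.
 * Returns the number of bytes read, or -1 on error or empty file.
 * Assumes len >= 1. */
ssize_t read_file_leaky(const char *path, char *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, buf, len - 1);
    if (n <= 0)
        return -1;      /* BUG: bails out without close(fd) */

    buf[n] = '\0';
    close(fd);
    return n;
}

/* Same helper with the descriptor released on every path. */
ssize_t read_file_fixed(const char *path, char *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, buf, len - 1);
    close(fd);          /* one close covers success and failure */
    if (n <= 0)
        return -1;

    buf[n] = '\0';
    return n;
}
```

Call the leaky version in a loop on something that hits the error path and every call eats one descriptor until the process runs out. Closing right after the read, before any error checks, keeps the open/close pairing obvious at a glance.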
Re:all file handles
djberg96 on 2003-07-16T18:24:07
Nope - it was definitely a bug in my code. I just nailed it. Fortunately it was a simple procedure. It went something like this (imagine me counting this stuff manually):
Number of "open" calls...1..2..3..4.
Number of "close" calls...1..2..3...hmmmm..
Five minutes later I had it fixed and created a little test with an FD counter to make sure. All better now.
:)
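In case it helps anyone else, here is a rough sketch of what an FD counter test can look like on Linux. It is not the test I actually wrote; it just counts the entries under /proc/self/fd before and after a call and reports the difference. The names count_open_fds, fds_leaked_by, leaky_call and balanced_call are all made up for this example, and it assumes /proc is mounted.

```c
#define _POSIX_C_SOURCE 200809L  /* for dirfd() under strict -std modes */
#include <dirent.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Count this process's open descriptors by listing /proc/self/fd.
 * Linux-specific. Returns -1 if the directory cannot be opened. */
int count_open_fds(void)
{
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL)
        return -1;

    int self = dirfd(dir);  /* the descriptor used for this listing */
    int count = 0;
    struct dirent *entry;

    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.')
            continue;       /* skip "." and ".." */
        if (atoi(entry->d_name) == self)
            continue;       /* do not count our own directory handle */
        count++;
    }
    closedir(dir);
    return count;
}

/* Run fn once and return how many descriptors it left open. */
int fds_leaked_by(void (*fn)(void))
{
    int before = count_open_fds();
    fn();
    return count_open_fds() - before;
}

/* Two stand-ins for the code under test (hypothetical). */
void leaky_call(void)
{
    (void)open("/dev/null", O_RDONLY);  /* opened, never closed */
}

void balanced_call(void)
{
    int fd = open("/dev/null", O_RDONLY);
    if (fd >= 0)
        close(fd);
}
```

Point fds_leaked_by at whatever function you suspect and anything other than zero means a descriptor went missing. Running the suspect call a few hundred times and checking that the count stays flat works just as well.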