top spin

djberg96 on 2002-11-12T14:36:49

Yet more time spent on top last night. I'm down to figuring out how cpu time per process is calculated. Searching Google proved fruitless, as everyone spouts off "getrusage", "clock" or "times(2)" in response to this question, none of which can get cpu time for an arbitrary pid - they can only get their own process info (or their children's). The fine folks of IRC were unable to help, either.

I'm sure it's some stime or utime calculation, but I haven't determined what it is yet, as none of my guesses seemed to work out to a meaningful value.

In better news, the Packers are absolutely crushing everyone in their path. I'm giddy.

UPDATE: The formula is (utime + stime) / CLK_TCK, rounded up (at least, as far as I can tell).

top

gnat on 2002-11-12T16:19:14

There are two ways to get access to the info that top(1) has. The first is to scruffle through /proc, which has always seemed a waste of time to me--between getting a list of files and processing them all, things have changed. The second is to use the kvm_*() system calls and scruffle through the kernel's process table yourself. This is what top does, and it's also subject to the problem that it takes more than one tick for you to do all the scruffling, so the information you're working with is always out of date. It's less lossy than /proc, I think, because it doesn't have the I/O overhead.

We did this as a 300 level Operating Systems project at uni--we had to fork processes, get the process table and walk the binary data structures to display information about them. The reason this is hard and top(1) is worshipped is that every operating system has a different process table structure, and some have different kvm calls. It's impossible to do portably without a zillion "if it's this operating system, do it this way; if it's this operating system, do it another way; ..." #ifdefs.

I've often wondered why the information in top(1) isn't made available through a C library, so that you can write programs that learn all about the system without the hell of writing top(1).

--Nat

Re:top

djberg96 on 2002-11-12T19:13:00
The first is to scruffle through /proc, which has always seemed a waste of time to me--between getting a list of files and processing them all, things have changed. The second is to use the kvm_*() system calls
I'm not sure my pea-brain could handle the kvm api, though to be honest I haven't looked - just heard it was tough. I also thought it was a Solaris-only thing. Guess not. I'm also not sure I want the "more than one tick" issue. For now I can live with data that's .05 seconds out of date.
It's impossible to do portably without a zillion "if it's this operating system, do it this way; if it's this operating system, do it another way; ..." #ifdefs.
One very nice trick I learned from Dan Urist's Proc::ProcessTable module was to create separate source files for each platform, and simply link to the appropriate one based on a uname call (or $^W). This not only prevents a mass of ugly, nested #ifdefs, but makes source control and testing much easier. If I modify the Solaris source, for example, I don't have to go back and retest the other platforms.
I've often wondered why the information in top(1) isn't made available through a C library, so that you can write programs that learn all about the system without the hell of writing top(1).
Actually, Solaris does this to some degree with their psinfo struct. Now, why a platform can't have plain text data under /proc AND a library that provides an interface to it is beyond me. For now, it's one or the other.

Re:top

jmm on 2002-11-13T15:32:12
One very nice trick I learned from Dan Urist's Proc::ProcessTable module was to create separate source files for each platform, and simply link to the appropriate one
The danger to this approach is that many of the platforms will have code that is almost the same, differing only in a few details. When you fix a bug in that mostly common code in one, it is easy to forget to fix the same bug in the others because they have separate copies of the (almost) identical code. You can have the same problem with IFDEF code too, if the ifdefs are used to distinguish large blocks of mostly identical code (although when the code is in the same file, it is easier to remember to check it for the same bug).

Re:top

djberg96 on 2002-11-15T19:24:58
The danger to this approach is that many of the platforms will have code that is almost the same, differing only in a few details. When you fix a bug in that mostly common code in one, it is easy to forget to fix the same bug in the others because they have separate copies of the (almost) identical code.
Ah, but if the code's nearly identical, then it becomes a matter of copy/paste anyway. Fix one, and you know how to fix the rest. Sure, it'll be a little more time, but not that much more time. It would have to be a pretty damned pervasive and nasty bug for me to regret splitting the source files up.
In reality, it simply hasn't been an issue. Perhaps that's because I've been dealing with a rather small codebase and only really targeted three platforms so far - Linux, Solaris and FreeBSD.

Re:top

jmm on 2002-11-16T18:13:17
The keyword there was forget. It's not hard to make a cut and paste common change, as long as you remember to look in all the places that share the code.
My roommate at University made up a law "If there is more than one copy of something, at most one of them is correct." with the addendum that "The number that are correct approachs zero asymptotically over time."
It is harder to forget to check the other variants when they are in the same file (even then, it is still possible to forget to check them [all], so I have sometimes made the tiny portions of the code that were different into macros so that the common code could be written outside of the nest of ifdefs.