Measuring copy-on-write on Linux

AndyArmstrong on 2009-10-08T19:26:00

http://perlmonks.org/?node=Corion asked on IRC about measuring memory allocation, which reminded me of something I've been poking at recently: measuring how much memory is shared copy-on-write between forked mod_perl processes. So far, the only answer I know of on Linux is the exmap kernel module. The main page is http://www.berthels.co.uk/exmap/ but Dave Olszewski wrote some bug fixes for it at http://github.com/cxreg/exmap. exmap uses a kernel module to add a new /proc/exmap file. To read physical page stats, write the PID to this file, then read the results back. The exmap distribution comes with C++ and Perl GTK programs to interpret the kernel data. Below is what I know of the format of the kernel data.

To use:

$ echo $pid > /proc/exmap
$ cat /proc/exmap
VMA 400000 87
1 0 1c6e7
1 0 1c6e8
1 0 1d328
...

$ grep 400000 /proc/$pid/maps
00400000-00457000 r-xp 00000000 08:01 722755 /usr/bin/screen

The sections in /proc/exmap correspond to the regions listed in /proc/$pid/maps. Each line within a section describes one page: whether it is resident (that is, not swapped out), whether it is writable, and its page ID.

(
    VMA $address $page_count
    ( $resident $writable $page_id )+
)+
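
A minimal sketch of reading that format, in Python. The names parse_exmap and Page are made up for illustration and are not part of exmap. It assumes the VMA address and the page IDs are hexadecimal, which is my reading of the sample above (0x457000 - 0x400000 is 87 pages of 4 kB, matching the decimal count on the VMA line), not something the exmap documentation confirms here.

```python
from typing import Dict, List, NamedTuple


class Page(NamedTuple):
    resident: bool
    writable: bool
    page_id: int


def parse_exmap(text: str) -> Dict[int, List[Page]]:
    """Group page records under the start address of their VMA."""
    vmas: Dict[int, List[Page]] = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "VMA":
            # Assumption: address is hex; the page count field is not needed here.
            current = vmas.setdefault(int(fields[1], 16), [])
        elif len(fields) == 3 and current is not None:
            resident, writable, page_id = fields
            # Assumption: page IDs are hex, flags are "0" or "1".
            current.append(Page(resident == "1", writable == "1", int(page_id, 16)))
        # Anything else (such as an elided "..." line) is skipped.
    return vmas


sample = """VMA 400000 87
1 0 1c6e7
1 0 1c6e8
1 0 1d328
"""
vmas = parse_exmap(sample)
```

With the sample above, vmas has a single key, 0x400000, holding three Page records.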

Anyway, just thought I'd share. If you know a better trick, I'd love to hear of it. When I next get around to improving my search servers I'll likely try to use this for real, but for now it's a tool I plan to use and haven't yet done serious work with.


Try smem

autarch on 2009-10-06T21:05:29

http://www.selenic.com/smem/

Re:Try smem

jjore on 2009-10-06T21:34:15

Bummer, smem wants 2.6.27+ but Ubuntu LTS, the server distro, is only at 2.6.24. In theory, the Ubuntu 10.04 release will be the next LTS update and my opportunity to use smem.

mod_perl guide has memory tracking references

bsb on 2009-10-09T02:18:52

I found the old mod_perl guide informative on this topic:

http://perl.apache.org/docs/1.0/guide/performance.html#How_Shared_Is_My_Memory_

One thing I've long wondered about is the effect of reference counting on memory sharing. Does all the incrementing and decrementing of refcounts unshare the entire page the object is on? Can anything be done about this, either within Perl or below?

Re:mod_perl guide has memory tracking references

chromatic on 2009-10-09T03:53:21

Does all the incrementing and decrementing of refcounts unshare the entire page the object is on?

I can't imagine how it wouldn't.

Can anything be done about this, either within Perl or below?

The standard trick in GCs is to move the fields used to track liveness from the objects themselves into a special-purpose structure which tracks multiple objects. The memory pages holding that structure still get unshared, but far fewer pages are modified.

This also improves cache behavior during GC operations.

Re:mod_perl guide has memory tracking references

jjore on 2009-10-09T13:29:34

Funny you should ask... The chart http://diotalevi.isa-geek.net/user/josh/090909/memory-0.png from http://use.perl.org/user/jjore/journal/39604 shows that, at least in that sample program, all the reference counts are tightly clustered. You can write off those pages as unshared *but* you can also reasonably expect they aren't taking other pages with them.

Linux::Smaps

robinsmidsrod on 2009-10-12T17:15:41

Have you tried Linux::Smaps?

From the CPAN description: "The /proc/PID/smaps files in modern linuxes provides very detailed information about a processes memory consumption. It particularly includes a way to estimate the effect of copy-on-write. This module implements a Perl interface."

Re:Linux::Smaps

jjore on 2009-10-13T12:54:43

As I recall, /proc/#/smaps also wasn't available on 2.6.10. Taking a quick look at Linux::Smaps, it appears I can consume text in this format:

00602000-00c39000 rw-p 00602000 00:00 0                                  [heap]
Size:               6364 kB
Rss:                6340 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:      6340 kB
Referenced:         1064 kB

Apparently a page is counted in Private_* if only the process being examined has that page. If any other processes also use the page, it is counted in Shared_*. There are caveats if you're swapping.
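
To make the Private_*/Shared_* split concrete, here is a rough Python sketch that totals those fields over smaps text. It is not how Linux::Smaps is implemented; sum_smaps and read_smaps are hypothetical names, and the only thing assumed about the file is the "Name:   N kB" line shape shown above.

```python
import re
from collections import Counter

# Matches lines such as "Private_Dirty:      6340 kB"; mapping header
# lines and any non-kB fields do not match and are ignored.
FIELD = re.compile(r"^(\w+):\s+(\d+) kB$")


def sum_smaps(text):
    """Total every kB field across all mappings in smaps-format text."""
    totals = Counter()
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if match:
            totals[match.group(1)] += int(match.group(2))
    return totals


def read_smaps(pid):
    """Read /proc/PID/smaps (Linux only; needs permission for that PID)."""
    with open(f"/proc/{pid}/smaps") as handle:
        return handle.read()


sample = """00602000-00c39000 rw-p 00602000 00:00 0                                  [heap]
Size:               6364 kB
Rss:                6340 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:      6340 kB
Referenced:         1064 kB
"""
totals = sum_smaps(sample)
private_kb = totals["Private_Clean"] + totals["Private_Dirty"]
shared_kb = totals["Shared_Clean"] + totals["Shared_Dirty"]
```

For the heap mapping above this gives private_kb of 6340 and shared_kb of 0. On a live system, sum_smaps(read_smaps(pid)) would total every mapping of that process.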

I don't yet know how smem works.

exmap will let me fully find out exactly what's being shared. Or not.
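
A sketch of what "exactly what's being shared" could look like once page IDs have been collected per process. It assumes an exmap page ID names one physical page, so the same ID seen in two processes means that page is shared between them; that is my assumption, not something confirmed above. The function name, the PIDs and the ID 0x2aaaa are invented for the example.

```python
from collections import Counter
from typing import Dict, Set


def sharing_counts(pages_by_pid: Dict[int, Set[int]]) -> Dict[int, Dict[str, int]]:
    """For each PID, count pages also seen in another PID versus pages seen only there."""
    owners = Counter()
    for page_ids in pages_by_pid.values():
        owners.update(page_ids)  # one increment per process holding the page

    result = {}
    for pid, page_ids in pages_by_pid.items():
        shared = sum(1 for page in page_ids if owners[page] > 1)
        result[pid] = {"shared": shared, "private": len(page_ids) - shared}
    return result


# Hypothetical data: two forked processes with two pages in common.
example = {
    100: {0x1C6E7, 0x1C6E8, 0x1D328},
    101: {0x1C6E7, 0x1C6E8, 0x2AAAA},
}
counts = sharing_counts(example)
```

Here each process ends up with two shared pages and one private page.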

Re:Linux::Smaps

ysth on 2010-03-01T21:12:07

But for practical purposes of controlling apache process growth, just counting the private memory is the right thing to do (or counting all the root process's memory + just private from the kids to get a reasonable total.)

I'm curious to know just what you are seeing shared in apache children that isn't copy-on-write memory?
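
The accounting suggested above (all of the root process's memory plus only the private memory of each child) can be written down as a small Python sketch. The function name, the dictionary keys and the numbers are made up for illustration; the per-process figures would come from something like the smaps Private_* and Shared_* totals.

```python
def estimate_total_kb(root, children):
    """Root process counted in full, children counted by private memory only."""
    root_total = root["private_kb"] + root["shared_kb"]
    return root_total + sum(child["private_kb"] for child in children)


# Hypothetical figures in kB.
root = {"private_kb": 2000, "shared_kb": 30000}
children = [
    {"private_kb": 6340, "shared_kb": 28000},
    {"private_kb": 5000, "shared_kb": 29000},
]
total_kb = estimate_total_kb(root, children)
```

The children's shared figures are deliberately left out of the sum, since under this accounting those pages are already covered by the root process.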