At work, I've got a problem on the back burner which is kind of interesting. We've got some mod_perl processes with big data sets. The processes fork and then serve requests. I've heard from Operations that they're not using Linux's Copy-on-Write feature to the extent desired so I'm trying to understand just what's being shared and not shared.
To that end, I wanted to map out where perl put its data. I made a picture (http://diotalevi.isa-geek.net/user/josh/090909/memory-0.png), a strip showing the visible linear memory layout from 0x3042e0 to 0x8b2990. The left edge shows where the arenas are. The tightly clustered lines in the middle show the pointers from the arenas to the SV heads. The widely splayed lines from the middle to the right show the SvANY() pointers from the SV heads to the SV bodies.
I now suspect that the pages CoW unshares because of reference-count writes to SV heads may be fairly compact. The SV heads sure seem to be highly clustered, so maybe it's a-ok to go read a bunch of values across two forked processes and not worry about reference counts. Sure, the SV head pages are going to be unshared, but maybe those pages are just full of other SV heads and it's not a big deal. If SV heads weren't clustered, then reference count changes could dirty lots of other pages.
Anyway, there's a nice little set of pics at http://diotalevi.isa-geek.net/user/josh/090909. I started truncating precision by powers of two to get things to visually chunk up more. So when you look at memory-0.png, there's no chunking but when you look at memory-4.png, the bottom 4 bits were zeroed out.
There's a github repo of this at
It would be interesting to see how many of these are reused storage allocated on pad variable introduction.
In principle the static overhead of PADLISTs, etc. could be completely shared (once allocated they never change), but the actual SV bodies they store change all the time.
Maybe priming the callstack by invoking all of the CVs in order to share that stuff could be worthwhile, though it's probably not that much storage at the end of the day.
Secondly, Stefan O'Rear has been working on memory compaction; look at his stuff here: http://github.com/sorear
And of course there's the recent work trying to make the arenas more pluggable. It would be nice if your efforts could eventually be applied to making smarter allocations (increasing locality of reference between related SVs to reduce page faults or cache misses, for instance).
Re:Stacks and lexicals; compaction
jjore on 2009-09-10T21:48:08
Sorear's work is interesting. I've used http://search.cpan.org/dist/Judy to get compact data as well. While writing the scripts for http://github.com/jbenjore/runops-movie/tree/master/scripts, I found I'd often write the code in Perl, occasionally sharing bits of data with some Inline::C.
But separately, my interest right now is in what happens to Linux's CoW. I've got data that is theorized to be both large and unshared between mod_perl processes. I want it both compact and shared.
Re:Stacks and lexicals; compaction
clintongormley on 2009-09-11T10:46:49
(Note: I know nothing about this, so this may make no sense at all)
As I understand it, the heap in Perl contains both code and variables, so if, in a forked process, a variable (which happens to share a page with some code) is changed, then that entire page becomes unshared.
Code seldom needs to change in a new fork. Would it not be possible to separate code and variables, so that the pages occupied by your code would remain shared?
I'd imagine that this would be a net win for memory usage in mod_perl processes, no?
clint
Re:Stacks and lexicals; compaction
jjore on 2009-09-11T14:16:20
Yes, that would be nice. I didn't map out where compiled perl goes in memory and how much it shares pages with things likely to change. It's likely to be intermingled because it's also on the heap.