Memory usage

nicholas on 2010-02-07T16:20:40

There were a few things that caught my attention in Facebook's presentation on HipHop, their PHP to C++ converter. It sounds like it relies on static analysis of the entire program's source, which is why they can't support eval, create_function etc. (22m25s in). I suspect that that sort of restriction would be, um, "interesting", in a general CPAN-using environment, as a lot of modules build on various low level code that encapsulates eval, such as the traditional way h2xs did constants via AUTOLOAD. Also, as it's a different runtime from Zend, extensions need to be ported to it (19m in).

However, the most interesting part was an early slide about memory usage, at 6m20. Transcribed:

150MB
for ($i = 0; $i < 1000000; $i++ ) {
      $a[] = $i;
}


700MB
for ($i = 0; $i < 5000000; $i++ ) {
      $a[] = $i;
}


(700M - 150M) / 4,000,000 = 144 BYTES

Does PHP really consume 144 bytes per integer value? Is that on a 32 bit or 64 bit machine?
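
As a sanity check on the slide's arithmetic, here is a small sketch in Python, used purely as a calculator. It is not from the presentation, and the function name is made up for illustration. The 144 figure only comes out if "MB" on the slide means 2**20 bytes, which is an assumption on my part; with decimal megabytes the result is a little lower.

```python
# Per-element cost implied by the two figures on the slide.
MIB = 1024 * 1024  # assumption: the slide's "MB" means 2**20 bytes


def bytes_per_element(small_mb, large_mb, small_n, large_n, unit=MIB):
    """Extra bytes used per extra element between two measurements."""
    return (large_mb - small_mb) * unit / (large_n - small_n)


# 150MB for 1,000,000 elements, 700MB for 5,000,000 elements.
php_binary = bytes_per_element(150, 700, 1_000_000, 5_000_000)
php_decimal = bytes_per_element(150, 700, 1_000_000, 5_000_000, unit=10**6)

print(f"{php_binary:.1f} bytes per element with binary megabytes")
print(f"{php_decimal:.1f} bytes per element with decimal megabytes")
```

Taking the difference between the two runs, rather than dividing 700MB by 5,000,000, cancels out the fixed cost of the interpreter itself, so what is left is the marginal cost of each array element.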

For comparison, here is Perl:

$ perl -le 'for ($i = 0; $i < 1000000; $i++ ) { push @a, $i; }; print `cat /proc/$$/statm` * 4 / 1024'
22.4765625
$ ./perl -le 'for ($i = 0; $i < 5000000; $i++ ) { push @a, $i; }; print `cat /proc/$$/statm` * 4 / 1024'
118.44140625

which works out at about 25.157 bytes per integer value, or under 20% of their figure for PHP. The odd number of bytes will be the malloc overhead spread across all the structures allocated from the same arena.
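
The per-integer figure can be rechecked from the two numbers printed above. This is a rough sketch in Python, again only as a calculator. The helper name is hypothetical. The 4 KiB page size is the same assumption that the "* 4 / 1024" in the one-liners makes. The page counts are back-derived from the printed megabyte values and were not captured separately.

```python
# Reproduce the per-integer figure from the two Perl measurements.
PAGE_KIB = 4  # assumption: 4 KiB pages, as "* 4 / 1024" in the one-liners implies


def statm_to_mib(statm_line, page_kib=PAGE_KIB):
    """Convert the first field of /proc/<pid>/statm (total pages) to MiB."""
    pages = int(statm_line.split()[0])
    return pages * page_kib / 1024


# Page counts back-derived from the printed values:
# 22.4765625 MiB is 5754 pages, 118.44140625 MiB is 30321 pages.
small_mib = statm_to_mib("5754")   # after pushing 1,000,000 integers
large_mib = statm_to_mib("30321")  # after pushing 5,000,000 integers

extra_bytes = (large_mib - small_mib) * 1024 * 1024
per_integer = extra_bytes / (5_000_000 - 1_000_000)

print(f"{per_integer:.3f} bytes per integer")
print(f"{per_integer / 144:.1%} of the 144 bytes on the slide")
```

The first field of statm is the total program size in pages, so this measures address space rather than resident memory. For a loop that touches every element it allocates, the two should be close.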

I have no idea what the memory usage of Python or Ruby is like, but there's a comment in the Unladen Swallow wiki:

Here at Red Hat we use Python for a lot of things. What we've observed is that execution performance is not the main issue (although it improving it would be greatly appreciated), rather it's the memory footprint which is the problem we most often encounter. If anything can be done to reduce the massive amount of memory Python uses it would be a huge win. I would encourage you to consider memory usage as just as important a goal as execution speed if you're going to tackle optimizing CPython.