How to compare and benchmark Perl5 and Perl6 performance?

korpenkraxar on 2009-05-29T22:58:35

Hi all!

I am a big Perl fan and a biologist who writes and uses Perl 5 daily for my studies of animal evolution. Perl is really big in the bioinformatics field, but I guess you all already knew that. Looking at the novel features of Perl 6, it seems that I and others in my field would enjoy using this version of the language even more :-)

Today I finally got around to downloading and compiling Rakudo Perl 6 and Parrot, which to my pleasant surprise was just as simple as detailed here http://rakudo.org/how-to-get-rakudo on my Arch Linux x86_64 box. Your mileage may vary depending on OS, but most Linux distributions make it very easy to install the prerequisites you need to pull this off. You need stuff like the GNU C compiler, git, subversion and Perl 5, all of which are easily tracked down in your distro's package manager.

To make things clear before I go on: I am not a computer scientist or professional, full-time programmer. I simply use Perl because it lets me solve lots of data parsing, mining and analysis problems very quickly and close to how I think about the problem. That said, I am of course happy for all the performance optimizations I can get while running my Perl scripts and programs. Speed of execution matters too, just as speed of development does, and Perl 5 strikes a very nice balance for the stuff I am doing.

Using Perl 6 as a programming language for day-to-day use in solving problems relevant to me also requires the Parrot interpreter/compiler to produce efficient code. I am unlikely to enjoy the nifty language novelties in Perl 6 if the code is horribly slow compared to what I am used to.

I decided to compare something really simple in Perl 5 and Perl 6 and just increment a number and put that number in the last cell of a growing array.

This is the Perl 5 version:

#!/usr/bin/perl -w
my $i = 0;
my @numbers;
until ( $i == 100000 ) {
    $numbers[$i] = $i;
    $i++;
}

Perl5: ~0.07s to complete, uses 26KB RAM at completion.
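If you want to time just the loop from inside the script rather than the whole process, here is a minimal sketch using the core Time::HiRes module. The sub name fill_array is made up for illustration, and this is not necessarily how the number above was measured. Note that an in-script timer leaves out interpreter start-up, so it will usually read a bit lower than running the script under the shell's time command.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);    # core module; replaces time() with a sub-second version

# Fill an array with 0 .. $limit-1, the same work as the loop above.
# (fill_array is an illustrative name, not part of the original script.)
sub fill_array {
    my ($limit) = @_;
    my $i = 0;
    my @numbers;
    until ( $i == $limit ) {
        $numbers[$i] = $i;
        $i++;
    }
    return \@numbers;
}

my $start   = time();
my $numbers = fill_array(100_000);
my $elapsed = time() - $start;    # wall-clock seconds spent in the loop only

printf "%d elements in %.3f s\n", scalar @$numbers, $elapsed;
```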

This is the corresponding Perl 6 version (bear with me, I do not know much Perl 6 yet so I really just translated the above code with minimal changes):

use v6;
my Int $i = 0;
my @numbers;
until ( $i == 100000 ) {
    @numbers[$i] = $i;
    $i++;
}

Rakudo Perl6: ~1m14s to complete, uses 1.4GB RAM at completion.

In this simple comparison, the Perl 5 implementation is >1000 times faster than Perl 6, which also uses >50000 times more memory than Perl 5.

Ouch.

I do not (want to?) believe this difference is due to the way the new language is implemented and running in Parrot per se, but expect it to be due to lack of optimization, the presence of bugs and leaks, and perhaps my own ignorance of Perl 6. However, if this arbitrary code snippet is representative of the kind of performance regressions facing people curious to try out the Rakudo implementation of Perl 6, it may be difficult to attract early adopters outside the core of the community for a while.

The problem is, how can I as a "real-world user" best track the development of Perl 6 from a performance perspective? Are there official or semi-official benchmark scripts around already that we could use?

I have a suggestion, which is to try implementing at least some of the benchmarks over at The Computer Language Benchmarks Game http://shootout.alioth.debian.org in Perl 6. Is there someone around here who is interested in trying this out? Perhaps we could make a challenge out of it or something :-)

Sounds about right to me

Alias on 2009-05-30T17:20:05

That's about the same kind of thing I saw when I tried it out in January.

What about using a scalpel instead of a hatchet?

Phred on 2009-05-30T20:03:52

Note that I upped the number of iterations to 1,000,000. Does Rakudo have inline C capability? For something that is numeric intensive, you might want a sharper tool.

#include <stdio.h>

int main(void)
{
    int i = 0;
    int limit = 1000000;
    int array[limit];
    while (i < limit - 1) {
        //printf("I is %d", i);
        array[i] = i;
        i++;
    }
    return 0;
}

phred@harpua ~ $ time ./a.out
real 0m0.015s
user 0m0.006s
sys 0m0.005s

#!/usr/bin/perl -w
my $i = 0;
my @numbers;
until ( $i == 1000000 ) {
    $numbers[$i] = $i;
    $i++;
}

phred@harpua ~ $ time perl p5.pl
real 0m0.384s
user 0m0.220s
sys 0m0.022s

Re:What about using a scalpel instead of a hatchet

petdance on 2009-06-01T18:14:04
I think if you'll reread the article, you'll find that he's a Perl fan, and wants to use Perl 6, but is concerned about what is an obvious performance problem in this part of the code at this time.
The original poster is not asking for the best way to solve the problem of counting up to 100,000.

Finding the bottleneck...

pmichaud on 2009-05-30T21:07:56

For the most part, Rakudo development has been focused more on speed than features. We're also somewhat hampered by the fact that Parrot doesn't have many good profiling tools so that we can figure out where the bottlenecks are. Those are supposed to be in place for the next major release in July.

This post generated a lot of very fruitful discussion on #parrot today. Many people speculated about why the loop might be as slow as it is, and offered suggestions about improving Rakudo's code generation. What we determined is that Rakudo's code generation is in fact pretty good; there's not a lot of optimization to be had there. I already suspected this to be the case, and so when things like this have occurred I've guessed that Parrot is a big source of the slowdown.

But, a few more forensic tests showed that wasn't the case either -- while Parrot doesn't perform nearly as well as we'd like or expect it to, it wasn't the primary source of the slowdown in this code either.

It turns out that the slowness comes almost entirely from Rakudo's implementation of postfix:<++> -- it's a very sub-optimal implementation that was written without the benefit of more recent Perl 6 specifications. Replacing the (generated) call to postfix:<++> with a simple integer increment (and changing nothing else) resulted in an 80% improvement on my system. So, there's a bug in the design or implementation of postfix:<++> somewhere (and a couple of optimizations), and I'll be looking at that today/tomorrow.

More importantly, this points out why we really benefit when people write about their Rakudo experiences, good or bad. It helps us to see where we need improvements, to build a stronger case for throwing more resources at the bigger problems we need solved (e.g., parrot profiling tools), and to track down little surprises like this one. I can tell you that nobody involved suspected that postfix:<++> was the culprit here.

Thanks again, and I'll report back with updated timings once we have postfix:<++> fixed.

Re:Finding the bottleneck...

pmichaud on 2009-05-30T21:09:39

For the most part, Rakudo development has been focused more on speed than features.

Oops, I obviously meant to write "...focused more on features than speed."
Pm

Re:Finding the bottleneck...

korpenkraxar on 2009-05-31T10:57:33
Wow! I am very happy to see my post not only being read by the right people and not immediately discarded as "Perl6 bashing", but actually resulting in an immediate identification of parts of the problem.

I actually hesitated to publish this at all, thinking that the problem was solely on my part. I'm glad I knew better :-)

Re:Finding the bottleneck... an update

pmichaud on 2009-06-01T16:50:36

As mentioned previously, I'm reporting back with our status. In the course of investigating this program, we discovered that postfix:<++> was in fact very badly implemented, making it far more expensive than it needed to be. Fixing this ended up requiring quite a few internal changes to Rakudo, and I've discovered even more things we want to fix/avoid when running Rakudo in Parrot.
That said, here's where things stand now. I used the following code as a benchmark (basically the same as the original, with the number of iterations cut down to 10,000):

use v6;
my Int $i = 0;
my @numbers;
until ( $i == 10000 ) {
    @numbers[$i] = $i;
    $i++;
}

Using the May 26 version of rakudo, this program required 24.1 seconds to complete on my system. With the fixes I checked in this morning, it now requires 6.4 seconds -- a 75% improvement. Yes, this is still incredibly slow as compared to Perl 5, but this was just one optimization, and we have a *lot* more to go. Indeed, I found several more pathological Rakudo/Parrot issues while investigating this one.
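For anyone who wants the arithmetic behind those two timings spelled out, here is a quick check in Perl 5. The helper names percent_saved and speedup are made up for illustration. The exact saving works out slightly under the rounded 75% figure.

```perl
use strict;
use warnings;

# Percentage of run time saved when going from $before to $after seconds.
sub percent_saved {
    my ( $before, $after ) = @_;
    return 100 * ( $before - $after ) / $before;
}

# How many times faster the new run is than the old one.
sub speedup {
    my ( $before, $after ) = @_;
    return $before / $after;
}

# The two timings quoted above: 24.1 s before the fix, 6.4 s after.
printf "%.1f%% less time, %.2fx faster\n",
    percent_saved( 24.1, 6.4 ), speedup( 24.1, 6.4 );
# prints "73.4% less time, 3.77x faster"
```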
While on the topic, I'll note that placing the "Int" type constraint on $i currently makes things run slower, not faster, because it imposes an extra check on each update of $i's value. Without the constraint (i.e., "my $i" instead of "my Int $i") the program runs 0.4 seconds faster. Eventually we plan to do some type-based optimizations that will detect cases like this and optimize further, but in the general case that's a challenging problem and so we're focusing on other items. So, for the time being, keep in mind that type constraints on variables in Rakudo currently tend to make things run slower.
Thanks again for this very useful post!
Pm

Re:Finding the bottleneck... an update

korpenkraxar on 2009-06-01T23:19:42
It seems you are indeed right about the postfix being an important bottleneck. It never even occurred to me to change that to some other way of incrementing $i, but this is how it looks for me doing 10000 iterations:

$i++; # 7.5s

$i += 1; # 3.5s

$i = $i + 1; # 3.5s

In all cases it uses the expected ~130MB RAM, so at least there does not seem to be a significant leak specific to one of these expressions.
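For a Perl 5 baseline of the same three spellings, something along these lines should work. It is only a sketch using the core Benchmark module, the sub names are invented for the example, and it says nothing about how Rakudo behaves.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);    # core module

my $limit = 10_000;

# One sub per increment style; everything else in the loop is identical.
sub run_postfix {
    my $i = 0;
    my @n;
    until ( $i == $limit ) { $n[$i] = $i; $i++; }
    return scalar @n;
}

sub run_plus_eq {
    my $i = 0;
    my @n;
    until ( $i == $limit ) { $n[$i] = $i; $i += 1; }
    return scalar @n;
}

sub run_assign {
    my $i = 0;
    my @n;
    until ( $i == $limit ) { $n[$i] = $i; $i = $i + 1; }
    return scalar @n;
}

# A negative count means "run each variant for at least that many CPU seconds".
cmpthese( -1, {
    '$i++'        => \&run_postfix,
    '$i += 1'     => \&run_plus_eq,
    '$i = $i + 1' => \&run_assign,
} );
```

cmpthese prints a table of rates plus the relative difference between each pair, which makes a gap like the one above easy to spot at a glance.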

shootout benchmarks...

pmichaud on 2009-05-30T22:11:40

I have a suggestion, which is to try implementing at least some of the benchmarks over at The Computer Language Benchmarks Game http://shootout.alioth.debian.org/ in Perl 6. Is there someone around here who is interested in trying this out? Perhaps we could make a challenge out of it or something :-)

I'd really like to see this happen. There's already a shootout/ directory in the perl6-examples repository [1] -- I'll gladly give commitbits to anyone who wants to work on this or anything else in that repo. Just email me with your github id. If we decide that the benchmarks deserve their own repository, I can set that up also.

I'm also going to see if I can find a place where I can regularly run and post benchmark timings for rakudo, similar to the way we perform daily updates on Rakudo's spectest progress. I've been wanting to report how long it takes to run the spectests, but I need a stable platform to do that from.

[1] http://github.com/perl6/perl6-examples/

Thanks!

Re:shootout benchmarks...

korpenkraxar on 2009-05-31T11:08:10
I'll see what I can do.

Some of the benchmarks are inspired by bioinformatics and I might try having a go at these since I am familiar with these kinds of problems (I guess this is a fairly typical stance for "non-programmer" programmers, it being far easier to solve tricky problems if they relate to other concepts you are familiar with, such as biology and DNA data in my case).

However, it will need to wait for a few months since I am hoping to finish my thesis in July :-)

I'll definitely continue looking at and trying out Perl 6 in the meantime.

Re:shootout benchmarks...

dextius on 2009-06-01T19:09:29
I am interested in coding benchmarks for rakudo as well.

Re:shootout benchmarks...

Eric Wilhelm on 2009-06-02T03:14:58

I would be very interested in seeing regular outputs and graphs from such a thing, and would put some effort into writing code as tuits become available.
One thing to consider is having more than one way to do it -- i.e. including multiple versions of a benchmark in regular runs. This would show when things like "$var += 1" and "$var++" have performance gaps or convergence.
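As a rough sketch of what such a multi-variant run might look like, here is a small Perl 5 driver that times each variant as a separate process. Everything named here is an assumption for the sake of the example: the helper time_command is made up, and the variants are Perl 5 one-liners run with the current interpreter so the sketch works on its own. For Rakudo you would swap in whatever command runs each Perl 6 script.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);    # core module; sub-second wall-clock time

# Run a command given as a list (so no shell is involved) and return its
# wall-clock time in seconds, or undef if it did not exit cleanly.
sub time_command {
    my (@cmd)   = @_;
    my $start   = time();
    my $status  = system(@cmd);
    my $elapsed = time() - $start;
    return $status == 0 ? $elapsed : undef;
}

# Stand-in variants of one benchmark: same loop, different increment.
# $^X is the path of the perl running this script.
my %variants = (
    '$i++'    => [ $^X, '-e', 'my $i = 0; $i++ until $i == 100_000' ],
    '$i += 1' => [ $^X, '-e', 'my $i = 0; $i += 1 until $i == 100_000' ],
);

for my $name ( sort keys %variants ) {
    my $t = time_command( @{ $variants{$name} } );
    printf "%-10s %s\n", $name, defined $t ? sprintf( '%.3f s', $t ) : 'failed';
}
```

Logging one line per variant per run would be enough to graph later and to see where two spellings diverge or converge over time.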