It's Cheaper at Compile Time

chromatic on 2008-04-21T07:09:01

I've long used part of the Rakudo-building process as my benchmark for Parrot optimizations. It doesn't exercise all of Parrot by any means, but it pushes the garbage collector, the string system, and the object system pretty hard. It's also the longest part of the build process anywhere, so improving performance there has a dramatic effect on productivity.

I've long known that it appends to and concatenates strings very frequently. The corresponding Parrot functions always show up near the top of the cost list from Callgrind. I've spent a few hours here and there trying to speed them up, but short of rethinking them entirely, I've never found a good solution.

We don't (yet) have a PIR-level profiler like Callgrind (but I'm working on it). Sometimes I can read through the code and see things that might be expensive, knowing what eventually gets called and where. On a whim, I browsed through the generated code for Rakudo to find concatenation. One idiom jumped out at me:

$S0 = concat "PIR", ':"'

This line of code catenates two constant strings into one, and assigns the result to the virtual string register $S0. It's not terribly inefficient; the two constant strings are in the constant table in the bytecode, so they don't get recreated on execution. However, the result is always going to be a constant string.

There's no reason the optimizer couldn't rewrite that line of code into:

$S0 = 'PIR:"'

It didn't. Now it does.

The optimizer already knew how to optimize away constant addition, subtraction, and other math operations. All I had to do was tell the opcode generator not to generate the concat_s_sc_sc op (concatenate two string constants and put the results in a string register -- the destination register always comes first in the argument list for an opcode), add the concat operator to the list of rewritable three-register opcodes, and add a case to the rewriter where the destination register is a string register.

The result is a modest 5.5% speedup in my benchmark. Part of the gain is avoiding run-time concatenation, but part of it is also using fewer GCable entities. The less garbage you create, the less garbage you have to collect and the less frequently you have to collect it. 5.5% isn't a huge gain, but it's a step in the right direction. Every second we shave off of the hack-compile-test cycle is an improvement, and these optimizations help all Parrot programs as well.

(Are these entries interesting? I seem to get more feedback when I complain about various Technology Distortion Fields.)


interesting

steph on 2008-04-21T07:23:48

Yes. These entries are interesting, at least for me. So far my only parrot "thing" is building it every month on cygwin and "complaining" when I cannot. Lack of time so far for more...but thanks to your posts (and the posts of others here) the day I can really join the effort seems closer.

Thanks for your work. Cheers. --stephan

Are these entries interesting?

nicholas on 2008-04-21T07:45:33

Yes.

And the longer answer - thanks for plugging away at the code, and thanks for describing your findings.

Keep 'em coming

Ovid on 2008-04-21T08:37:34

Here's another ++ from me.

And didn't you say something to me on a mailing list that instead of searching for micro-optimizations that I should be looking at algorithmic improvements ;)

Re:Keep 'em coming

chromatic on 2008-04-21T16:45:14

True... but I haven't found any algorithmic improvements in the past week, and I still believe I can get it faster.

Way more interesting than the math stuff :)

tlbdk on 2008-04-21T08:40:01

For me at least this is very interesting. You could skip the math stuff for my sake, but I guess other people find that interesting.

Keep up the good work

Tim Bunce on 2008-04-21T09:04:17

It's appreciated.

Yes Yes

lestrrat on 2008-04-21T10:39:57

I'm still not up to par yet, but I'm slowly trying to figure out what you are doing, may be send in a patch or two one of these days

No comment!

kaare on 2008-04-21T11:08:09

I guess the missing comments are because most people don't have a lot to add to Perl 6, Parrot and Rakudo entries.

Ranting, on the other hand...

(And I, for one, think that 5.5% is a huge gain for one simple optimization).

More interesting then your technology rants

Limbic Region on 2008-04-21T12:30:51

There was a time when I read every journal entry - just as I read every post on PerlMonks. I don't have the time any more. Your parrot posts pass the filter test for my time so please, keep them up.

Yup, interesting

Matts on 2008-04-21T12:59:54

I like your technology rants too though.

And I agree with the above poster - an overall 5.5% speedup is HUGE. If that were on our infrastructure you'd have saved us over a million dollars.

Absolutely

amahabal on 2008-04-21T13:21:22

As I try to finish my dissertation your blogs (and other P6 blogs) keep things bearable...

What he said ...

duff on 2008-04-21T13:39:32

Limbic_Region said almost exactly what I wanted to say. I read selected journal entries and yours always pass muster. Keep up the good work.

extra concats

pmichaud on 2008-04-21T13:53:00

...One idiom jumped out at me:
        $S0 = concat "PIR", ':"'

Interesting -- apparently PGE is generating these concat statements whenever there is PIR embedded in a rule. For example:


token MARK_STATEMENT_END {
        {{ $P0 = match.'to'()
                $P0 = clone $P0
                set_global '$!endstmt', $P0
        }}
        <.ws>
}


It's good that imcc is now optimizing the concat_s_sc_sc opcode, but PGE and the other compiler tools should probably also be smart enough to generate the concatenated constant in the first place. I'll probably fix that as well.

-----

Also, as others have commented, I really appreciate these insights into Parrot internals.

Pm

Re:extra concats

pmichaud on 2008-04-21T15:11:55

Based on chromatic's analysis I re-examined the section of PGE that was responsible for generating the concats in the first place, and nearly all of the concatenations it was performing at runtime were constants that could be done at compile-time. I must've been asleep when writing that code.

So, I refactored the code to avoid the concats altogether, and building rakudo's actions.pl file is now 1.6% faster than it was before (44.363 versus 45.095 seconds on my box). Not a huge speedup, but I'm still happier knowing that we aren't doing a lot of unnecessary and repeated concat operations -- especially since they were in code that gets called a lot (such as when scanning over whitespace and comments).

Thanks for the analysis and updates.

Pm

Re:extra concats

Aristotle on 2008-04-22T02:22:51

Suggestion: more people will see this if you post it to your own journal. :-)

post moar parrot

rjbs on 2008-04-21T14:14:04

I, too, enjoy reading these notes. If it would help, I would reply, "Interesting!" to each one in the future. I just usually have nothing much to add.

interesting

matt.fowles on 2008-04-21T14:22:01

chromatic~

I strongly prefer these posts to your posts complaining.

Matt

c++

petdance on 2008-04-21T15:46:17

also enjoy them. Would be nice to have in rakudo.org as well.