I've long used part of the Rakudo-building process as my benchmark for Parrot optimizations. It doesn't exercise all of Parrot by any means, but it pushes the garbage collector, the string system, and the object system pretty hard. It's also the longest part of the build process anywhere, so improving performance there has a dramatic effect on productivity.
I've long known that it appends to and concatenates strings very frequently. The corresponding Parrot functions always show up near the top of the cost list from Callgrind. I've spent a few hours here and there trying to speed them up, but short of rethinking them entirely, I've never found a good solution.
We don't (yet) have a PIR-level profiler like Callgrind (but I'm working on it). Sometimes I can read through the code and see things that might be expensive, knowing what eventually gets called and where. On a whim, I browsed through the generated code for Rakudo to find concatenation. One idiom jumped out at me:
$S0 = concat "PIR", ':"'
This line of code catenates two constant strings into one, and assigns the result to the virtual string register $S0
. It's not terribly inefficient; the two constant strings are in the constant table in the bytecode, so they don't get recreated on execution. However, the result is always going to be a constant string.
There's no reason the optimizer couldn't rewrite that line of code into:
$S0 = 'PIR:"'
It didn't. Now it does.
The optimizer already knew how to optimize away constant addition,
subtraction, and other math operations. All I had to do was tell the opcode
generator not to generate the concat_s_sc_sc
op (concatenate two
string constants and put the results in a string register -- the destination
register always comes first in the argument list for an opcode), add the
concat
operator to the list of rewritable three-register opcodes,
and add a case to the rewriter where the destination register is a string
register.
The result is a modest 5.5% speedup in my benchmark. Part of the gain is avoiding run-time concatenation, but part of it is also using fewer GCable entities. The less garbage you create, the less garbage you have to collect and the less frequently you have to collect it. 5.5% isn't a huge gain, but it's a step in the right direction. Every second we shave off of the hack-compile-test cycle is an improvement, and these optimizations help all Parrot programs as well.
(Are these entries interesting? I seem to get more feedback when I complain about various Technology Distortion Fields.)
Yes.
And the longer answer - thanks for plugging away at the code, and thanks for describing your findings.
Here's another ++ from me.
And didn't you say something to me on a mailing list that instead of searching for micro-optimizations that I should be looking at algorithmic improvements
Re:Keep 'em coming
chromatic on 2008-04-21T16:45:14
True... but I haven't found any algorithmic improvements in the past week, and I still believe I can get it faster.
...One idiom jumped out at me:
$S0 = concat "PIR", ':"'
token MARK_STATEMENT_END {
{{ $P0 = match.'to'()
$P0 = clone $P0
set_global '$!endstmt', $P0
}}
<.ws>
}
Re:extra concats
pmichaud on 2008-04-21T15:11:55
Based on chromatic's analysis I re-examined the section of PGE that was responsible for generating the concats in the first place, and nearly all of the concatenations it was performing at runtime were constants that could be done at compile-time. I must've been asleep when writing that code.
So, I refactored the code to avoid the concats altogether, and building rakudo's actions.pl file is now 1.6% faster than it was before (44.363 versus 45.095 seconds on my box). Not a huge speedup, but I'm still happier knowing that we aren't doing a lot of unnecessary and repeated concat operations -- especially since they were in code that gets called a lot (such as when scanning over whitespace and comments).
Thanks for the analysis and updates.
Pm
Re:extra concats
Aristotle on 2008-04-22T02:22:51
Suggestion: more people will see this if you post it to your own journal.
:-)