I'm a relatively long-time reader of the perl5-porters mailing list. Somewhat recently Nicholas Clark posed a few small challenges intended to draw more people into Perl core development. I thought it was a great idea, but couldn't follow up on it at the time. I said on the #p5p IRC channel that I liked the concept, so I thought I should learn a bit more about the Perl core and XS. The two are not the same, but I presume that solid knowledge of the XS/Perl API would be a head start on understanding the core.
Skip ahead a few weeks. I have since submitted my thesis, gone on vacation, and started a new job. But still no progress on my plan to learn XS. Until yesterday. I was idly playing with the B and B::Utils modules when I had a pretty good idea for an interesting learning and experimentation project: AutoXS.
Essentially, the idea started out with using B to scan a running Perl program for subroutines or methods of a particular type. Typically, the simplest and most recurring methods are accessors for hash-based objects. (Just search CPAN for accessor-generators...) The next step is to replace the identified subroutines with precompiled XSUBs that accomplish the same task but, being written in C, do so faster.
For simple accessors, that seems like a simple enough task at first: Write the XS code to access a value in a hash reference which is stored on the stack. That part took me surprisingly long and a lot of patient help from the friendly people on #p5p to get right (thanks!). But where's the hash key coming from? You can't expect the user to pass it in as an argument because that's beside the point. You can't know the key name at XS compile time because that's when the module's built. You currently cannot find out the package and name under which the current method or subroutine was called either. So what's the answer? Something like currying. I don't think I need to explain to anyone what that is. But maybe I should mention that this has to happen in C, not Haskell or Perl, and C doesn't have currying.
The solution took some time in coming. The XS ALIAS keyword allows for compile time aliases to a single XSUB. The aliases can be distinguished from within the XSUB by means of an int variable whose value can be associated with the aliases. (Bad explanation, I guess, have a look at perlxs for a better one.) This doesn't get us all the way to currying, though. I had a look at the generated C code and realized that I could just write similar code on my own and assign new values of that magical integer to each new alias of the accessor prototype at run time (CHECK time, really, but run time would work, too). Then, all that was left to do was to put the hash key for the new alias into an array indexed with said numbers. Voila - fake currying for that XSUB.
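To make the index-table idea concrete, here is a minimal pure-Perl model of it. This is only a sketch and not the actual XS code: the names @key_for_index, generic_getter and install_getter are made up for illustration, and the closure that remembers the index stands in for the integer that XS associates with each alias (which is exactly the part C cannot do with a closure).

```perl
use strict;
use warnings;

# Table of hash keys. The position in this array plays the role of the
# integer that distinguishes the aliases of the shared XSUB.
my @key_for_index;

# The single generic accessor body. In the XS version this is C code and
# the index comes from the alias; in this model it is passed in explicitly.
sub generic_getter {
    my ($index, $self) = @_;
    return $self->{ $key_for_index[$index] };
}

# Hypothetical helper: store the key in the table, then install a named
# sub that always calls the generic body with that key's index.
sub install_getter {
    my ($full_name, $key) = @_;
    push @key_for_index, $key;
    my $index = $#key_for_index;
    no strict 'refs';
    *{$full_name} = sub { generic_getter($index, $_[0]) };
    return $index;
}

install_getter('Point::x', 'x');
install_getter('Point::y', 'y');

my $point = bless { x => 3, y => 4 }, 'Point';
print $point->x, ' ', $point->y, "\n";   # prints "3 4"
```

Each installed name costs one slot in the key table and nothing else; the accessor body itself exists only once.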
By now, it all actually works. The scanner identifies quite a few typical read-only accessors. The XSUBs are, according to my crude measurements, between 1.6 and 2.5 times faster than the original accessors. If you're calling those accessor methods in a tight loop, that might actually make a bit of a difference. I wrapped it up in a module, AutoXS, and gave it the best interface ever. That is, none. You just say
use AutoXS::Accessor;
to get the accessor scan for all methods in the current package. More seriously, one could let the user flag eligible methods or even apply the scan globally. But that's not the point. It's just kind of fun that it works at all.
Cheers,
Steffen
Re:Another one to look at
tsee on 2008-04-02T07:43:13
That should be happening anyway, right? Perhaps only if you write sub () {1}.
The next thing I might tackle is setters. However, since those come in a much wider variety than getters, it'll take some better matching tools than what we have now in B::Utils. Just look at the code of AutoXS::Accessor for a sample of what kind of cruft I produced.
So as a yak shaving exercise, I'd like to write two routines: One that produces a pattern/condition structure for B::Utils' opgrep() from an existing op tree and one that, given two such patterns, does an alternation of them. This is already there as op_or(), but what I'd want it to do is merge as much of the structures as possible:
op_or(
{ name => 'foo', first => { name => 'bar' } },
{ name => 'foo', first => { name => 'baz' } }
)
currently becomes
{
disjunction => [
{ name => 'foo', first => { name => 'bar' } },
{ name => 'foo', first => { name => 'baz' } }
}
whereas it should be in an ideal world:
{
name => 'foo',
first => {
disjunction => [
{ name => 'bar' },
{ name => 'baz' }
],
}
}
Or, even better:
{
name => 'foo',
first => { name => ['bar', 'baz'] }
}
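A minimal sketch of such a merging alternation, assuming patterns are plain hashes whose values are either strings or nested hashes, as in the snippets above. The function names merge_patterns and same are hypothetical and not part of B::Utils. The sketch only merges when exactly one field differs, because merging two differing fields independently would also match mixed combinations that neither original pattern allowed.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of B::Utils: combine two opgrep-style
# patterns into one, falling back to a plain disjunction when unsure.
sub merge_patterns {
    my ($left, $right) = @_;
    my $fallback = { disjunction => [ $left, $right ] };

    # Only attempt a merge when both patterns constrain the same fields.
    my @fields = sort keys %$left;
    return $fallback
        unless join("\0", @fields) eq join("\0", sort keys %$right);

    # Find the fields whose constraints differ.
    my @differing = grep { !same($left->{$_}, $right->{$_}) } @fields;
    return { %$left } if !@differing;      # identical patterns
    return $fallback  if @differing > 1;   # merging would over-match

    my $field = $differing[0];
    my ($l, $r) = ($left->{$field}, $right->{$field});
    my %merged = %$left;
    if (ref $l eq 'HASH' && ref $r eq 'HASH') {
        $merged{$field} = merge_patterns($l, $r);   # push the choice down
    }
    elsif (!ref $l && !ref $r) {
        $merged{$field} = [ $l, $r ];               # list of alternatives
    }
    else {
        return $fallback;
    }
    return \%merged;
}

# Structural equality for the limited pattern shapes handled here.
sub same {
    my ($l, $r) = @_;
    return $l eq $r if !ref $l && !ref $r;
    return 0 unless ref $l eq 'HASH' && ref $r eq 'HASH';
    return 0 unless keys %$l == keys %$r;
    for my $k (keys %$l) {
        return 0 unless exists $r->{$k} && same($l->{$k}, $r->{$k});
    }
    return 1;
}

my $merged = merge_patterns(
    { name => 'foo', first => { name => 'bar' } },
    { name => 'foo', first => { name => 'baz' } },
);
print join(',', @{ $merged->{first}{name} }), "\n";   # prints "bar,baz"
```

For the two example patterns this yields the last, most compact form shown above.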
But then again, I'm probably not going to have time to do any of this any time soon. My XS learning time is spent for now and I have to get back to work and - Yay - my first teaching assignment.
Re:Another one to look at
Alias on 2008-04-02T08:57:46
sub () { 1 } gets you part of the way.
But the XS version is even faster again.
Re:Another one to look at
Aristotle on 2008-04-02T09:11:56
How can a sub call be faster than compile-time constant folding?
Re:Another one to look at
tsee on 2008-04-02T15:56:51
I just wrote the code to produce opgrep patterns from op trees (i.e. B::OP objects). Expect B::Utils 0.05_06 on CPAN soonish.
Oh, and there is a typo in my last comment. The second code snippet is missing a closing ]. Sorry about that.
Re:Class::Accessor::Classy and/or Moose
tsee on 2008-04-02T19:18:38
You're right, of course. The code is really two quite separate parts: The B-based scanner which identifies targets and the XS/ALIAS hack which instantiates a new "curried" alias to the XSUB accessor. It should be simple to let the user explicitly flag subs for replacement instead of scanning. (Though the preposterous namespace choice doesn't make sense any more then.)
Actually, it would be even simpler to have a module and interface:
use Class::Accessor::XS::I::Am::Sure::That::Namespace::Is::Taken
get_foo => 'foo',
get_some_property => 'propertyname_aka_hashkey',
;
The code is there. One would just have to write the import() sub.
Perhaps I'll do that and just use it in AutoXS::Accessor with a replace_existing_subs_yes_I_fucking_mean_it => 1 option. :)
Other takers welcome, no question.
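For anyone picking this up, a rough sketch of what such an import() could look like. The package name My::Accessor::Sketch is made up, and the accessors installed here are ordinary Perl closures standing in for the XS aliases, so this shows only the interface and not the speedup.

```perl
use strict;
use warnings;

package My::Accessor::Sketch;   # hypothetical module name

# import() receives the class name followed by the method => hashkey
# pairs given on the "use" line.
sub import {
    my ($class, %key_for) = @_;
    my $caller = caller;
    for my $method (keys %key_for) {
        my $key = $key_for{$method};
        no strict 'refs';
        # Pure-Perl stand-in: the XS version would install an alias of
        # the shared XSUB here instead of a closure.
        *{"${caller}::${method}"} = sub { $_[0]{$key} };
    }
    return;
}

package My::Class;

# Same effect as "use My::Accessor::Sketch get_foo => 'foo';" for a
# module that lives in the same file.
BEGIN { My::Accessor::Sketch->import(get_foo => 'foo') }

sub new { return bless { foo => 42 }, shift }

package main;

my $object = My::Class->new;
print $object->get_foo, "\n";   # prints "42"
```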
Re:Class::Accessor::Classy and/or Moose
tsee on 2008-04-03T15:00:50
I just uploaded Class::XSAccessor to PAUSE. It contains the code to install the fast XS accessors into user packages. It also supports setters.