Perl's builtin sort function allows you to pass in any function using the magic variables $a and $b, the reason for this is, apparently, that because sorting is so intensive, it was too much overhead to pass in normal arguments, and the magic variables made it faster. Sadly, this hackery meant that it's also impossible to pass in a sorting subroutine from another package, because $a and $b are magically global within the declaring package only.
perldoc perlsort advises that you can prototype a function ($$) in which case
sort will allow you to use standard arguments e.g. { $_[0] <=> $_[1] }
rather
than { $a <=> $b }
. This will let you generate sort subs in external packages
and pass them in. Woot! Only problem is, it's apparently much much slower.
I played around with this and created some benchmarks which sorted an 10,000 element array of random numbers. Comparing these two with Benchmark
sub sort_arg { return sort _sort_arg @array; } sub sort_ab { return sort _sort_ab @array; } sub _sort_arg ($$) { $_[0] <=> $_[1] } sub _sort_ab { $a <=> $b }gives me:
sort_ab: 16 wallclock secs (15.84 usr + 0.01 sys = 15.85 CPU) @ 31545.74/s (n=500000) sort_arg: 16 wallclock secs (16.11 usr + 0.00 sys = 16.11 CPU) @ 31036.62/s (n=500000)
Making $a and $b slightly faster, but not massively. Of course, the benefit of
using arguments would be to be able to pass in a sort sub. In which case you
can simply dispatch $a and $b to the sort sub (which means that you don't even
need to prototype the sub ($$)
) like so:
sub sort_arg_indirect { my $coderef = \&_sort_arg; return sort { $coderef->($a,$b) } @array; }We'll just compare that with an inline sort sub:
sub sort_ab_inline { return sort { $a <=> $b } @array; }The result of which is
sort_ab_inline: 7 wallclock secs ( 7.96 usr + 0.01 sys = 7.97 CPU) @ 62735.26/s (n=500000) sort_arg_indirect: 9 wallclock secs ( 8.14 usr + 0.01 sys = 8.15 CPU) @ 61349.69/s (n=500000)
Again $a and $b are faster, but the interesting thing to me is how much faster this inline sub style is for both than using the (actually rather bizarre) named sub syntax as above. This pair is ~100% faster than the first pair. But the difference between the two isn't so big.
Rate sort_arg sort_ab sort_arg_indirect sort_ab_inline sort_arg 31037/s -- -2% -49% -51% sort_ab 31546/s 2% -- -49% -50% sort_arg_indirect 61350/s 98% 94% -- -2% sort_ab_inline 62735/s 102% 99% 2% --
(These results are for v5.8.4 built for i386-linux-thread-multi, I previously had similar results on Win32).
If these results are sound, I'd draw the conclusion that if you're writing a module that would benefit from allowing callback sort subs, then there may not be a compelling reason to ignore sort subs working on @_ purely for historical speed reasons.
gets recognized by the perl parser as a pattern, and gets optimized away. In other words: the block is not actually used at all, but simply gets replaced by a native numerical sort. As a result, the above code is about as fast as plain sort without any block, which does string comparison.sort {$a <=> $b } LIST