The extent that map/grep go to to keep the calling overhead of
the block is horrendous and getting that to work for reduce in
List::Util was difficult. Doing it with multiple blocks is going
to be potentially very difficult. -- Graham Barr, not exaggerating
how hard it is to work on the parser and optree generator.
Pod::Html
Steffen Müller gave David Landgren a commit bit last week to
take over the maintenance of Pod::Html. After looking around
the blead directory tree, David wondered where the tests were.
Jan Dubois pointed out their hiding place.
mmm, hand-rolled test harnesses http://xrl.us/bi956
Dave Mitchell looked through the smoke results from March and saw fewer than half a dozen smokes for 5.10.1-tobe. This led him to ask if some of the regular smokers could schedule a smoke or two more frequently (especially after 5.8.9 is released).
Bram asked for some help on how to start smoking, such as what the
most desirable combinations are for smoking. One important point
to come out of the discussion was how useful ccache can be to
cut down smoking time.
chained smoking http://ccache.samba.org/ http://xrl.us/bi958
Nicholas Clark noticed that one of the Google Summer of Code projects was to improve the performance of built-in functions, by getting them to skip the construction of intermediate lists. Nicholas was curious as to what was meant by this, since it isn't part of the current TODO list.
Wren Ng Thornton replied that it is an optimisation known as "deforestation" in Haskell parlance, and comes into play when you have a series of chained maps or greps, and a pipeline of SVs between each step. The answer to this is to use continuous functions, which is just a fancy way of saying that they operate on input and output streams.
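To make the idea concrete, here is a minimal sketch that is not taken from the thread. The first sub chains grep into map, so a full intermediate list sits between the two steps. The second is a hand-fused loop that yields the same result in one pass, which is roughly what a deforesting optimiser would aim to produce automatically. The sub names chained and fused are hypothetical.

```perl
use strict;
use warnings;

# Chained form: grep produces a complete intermediate list,
# which map then consumes.
sub chained {
    my @in = @_;
    return map { $_ * 2 } grep { $_ % 2 } @in;
}

# Hand-fused form: a single pass with no intermediate list
# between the filtering step and the transforming step.
sub fused {
    my @in = @_;
    my @out;
    for my $n (@in) {
        next unless $n % 2;    # the grep condition
        push @out, $n * 2;     # the map body
    }
    return @out;
}

my @a = chained(1 .. 6);
my @b = fused(1 .. 6);
print "@a\n";    # 2 6 10
print "@b\n";    # 2 6 10
```

Doing this by hand is easy; the difficulty discussed in the thread lies in getting the compiler to do it safely on the optree.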
Wren offered some rewriting strategies that he thought would speed things up. It all began to fall apart when Nicholas explained that during compilation there was never at any point a usable abstract syntax tree (or AST) that could be used as a basis for such manipulations, since the tokeniser and lexer emit what is more or less the final optree directly. Some additional obligatory fixups are then performed on the tree, as well as some peep-hole optimisations, but both of these operations are hopelessly intertwined. A distinct, pluggable optimiser for Perl 5 remains an elusive dream.
It gets worse. Nicholas said it took him a full time week's
worth of work, just to create opcode optimisations for
reverse sort @pig_pen and foreach (reverse @recusandae). It took him
a day or so to remove the srefgen and ex-list ops from
the creation of arrayrefs (like [1, 3, 7]) and hashrefs.
He wasn't sure how long it took for Dave Mitchell to teach the
optimiser to perform in-place sorts for @schlip = sort @schlip,
or for Yves Orton to achieve a faster if (%hash) {...}, but
these are the only known examples of optree optimisations in the
past three years. Dave admitted that it was "quite hard".
Dave explained that naive approaches of the form "look for a long string of ops and replace them by a shorter string" are hard to implement and very fragile: they are either easily broken, or they break other things. And Rafael chipped in to say that it is difficult to write regression tests for them to boot.
Nicholas thought a better approach would be to get B::Generate
and co. into the state where one could write optree rewriters in
Perl and start to explore where the real wins lie. And it just
so happens that Steffen Müller has been playing around with B
and B::Utils to manipulate the optree and was beginning to make
progress towards doing just that.
The other alternative that Nicholas came up with was to investigate Larry Wall's MAD work, which purportedly allows one to recover the original source after compilation (although I believe no-one has actually managed to achieve this in the general case).
deforestation http://www.cse.unsw.edu.au/user/dons/papers/CSL06.html
for de trees http://xrl.us/bi96a
ptr-table funcs, add ptr-table-delete, and benchmark them
Jim Cromie wrote a patch to expose the underlying hashing mechanisms
used by the internals, so that XS code could use it directly. He
wasn't entirely convinced that it was wise to do so, but a factor
of 5 speed-up was nothing to sneeze at. The fact that it might
help Devel::Size caught Tels's attention, but he wasn't sure he
understood what the patch offered.
the street finds its own use for things http://xrl.us/bi96c
Curtis "Ovid" Poe wanted to know if anyone had ever thought about using forks or threads to create a poor man's transactional memory. Robin Barker pointed to a talk made by Simon Wistow on the subject.
Mark-Jason Dominus made the connection between this question and
a thread from June 2006 regarding reversible debugging. This revived
the discussion about reversible debuggers and missile launches, until
Abigail dragged things back on track, pointing out that rolling
back transactions is a much simpler proposition than rolling back the
universe. For instance, a call to fire_missile() appears to
fire a missile, but in reality nothing is launched until
commit() is issued.
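Abigail's point can be sketched as follows: irreversible side effects are queued rather than performed, so rolling back only means discarding the queue. This is an illustration, not code from the thread. The Txn package and its defer, rollback and commit methods are hypothetical names, and the "missile launch" is simulated by pushing onto an array.

```perl
use strict;
use warnings;

package Txn;

# A transaction is simply a queue of deferred side effects.
sub new      { my ($class) = @_; return bless { pending => [] }, $class }
sub defer    { my ($self, $action) = @_; push @{ $self->{pending} }, $action; return }
sub rollback { my ($self) = @_; $self->{pending} = []; return }

# Run every queued action in order and report how many ran.
sub commit {
    my ($self) = @_;
    my @actions = @{ $self->{pending} };
    $self->{pending} = [];
    $_->() for @actions;
    return scalar @actions;
}

package main;

my @launched;          # stands in for the irreversible real-world effect
my $txn = Txn->new;

# Looks like it fires, but only records the intent to do so.
sub fire_missile {
    my ($target) = @_;
    $txn->defer(sub { push @launched, $target });
    return 1;
}

fire_missile('alpha');
$txn->rollback;        # 'alpha' is never launched
fire_missile('beta');
my $count = $txn->commit;

print "launched: @launched ($count action)\n";    # launched: beta (1 action)
```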
Paul Fenwick thought that if anyone was brave enough to pursue
the idea, they could do worse than use a Safe compartment to
ensure that no operations that could not be rolled back were
performed.
Simon says http://london.pm.org/lpw-2004/talks/simon_wistow-perl_voodoo.ppt
the p5p thread http://xrl.us/bi96e
Thomas Klausner and the Vienna.pm crew announced the grand opening of their TODO bounty hunter scheme, whereby people who write patches to solve TODO problems earn real money (that is, Euros).
Exactly what counts as a TODO, and what it is worth, is a work in progress, and you can find out more about it on their wiki:
http://socialtext.useperl.at/woc/index.cgi?todo_test_bounties
make money fast http://xrl.us/bi96g
/etc/passwd files than previous
Back in October 2007, Rafael Garcia-Suarez committed change #32200 to resolve
a problem on an older OS X. In newer OS X versions, a utility crucial to the
test suite, nidump, is no longer available, and thus the test suite
fails. Jan Dubois suggested that scraping the output of dscl might do
the job instead. Unfortunately he lacked the tuits to do so.
Nicholas Clark said that Jan should refile it as a bug report so that it isn't left behind.
http://xrl.us/bi96i
The latest Unicode specification was released by UCD. Of particular interest was the inclusion of uppercase ß (eszet). Tels made a cogent argument for the gradual disappearance of such characters: they are really fiddly to text via SMS. In any event, Perl now does Unicode 5.1.0, which is going to simplify the task of people who wish to write domino servers (the game, not the Lotus kind).
http://xrl.us/bi96k
(here, this should be an easy one).
perlmodlib.PL rewrite
Currently perlmodlib.PL needs to be run from a source directory where perl has been built, or some modules won't be found, and others will be skipped. Make it run from a clean perl source tree (so it's reproducible).
substr
Vincent Pit had been sufficiently annoyed by magic in substr
being triggered twice, when once was enough, that he sat down and
crafted an elegant patch to fix it up. He had a couple of doubts
about how to deal with the API change.
Nicholas explained how to resolve that by having the old implementation
shuffle off to mathoms.c, and writing a macro that exposes the
old name in terms of the new.
old functions never die http://xrl.us/bi96n
they just mathom http://xrl.us/bi96p
In his continuing quest to rid the core of twice-invoked magic,
Vincent also delivered a patch to fix up the magic associated with
\&$x. He knew there was another possibility of magic being
triggered, but questioned the wisdom of invoking magic for something
as tedious as creating an error message.
a surfeit of magic http://xrl.us/bi96r
PL_AMG_names and PL_AMG_namelens static
Jan Dubois noticed that a couple of new symbols were being exported for 5.8.9-tobe. Since they really should be private, he made them static in blead. Steve Hay applied the patch, and tweaked regen.pl to get it to keep track of overload.c and overload.h.
Nicholas Clark thought that since 5.10 was out in the wild, it would not be possible to hide them, since someone might already have discovered a way of using them, and thus removing their public visibility would cause such code to break (or at least, become unlinkable).
http://xrl.us/bi96t
atan2(0,0) returns 0, not undef
Paul Fenwick noticed a small error in the documentation concerning
atan2(0,0), as the result of those arguments is undefined. Paul
felt that perl should return undef, but in fact it returns 0.
Mark-Jason Dominus wondered if it would be better to have it throw
an exception, like the logarithm of a negative number, or dividing
by zero. Unfortunately that would be almost certain to break a lot
of code in the wild. Paul felt that a warning would be sufficient,
since people would be free to use Fatal and thus obtain an
exception in due form.
Rafael Garcia-Suarez invited interested parties to look at the
atan2 manpage on FreeBSD, which put forward some reasons why
returning 0 can make sense.
Dave Mitchell then looked at the source and discovered that perl
just returns whatever the underlying C library does. Andy
Dougherty investigated further and determined that some platforms
do indeed return 0 (as dictated by the C89 standard) and some will
also set errno to EDOM.
Nicholas Clark was of the opinion that CORE::atan2 should
return 0, and that leaves POSIX::atan2 free to call the
underlying library.
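Code that cares about the distinction today need not wait for the core to change. The following is a minimal sketch of a wrapper in the spirit of Paul's suggestion; the name checked_atan2 is hypothetical and nothing like it was proposed in the thread. It warns and returns undef for the (0,0) case and otherwise defers to the built-in.

```perl
use strict;
use warnings;

# Hypothetical wrapper: treat atan2(0,0) as undefined instead of
# relying on whatever the underlying C library returns.
sub checked_atan2 {
    my ($y, $x) = @_;
    if ($y == 0 && $x == 0) {
        warn "atan2(0,0) is mathematically undefined\n";
        return undef;
    }
    return atan2($y, $x);
}

my $undefined = checked_atan2(0, 0);    # warns, yields undef
my $quarter   = checked_atan2(1, 1);    # pi/4

print defined $undefined ? "defined\n" : "undef\n";
printf "%.4f\n", $quarter;
```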
getting atan http://xrl.us/bi96v
PerlIOStdio_close (#46173)
Late last year, Steve Peters outlined a scenario where dup-ing
a file descriptor during a close could cause a file descriptor
to be leaked.
Nicholas Clark admitted this week that since Nick Ing-Simmons's
passing, probably no-one understood how PerlIO works deep down.
In any event, he thought the code as it stood appeared to be
sufficiently wrong to merit a fix.
This in turn reminded Craig Berry to ask why PerlIOUnix_open
hard-wires the opened file to 0666 wide-open permissions, and
wondered why the code didn't honour the current umask
setting. Dave Mitchell explained that the kernel took care of
that: the umask is applied to the requested mode when the file
is created.
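A short sketch shows the kernel doing this work from plain Perl. It assumes a Unix-like system; the file name and the umask value of 022 are chosen only for illustration. The file is requested with mode 0666, yet the mode actually recorded has the umask bits cleared.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/example.txt";    # hypothetical file name

my $old_umask = umask 022;        # mask out group/other write bits

# Ask for wide-open 0666, just as PerlIOUnix_open does.
sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0666)
    or die "sysopen failed: $!";
close $fh or die "close failed: $!";

umask $old_umask;                 # restore the caller's umask

# The kernel has already applied the umask: 0666 & ~022 == 0644.
my $mode = (stat $path)[2] & 07777;
printf "requested 0666, got %04o\n", $mode;
```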
http://xrl.us/bi96x
[[:print:]] versus \p{Print} (#49302)
Given that no-one had been able to reconcile the differences between these two syntaxes (for example, that the former fails to match some things that the latter does), Robin Barker chose to document the differences.
if you can't beat 'em http://xrl.us/bi96z
utf8::valid rejects characters in \x14_FFFF - \x1F_FFFF (#51710)
Steve Peters wondered whether the patch included in bug #43294
would fix this problem. It didn't, but that left him
asking why \x14ffff was considered to be a valid character.
Chris Hall thought that it was, but utf8::valid was also happy
with 0x000000 through 0x13ffff and 0x150000 through
0x7fffffff, which left him puzzled as to why utf8::valid
was singling out the 0x14xxxx range.
Chris wondered if the patch Steve was looking at was causing
utf8::valid to reject both 'ill-formed' byte sequences as well as
'non-characters'. Either way, it seemed to be sitting on the
fence and not have a clear purpose.
After that I lost it a bit.
we need a unicode-porters list http://xrl.us/bi963 http://xrl.us/bi965
B::SVOP::sv (#52284)
"Inferno" filed a bug report about code that actually works correctly
on a threaded perl; only non-threaded perls have problems. Reini Urban
thought that the best solution was for B::Size to die a quick, painless
death, and to use Devel::Size instead, as it is so much nicer.
bug in march, answer in april http://xrl.us/bi967 http://xrl.us/bi969
Frank v Waveren reported a bug in 5.8.8 that Nicholas Clark determined had been fixed in 5.8.9-tobe, although he didn't know off-hand what change was responsible for the fix. Frank tracked it down via the git repository, and identified change #30166 as being the fix.
http://xrl.us/bi97b
lc/uc have unexpected side effects inside for loop (#52412)
Mike Wver discovered that the following snippet
    my $foo = 'A';
    for my $bar (uc($foo)) {
        my $lower_bar = lc $bar;
        print "$foo $bar\n"; # $bar should still be 'A'
    }
prints A a instead of A A. No-one knew why, but Abigail
pointed out that it was fixed in 5.10.0.
http://xrl.us/bi97d
map isn't context aware in some cases (#52452)
Stefan Wehinger wondered why slightly different nested map constructs use some, a lot, or all available memory. David Nicol made a decent stab at explaining it in terms of lists being reclaimed sufficiently early or not.
Nicholas Clark suggested that the desired behaviour described in the
report can be achieved, along with a sane level of memory consumption,
by rewriting the loops with foreach instead of map.
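The code from the bug report is not reproduced here, so the following is only an illustrative sketch of the kind of rewrite Nicholas described, using made-up data. The nested map has to build the whole list of products before anything can consume it, whereas the nested foreach consumes each product as soon as it is computed.

```perl
use strict;
use warnings;

my @rows = (1 .. 3);
my @cols = (1 .. 4);

# Nested map: the complete list of products is built first and
# only then summed.
my $total_map = 0;
$total_map += $_ for map { my $r = $_; map { $r * $_ } @cols } @rows;

# Nested foreach: each product is used immediately, so no large
# temporary list ever exists.
my $total_for = 0;
for my $r (@rows) {
    for my $c (@cols) {
        $total_for += $r * $c;
    }
}

print "$total_map $total_for\n";    # 60 60
```

With twelve elements the difference is invisible; it matters when the inner lists run to millions of entries.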
http://xrl.us/bi97f
1807 (+7 -3) http://xrl.us/bi97h http://rt.perl.org/rt3/NoAuth/perl5/Overview.html
Tels announced the release of a brand new Math::BigInt, along with
an updated bignum pragma, Math::BigInt::FastCalc and
Math::BigRat. This release closes out nearly all the existing
bugs; only two remain, at the bottom of the barrel. In the meantime,
Tels is sitting back and waiting to see what the CPAN
Testers make of them.
http://xrl.us/bi97j
Tels wondered if Reini Urban had had time to check out his patch
for Devel::Size and bleadperl, but Reini was moving house
this week.
http://xrl.us/bi97m
Robin Barker's verbosity tweaks to regen.pl and friends made it in.
http://xrl.us/bi97o
Jan Dubois felt that PL_bincompat_opt should be exported on
AIX and Windows. Steve Hay thought so too, but realised that Jan
was really talking about PL_bincompat_options. Applied.
http://xrl.us/bi97q
Jarkko Hietaniemi got H.Merijn Brand to tweak Configure in order to align floating point policies of gcc and cc on Tru64.
http://xrl.us/bi97s
Jan Dubois thought that change #23984 should be integrated into
5.8.x, as it gets corelist installed on Win32. Nicholas Clark
said that it was already in, the reason being that it helps
perlbug go about its business.
http://xrl.us/bi97u
Andreas König warned that lib/CGI/t/upload_post_text.txt was checked in as binary and wanted to know if it could be changed. Rafael said that it was binary for a reason; it was in fact a GIF file.
and patent-free http://xrl.us/bi97w
Jerry D. Hedden ran into trouble with the above file, and Nicholas Clark straightened things out.
all packed up http://xrl.us/bi97y
Paul Fenwick issued an RFC for Fatal/autodie exception
handling naming and structures.
http://xrl.us/bi972
Tels clarified a point regarding the use of POD for wiki markup,
explaining that his MediaWiki-Pod distribution on CPAN was a subclass
of Pod::Simple::HTML that fixes up a lot of the problems that
people encounter when using Pod::Simple::HTML.
This Week on perl5-porters - 23-29 March 2008 http://xrl.us/bi974
This summary was written by David Landgren.
Weekly summaries are published on http://use.perl.org/ and posted on a mailing list, (subscription: perl5-summary-subscribe@perl.org ). The archive is at http://dev.perl.org/perl5/list-summaries/ . Corrections and comments are welcome.
If you found this summary useful, please consider contributing to the Perl Foundation or attending a YAPC to help support the development of Perl.