This Week on perl5-porters - 27 October-2 November 2008

grinder on 2008-11-11T23:08:00

"So we're doing things at BEGIN time in UNIVERSAL with source filters. With a garnish of messing with @ISA in some other classes. What could possibly go wrong?" -- Nicholas Clark, trying to address serious bugs lurking somewhere in code this funky.

Topics of Interest

perl@34559

More feedback on the march to 5.8.9. Dave Mitchell fixed up the problems with SUPER method caching. Nicholas uncovered a can of worms in Attribute::Persistent.

Indeed, imacat had found the problem six months ago.

File::Path was the long pole holding up the tent for the past couple of weeks (my fault). 2.07 is now on CPAN, and addresses the symlink flaw that the Debian project uncovered.

  slogging on
  http://xrl.us/ow3g3 

Fix for failed Gconvert detection under C++

Tony Cook fixed up a C configure probe that wasn't valid under C++. H.Merijn Brand thanked him and stowed the change away in the metaconfig machinery.

  better C park
  http://xrl.us/ow3g5 

@{"_<$filename"} is unreasonably tied to use of DB::DB ($^P & 0x2)

Tim Bunce learnt that you can't introspect the source code of files without dropping down into single-step mode. This made him sad, because it meant that evaled code was more or less out of reach of his current obsession, Devel::NYTProf.

Not letting such a minor issue get in his way, Tim proposed a patch to give him sufficient fine-grained control to achieve his ends. Nicholas Clark found the time to run with the patch and produce something that was compatible with 5.8.9-RC1 (yes, coming soon to a mirror near you).

  profiling evals
  http://xrl.us/ow3g7
  http://xrl.us/ow3g9 

Perl Unicode bug

On the continuing saga of Karl Williamson's single-handed battle to slay the Unicode deficiencies in Perl, Juerd Waalboer commented that use ascii had a lot going for it as a putative pragma to deal with the matter. Rafaël wanted to see 'legacy' appear somewhere in the name.

Tom Christiansen penned a fine missive on the perils of \w, \s, \b and even \d, pointing out that a certain dingbat character is classified as a digit, but alas, MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL PI isn't. Which is why he's given up on all those shortcuts. As a parting gift, he offered some code to check ASCII characters against all Unicode properties and so forth.
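
To make the \d surprise concrete, here is a minimal sketch (not Tom's code, which the summary does not reproduce). It assumes a perl with Unicode regex support and uses U+0663 ARABIC-INDIC DIGIT THREE as an arbitrarily chosen non-ASCII digit.

```perl
use strict;
use warnings;

# U+0663 ARABIC-INDIC DIGIT THREE: a decimal digit (general category Nd)
# that lies well outside ASCII. Chosen only for illustration.
my $arabic_three = "\x{0663}";

# \d follows Unicode semantics for this string, so it matches.
my $matches_backslash_d = $arabic_three =~ /\A\d\z/     ? 1 : 0;

# An explicit ASCII range does not match.
my $matches_ascii_range = $arabic_three =~ /\A[0-9]\z/  ? 1 : 0;

# The explicit Unicode property says what \d only implies.
my $matches_property    = $arabic_three =~ /\A\p{Nd}\z/ ? 1 : 0;

printf "\\d: %d, [0-9]: %d, \\p{Nd}: %d\n",
    $matches_backslash_d, $matches_ascii_range, $matches_property;
```

Spelling out [0-9] or \p{Nd} states the intent explicitly, which is roughly the choice the thread was circling around.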

Yves Orton had a look at the matter from the perspective of the regular expression engine, and felt that adding /a and /u modifiers to enforce ASCII or Unicode semantics would be horribly painful to implement, and said that he was leaning more towards ASCIIish semantics (for instance \d, \w and \s would recover their pre-Unicode meanings), the idea being that people doing Unicode can use properties instead.

  this property is condemned
  http://xrl.us/ow3hb 

perl5db questions

Edward Peschko had some questions about how the Perl debugger interacts with the perl binary, so that he could debug the debugger debugging a Perl script with gdb. Nicholas Clark pointed him in vaguely the right direction. Richard Foley suggested quite innocently to Edward that if he came up with anything useful it would be worthwhile patching the debugger or Devel::Trace.

  a debugger's debugger
  http://xrl.us/ow3hd 

Why are the file test operators in perlfunc?

Michael G. Schwern wondered why the documentation for file test operators (-f, -s ...) lives in perlfunc rather than perlop. Eirik Berg Hanssen pointed out that if such a change were made, he would miss the perldoc -f -X shortcut.

David Nicol pointed out that perlfunc also mentions last, next and redo, which are flow control syntax. And thus it's easy to get at them with perldoc -f too.

Renée Bäcker had a patch lying around in an RT ticket (bug #27886) that, if applied, would extend the -f switch to look inside perlop as well.

In the end, the problem is not so much where the documentation lies, but rather how easy it is to obtain.

  http://xrl.us/ow3hf
  http://xrl.us/ow3hh 

RFC: version.pm qv() confusion

Following on from the humongous threads about version comparison code (and the injection of things into the UNIVERSAL namespace), John Peacock put forward a proposal to try to put an end to the confusion that surrounds the handling of version numbers in Perl.

Michael G. Schwern penned a very thoughtful reply describing the problems in terms of the mental representations that people have developed to remember how (they think, correctly or otherwise) version numbers work. He continued by reviewing the documentation, suggesting better examples, and pointing out that the fact that 1.2 is considered greater than 1.3.0 is a trap waiting to fool the unwary.
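
The 1.2 versus 1.3.0 trap can be reproduced in a few lines. This is a minimal sketch that assumes a version.pm recent enough to treat a string with two dots as a dotted-decimal version; the variable names are made up for the example.

```perl
use strict;
use warnings;
use version;

# One dot: a decimal version, read as 1.200.
my $decimal = version->new("1.2");

# Two dots: a dotted-decimal version, read as v1.3.0.
my $dotted  = version->new("1.3.0");

# The second component is compared as 200 against 3, so 1.2 wins.
my $decimal_is_greater = $decimal > $dotted ? 1 : 0;

print "1.2 in normal form:   ", $decimal->normal, "\n";
print "1.3.0 in normal form: ", $dotted->normal,  "\n";
print "1.2 > 1.3.0? ", ($decimal_is_greater ? "yes" : "no"), "\n";
```

Printing the normal form of both objects shows why the comparison goes the way it does: the two strings are parsed under different rules before they are ever compared.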

Dave Golden also chipped in with a number of sensible suggestions and volunteered to rewrite the documentation in terms of whatever changes were finally agreed upon.

  http://xrl.us/ow3hj
  http://xrl.us/ow3hm 

CPAN-1.9301 can't clean bootstrap .cpan

Nicholas Clark uncovered a problem with the version of the CPAN shell slated to be included in 5.8.9. Since the p5p summariser was busy with other matters In Real Life and had not wrapped up the final version of File::Path, Nicholas had the time to confer with Andreas König to determine the right thing to do.

  http://xrl.us/ow3ho
  http://xrl.us/ow3hq 

Deprecating Time::Local?

Dave Rolsky was so impressed by Time::y2038 that he planned to rewrite Time::Local in terms of it, and add a warning saying it was deprecated and just use Time::y2038 instead kthx.

Jesse Vincent wanted to know if Dave meant a warning in the documentation, rather than the code, since there must be an awful lot of code in the DarkPAN that uses it. Dave did indeed mean a disclaimer in the documentation.

He had a second look, and decided that it wasn't even worth the hassle of trying to do anything, other than just recommending people use Time::y2038 and be done with it.

  http://xrl.us/ow3hs 


TODO of the week

A task that needs some Perl and internals knowledge.

Deparse inlined constants

Code such as this

  use constant PI => 4;
  warn PI;

will currently deparse as

  use constant ('PI', 4);
  warn 4;

because the tokenizer inlines the value of the constant subroutine PI. This allows various compile time optimisations, such as constant folding and dead code elimination. Where these haven't happened (such as the example above) it ought to be possible to make B::Deparse work out the name of the original constant, because just enough information survives in the symbol table to do this. Specifically, the same scalar is used for the constant in the optree as is used for the constant subroutine, so by iterating over all symbol tables and generating a mapping of SV address to constant name, it would be possible to provide B::Deparse with this functionality.
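
The inlining can be observed from within a program as well as from the command line. The following is only a sketch of the current behaviour, not of the proposed fix: it deparses an anonymous sub through B::Deparse's coderef2text method and prints the result, whose exact layout varies between perl versions.

```perl
use strict;
use warnings;
use B::Deparse;

# The same tongue-in-cheek value as in the example above.
use constant PI => 4;

# Deparse a sub that mentions the constant by name.
my $deparser = B::Deparse->new;
my $text     = $deparser->coderef2text(sub { warn PI });

# The printed body refers to the inlined value, not to the name PI.
print $text, "\n";
```

A B::Deparse that implemented this TODO item would be able to print the constant's name here instead of its value.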

checkpods -> podchecker

Woohoo! Alex Vandiver took a crack at solving the checkpods/podchecker redundancies, and supplied patches galore to do the deed.

  TODO: patch review
  http://xrl.us/ow3hu 


Patches of Interest

Module/Build/t/compat.t failure

Robin Barker sent in a patch to fix up an error that occurs when the PREFIX environment variable is set. Michael G. Schwern thanked him for his work, and tossed a couple more suspect variable names into the mix.

  http://xrl.us/ow3hy 

Explicit empty while loops

Robin also made a couple of changes in op.c to change while(cond); constructs to while(cond) {}. He thought this made the empty loops more explicit, and besides, it silences a g++ warning. Rafaël Garcia-Suarez applied the patch in the following month.

  meanwhile
  http://xrl.us/ow3h2
  http://xrl.us/ow3h4 

Large omnibus patch to clean up the JRRT quotes

Tom Christiansen went through all the C source files and corrected the Tolkien quotes that appear at the top, and explained why it was so important. Johan Vromans wondered why Tom had cited page numbers, since these may change from edition to edition. Tom thought that a good place to note the editions used would be in perlhack, since only hackers would be likely to encounter them in the source. This was accepted.

Jan Dubois gathered all the Tolkien quotes Sarathy used to announce the 5.005 builds.

Tom went back and produced a second patch, this time against blead. He noted that ext/threads/shared/shared.xs contains a non-Tolkien quote, and wondered if a better Tolkien quote might not be found, should the original authors of the file agree. Artur responded positively to the idea.

Working further afield, Tom found another quote that could be applied to ext/Win32CORE/Win32CORE.c.

  The Road goes ever on and on
  http://xrl.us/ow3h6
  http://xrl.us/ow3h8 

Be more explicit about magic @ARGV

Moritz Lenz wrote a documentation patch to explain that <> doesn't simply treat the entries of @ARGV as file names, but passes each of them to open() instead. After careful review, it was applied.
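
A small sketch of what that magic means in practice (the patch itself is not reproduced in this summary). It assumes a Unix-like temporary path without whitespace; the file contents are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small scratch file to read back.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "alpha\nbeta\n";
close $out or die "close failed: $!";

# The leading "< " is not part of any file name. It is interpreted
# as an open mode, because <> hands each @ARGV entry to the
# two-argument form of open().
local @ARGV = ("< $path");

my @lines = <>;
print @lines;
```

The same mechanism is why an entry of "-" reads standard input and why an entry ending in "|" is run as a command, which is worth remembering when @ARGV comes from an untrusted source.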

  http://xrl.us/ow3ia 


New and old bugs from RT

Cwd::realpath doesn't work on files on Windows (#29570)

Andrew Pimlott could no longer recall how to trigger this bug and suggested it could be closed.

  reply to reopen
  http://xrl.us/ow3ic 

perl-5.10.0 glibc detected *** free(): invalid pointer: 0x553c6700 (#51238)

Warren Dodge was pleased to hear that Michael J. Krueger was experiencing the same problems with a module from Rational that was failing on 5.10. He lodged a bug report (a PMR) with Rational and wanted to know if Michael gave him permission to forward his report to Rational as well. No word back from Michael.

  http://xrl.us/ow3ie 

Document $var, $arg, $type and $ntype XS variables (#51992)

In response to Michael G. Schwern's plea for better documentation on matters XS, Renée Bäcker replied with a web page that he found useful. Living as it does in a ~person home page, it would be good to get it onto the Wiki. (Hint hint).

  but ask for permission
  http://xrl.us/ow3ig
  http://www.perlfoundation.org/perl5/index.cgi 

semi-panic: attempt to dup freed string (#54114)

It has to be said: people make Perl do the strangest things. Consider the following program:

  my $r = f();
  my @a = @$r;
  sub f {
    push @a, undef;
    return \@a;
  }

It works, after a fashion, but the interpreter is left dithering as to whether it should panic or not. Six months later, Dave Mitchell gave his analysis on the underlying cause.

  half way fixed
  http://xrl.us/ow3ii 

m/a{1,0}/ compiles but doesn't match a literal string (#56526)

Is now an error in blead.

  and we test!
  http://xrl.us/ow3ik 

Can't use v[0-9]+ as label (vstring) (#56880)

Renée Bäcker sent in a patch to correct this problem, but Rafaël didn't like the patch format, and also asked about a boundary condition (labels with colons). Renée replied with a better patch, but wondered if the patch responded to Rafaël's initial criticism.


  like a version, patched for the very first time
  http://xrl.us/ow3in 

chr(0400) =~ /\400/ fails for >= 400 (#59342)

Yves Orton dropped by to say that he thought it was fundamentally impossible to reconcile octal escapes and backreferences within a regular expression, and that by perl 5.14, octal escapes in a regular expression should be illegal. For instance, the interpretation of \17 is either the seventeenth back-reference or chr(15), depending on spooky action at a distance.
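
The ambiguity is easy to demonstrate. In this sketch (an illustration, not code from the thread) the same \17 is first an octal escape and then, once seventeen capture groups exist, a backreference.

```perl
use strict;
use warnings;

# Fewer than 17 capture groups: \17 is an octal escape.
# Octal 17 is decimal 15, i.e. the character "\x0f".
my $as_octal = "\x0f" =~ /\A\17\z/ ? 1 : 0;

# Seventeen groups followed by the very same \17.
my $pattern = ('(a)' x 17) . '\17';

# Now \17 refers back to group 17, which captured "a".
my $as_backref  = ('a' x 18) =~ /\A$pattern\z/ ? 1 : 0;

# The octal reading no longer applies in that pattern.
my $octal_in_17 = (('a' x 17) . "\x0f") =~ /\A$pattern\z/ ? 1 : 0;

print "octal: $as_octal, backref: $as_backref, octal with 17 groups: $octal_in_17\n";
```

Adding or removing a capture group far away in the pattern silently changes what \17 means, which is the spooky action at a distance in question.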

Karl Williamson reiterated his request for a verdict on whether his patch is worthy or not (the main sticking point being whether it should be silent, a warning or an error, contingent as it is on machines having other than 8-bit bytes). Glenn Linderman replied, but said that the final cut belonged to the pumpking.

  perl on a UNIVAC 2200 series?
  http://xrl.us/ow3ip 

Program to look at char class complements (#60156)

Karl Williamson noted that some characters are matched by both a POSIX character class... and the complement of the same character class. Ideally, all characters should be matched by only one or the other, not both.

Yves Orton ran some code to probe the Unicode space, and discovered a distressingly high number of Unicode characters with the same behaviour. He explained that the problem is essentially due to a speed optimisation, and that the difficulty lies in weighing the need not to slow down non-Unicode matches against a complete rewrite of the character class implementation.

Yves also noted that, to a certain extent, some of the problems are of our own making, such as a discrepancy between what POSIX defines and what is implemented in mktables. Rafaël deflected some of Yves's criticism by pointing out that perltodo already stated how the current situation was broken. Yves continued with a post mortem of decisions past, pointing out where and when we messed up when bringing Unicode handling into Perl.

In another sub-thread, it took considerable traffic to define the exact POSIX equivalency of \w.

  the "you can have your pie and eat it" bug
  http://xrl.us/ow3ir 

Unhelpful error message from unpack (#60204)

Nigel Sandever noted that unpack 'v/a*', qq[a] spat out the message "'/' must follow a numeric type in unpack", which was less than helpful for understanding what the problem was. Marcus Holland-Moritz agreed that the message stank and wrote a patch to make things a little clearer. Rafaël was not entirely convinced.

  http://xrl.us/ow3it 

Stacked file operators (#60214)

Abigail discovered that -s -f 'zero-sized-file' works, but -f -s 'zero-sized-file' doesn't. Rafaël fixed it.

  http://xrl.us/ow3iv 

mro::method_changed_in(..) ignores AUTOLOAD (#60220)

Laurent Dami discovered that dynamically created AUTOLOAD routines in parent packages aren't seen by previously dynamically created child packages. Tony Cook offered a patch to correct this situation.

  I told you AUTOLOAD was evil
  http://xrl.us/ow3ix
  http://xrl.us/ow3iz 

Changing $#array in local sub array affects global $#array (#60222)

  probably ENOTABUG
  http://xrl.us/ow3i3 

Method cache not updated when dynamic subclass loaded through Storable::thaw (#60232)

Storable will require a module when asked to thaw an object. Having a coderef in @INC to catch requirements of dynamically created classes used to work in 5.8, but Laurent Dami discovered that it is broken in 5.10. He provided a snippet to demonstrate the problem, but no-one had a solution.
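
For readers who have not met the technique, this is roughly what a coderef in @INC looks like. It is a minimal sketch with a hypothetical class name (My::Dynamic::Class); it does not involve Storable and does not reproduce Laurent's bug.

```perl
use strict;
use warnings;

# Hypothetical module path, used only for this illustration.
my $class_file = 'My/Dynamic/Class.pm';

# A code reference in @INC is called for every require.
unshift @INC, sub {
    my ($self, $file) = @_;

    # Returning nothing lets the search continue down @INC.
    return unless $file eq $class_file;

    # Generate the source on demand and hand back a filehandle
    # from which perl will read and compile it.
    my $source = "package My::Dynamic::Class;\n"
               . "sub hello { 'generated on demand' }\n"
               . "1;\n";
    open my $fh, '<', \$source or die "in-memory open failed: $!";
    return $fh;
};

# There is no such file on disk; the hook supplies it.
require My::Dynamic::Class;
my $greeting = My::Dynamic::Class->hello;
print "$greeting\n";
```

When asked to thaw an object, Storable issues the same kind of require, so a hook like this one is what gets a chance to create the class.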

  http://xrl.us/ow3i5 

Broken regexp behaviour for strings produced by Crypt::Rijndael::decrypt (#60246)

"vvv" reported a problem with Crypt::Rijndael. Nicholas traced the problem down to two sources. The first one was with Crypt::Rijndael not correctly terminating C strings with a binary \0. The second was with the regular expression engine incorrectly relying on C string behaviour and looking for a zero-terminated string, instead of using the internal length attribute of the string. He suspected that it might be possible to generate an incorrect result against a Perl string containing an embedded binary zero.

  we need a test
  http://xrl.us/ow3i7 

croak(0) crashes (#60262)

Marc Lehmann uncovered a flaw with croak but elicited no comments.

  http://xrl.us/ow3i9 

threads::shared resets %hash iterators (#60294)

kbrintn showed a problem with hashes in threads, and how it becomes impossible to iterate over the keys within a hash of hashes.

  http://xrl.us/ow3jb 

Perl5 Bug Summary

  1341 (+11 -9)
  http://xrl.us/owaqw
  http://rt.perl.org/rt3/NoAuth/perl5/Overview.html 


New Core Modules

constant 1.17

  http://xrl.us/ow3jd 

Time::Local 1.19

Dave Rolsky uploaded a new version of Time::Local for the 5.10.1 and 5.8.9 maintenance branches, grumbling over the fact that core was patched instead of having the changes dealt with first in the CPAN version. This is an interim release until Dave and the porters figure out how to deal with the module in the light of Michael G. Schwern's Y2038+ work.

  http://xrl.us/ow3jf 

This followed on from a thread started by Jan Dubois, who was anxious to see Time::Local working on 64-bit platforms post-2038.

  http://xrl.us/ow3jh 

In Brief

Andreas König ran into a warning with installperl and offered a patch that Nicholas applied.

  http://xrl.us/ow3jj 

Karl Williamson wrote some code and asked the porters to please check whether this small code snippet looked correct. But no-one did! I mean really, is it so difficult to confirm the right way to create a scalar value containing a single arbitrary Unicode codepoint in UTF-8?

  yes, apparently
  http://xrl.us/ow3jm 

Renée Bäcker found another 7 tickets with patches in RT.

  some assembly required
  http://xrl.us/ow3jo 

Peter Scott wanted to have a warning on abandoned statements, thus independently reinventing bug #59802 (return 0 or die). Elliot Shank said that the problem was already solved with a Perl::Critic policy.
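
The abandoned statement in question looks like this. In this minimal sketch (the sub name is invented) the low-precedence "or" binds after return, so the die can never run.

```perl
use strict;
use warnings;

sub check {
    my ($ok) = @_;

    # Parsed as (return $ok) or die ...
    # The return leaves the sub before "or" is ever evaluated.
    return $ok or die "never reached\n";
}

# A false value comes back quietly; nothing dies.
my $result = check(0);
print "check(0) returned $result\n";
```

Newer perls emit a "Possible precedence issue with control flow operator" warning at compile time for this construct; what the author almost certainly meant was return($ok || die ...).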

  http://xrl.us/ow3jq 

Steve Hay has a Cygwin installation that is still unhappy with the update to Archive::Extract 0.28.

  http://xrl.us/ow3ju 

Kevin Ryde spotted a flaw in the $Carp::Internal{__PACKAGE__} documentation example (filed as bug #60300). Rafaël Garcia-Suarez amended the documentation accordingly.
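
The summary does not spell out the flaw, but it is presumably the classic one: a bareword inside hash subscript braces is quoted, so __PACKAGE__ is taken as a literal string there. A minimal sketch, using a plain hash and a hypothetical package name rather than %Carp::Internal itself:

```perl
use strict;
use warnings;

package My::Helper;   # hypothetical package name

my %registry;

# Bareword in braces: the key is the string "__PACKAGE__".
$registry{__PACKAGE__} = 'bareword';

# Parentheses force evaluation: the key is "My::Helper".
$registry{ (__PACKAGE__) } = 'package name';

my @keys = sort keys %registry;
print join(', ', @keys), "\n";
```

Two distinct keys end up in the hash, which is why code that means the current package needs the parenthesised form.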

  deceptive packaging
  http://xrl.us/ow3jw 

Rafaël Garcia-Suarez applied Renée Bäcker's patch to solve the Too late for "-CS" option problem, thus closing bug #59652.

  applied
  http://xrl.us/ow3jy 

Last week's summary

  This Fortnight on perl5-porters - 28 September-12 October 2008
  http://xrl.us/ow3j2 
  This Week on perl5-porters - 13-19 October 2008
  http://xrl.us/ow3j4 

About this summary

This summary was written by David Landgren.

Weekly summaries are published on http://use.perl.org/ and posted on a mailing list (subscription: perl5-summary-subscribe@perl.org ). The archive is at http://dev.perl.org/perl5/list-summaries/ . Corrections and comments are welcome.

If you found this summary useful, please consider contributing to the Perl Foundation or attending a YAPC to help support the development of Perl.