This Week on perl5-porters - 17-23 April 2006

grinder on 2006-04-26T16:07:00

Welcome to this week's P5P summary, with all sorts of interesting new stuff on regular expressions, threads and other improvements.

Topics of Interest

Redoing the regular expression API

Yitzchak Scott-Thoennes suggested that Yves Orton send in a patch to pull out some of the ancillary functions in Data::Dump::Streamer in order to make them available in the core distribution.

  http://xrl.us/k2x7 

So Yves did just that. The first addition is reftype_name(), which behaves like reftype() except that it returns false rather than undef on non-references. This removes the need for fussy make-work code on the client side to avoid warnings.
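
To make the difference concrete, here is a minimal sketch. The reftype_name() below is a hypothetical pure-Perl stand-in written for this summary, not Yves's patch, which may well differ in its details; only Scalar::Util::reftype() is the real thing.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

# Hypothetical pure-Perl stand-in for the proposed reftype_name();
# the real patch may differ.
sub reftype_name {
    my ($thing) = @_;
    my $type = reftype($thing);
    return defined $type ? $type : '';
}

my $plain = 'just a string';

# With reftype() a defined() guard is needed to avoid an
# "uninitialized value" warning on non-references.
my $type = reftype($plain);
print "an array (guarded check)\n" if defined $type && $type eq 'ARRAY';

# With a false-but-defined return value the guard disappears.
print "not an array\n" unless reftype_name($plain) eq 'ARRAY';
print "an array\n"     if     reftype_name([1, 2, 3]) eq 'ARRAY';
```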

The second addition is a regex() function, which makes it easier to deal with patterns, whether they have been blessed into other namespaces or not.

Graham Barr admitted that the return value of reftype() was a mistake, and reftype_name() was acceptable, but felt that the regex() function was better off in the Regexp module.

Yves didn't like the fact that Scalar::Util::reftype returns SCALAR instead of something like REGEX. Nick Ing-Simmons liked the idea, but thought that it was too dangerous for maint.

Another sub-thread in the discussion revolved around whether a qr// thing is an object or a type. It is, in fact, an object, but Yves argued that it is much more useful to treat it as a type. Graham agreed to disagree.

Adam Kennedy admitted to using Regexps as objects quite a bit and would be happy to see the Regexp module receive a dose of spring-cleaning (which I suppose means fixing up the reblessing inconsistency that Yves was getting at).
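
A small sketch of the object side of the argument, assuming nothing beyond core Perl. My::Pattern is a made-up class name used purely for illustration.

```perl
use strict;
use warnings;

# A compiled pattern is a reference blessed into the Regexp package.
my $re = qr/fo+/;
print "class of a plain qr//: ", ref($re), "\n";

# It can be reblessed elsewhere (My::Pattern is hypothetical) and
# still be used as a pattern, which is why checking ref() alone is
# not a reliable way to recognise a regexp.
my $reblessed = bless qr/fo+/, 'My::Pattern';
print "class after reblessing: ", ref($reblessed), "\n";
print "still matches\n" if 'foo' =~ $reblessed;
```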

Another hassle Yves pointed out was the non-reversibility of stringifying regexps:

  my $qr=qr/foo/;
  my $str="$qr";
  print qr/$str/; # equivalent but not equal
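
The snippet above can be stretched into a runnable sketch that prints both forms side by side. The exact text of the stringified pattern depends on the perl version, so the sketch only compares the two strings rather than assuming what they look like.

```perl
use strict;
use warnings;

my $qr  = qr/foo/;
my $str = "$qr";        # stringified form of the pattern
my $qr2 = qr/$str/;     # recompiled from the plain string

print "original:   $qr\n";
print "round trip: $qr2\n";

# The two patterns match the same things...
print "both match 'foo'\n" if 'foo' =~ $qr && 'foo' =~ $qr2;

# ...but their string forms differ, since the second one wraps the first.
print "string forms differ\n" if "$qr" ne "$qr2";
```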

Dave Mitchell pointed out that a regular expression is currently a scalar; it just happens to have a bit of magic attached...

  Shouldering the weight of history
  http://xrl.us/k2x8 

Silly regexp tricks

Hugo van der Sanden returned to the super-linear cache bug (a || logical or instead of a | bitwise or) in the regular expression engine, and came up with a suitable test case:

  ("a" x 31) =~ /^(a*?)(?!(a{6}|a{5})*$)/;
  print length($1);

This prompted Yitzchak Scott-Thoennes to come up with another bug that showed how blead broke existing behaviour. Since no-one should ever have come to rely on this behaviour, it was all quietly swept under the rug. Dave Mitchell hinted that he was working on New Stuff in the engine.

  http://xrl.us/k2x9 

Bringing threads into the third millennium

Jerry D. Hedden continued to send patches to sync CPAN's threads with blead, first by removing a superfluous counter.

  http://xrl.us/k2ya 
  and again
  http://xrl.us/k2yb 
  and reworked the threads destruct call
  http://xrl.us/k2yc 

He vented his frustration at the slow pace with which the patches were getting applied, believing that he was playing by the rules as much as possible. Rafael was very apologetic, explaining that he understands so little about threads that he's barely qualified to apply them. And apart from Rafael, there aren't too many alternatives.

  http://xrl.us/k2yd 

Backporting the new blead RE improvements to maint

Nicholas Clark posted a proof-of-concept update to re.pm to deliver Dave Mitchell's iterative (as opposed to recursive) implementation of the regular expression engine to perl 5.8.1 and beyond. A few show-stoppers need to be cleaned up: some coredumps in the test suite need to be sorted out and some tweaks to ppport.h are needed. As a bonus, Yves Orton's trie work comes along for the ride.

  More songs about building regexps
  http://xrl.us/k2ye 

valgrind and Perl 5

Nicholas Clark wondered what would happen, as in, how many bugs would be uncovered, if one were to run the test suite under valgrind. So Rafael Garcia-Suarez did just that, and discovered that 41 test files produce errors.

Nicholas and Rafael then set about fixing up the problems that were uncovered.

  There's always something to do
  http://xrl.us/k2yf 

Better reporting of TODO tests

Nicholas Clark looked at the unexpurgated version of the output from the test suite and noticed that six tests were marked as unexpectedly succeeding. In test parlance, these tests are called TODO tests, since they show what there is to do. This state of affairs is usually due to a test case that is expected to fail when run, since it exercised a bug in perl that needed to be fixed, and at some point, a source code change caused the failing test to succeed.
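
For readers who have not met them, a minimal sketch of TODO tests using Test::More. The two helper functions are invented stand-ins for a still-open bug and an already-fixed one; they are not taken from the perl test suite.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical stand-in for code with a known, still-open bug.
sub buggy_add { my ($x, $y) = @_; return $x - $y }   # wrong on purpose

# Hypothetical stand-in for code whose bug has since been fixed.
sub fixed_add { my ($x, $y) = @_; return $x + $y }

is(1 + 1, 2, 'an ordinary test');

TODO: {
    local $TODO = 'addition bug still open';

    # Fails, but is reported as a TODO and does not fail the run.
    is(buggy_add(2, 2), 4, 'known bug, expected to fail');

    # Passes: this is the "unexpectedly succeeding" case that a
    # harness can flag, hinting that the TODO marker is stale.
    is(fixed_add(2, 2), 4, 'bug fixed, TODO marker no longer needed');
}
```

Run under a harness, the second TODO test is the kind that would show up as unexpectedly succeeding.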

Nicholas saw that many of the really old regexp bugs that have been fixed had no TODO tests and, in any event, the default, summarised output of the test suite makes no mention of them, so it is not as if anyone would have noticed the improvement.

So firstly, the test harness had to be upgraded to report the summary of TODO tests that succeed, and (much more work) all the open bugs need test cases written for them, so that it becomes easier to see when they have been fixed.

Yves Orton hacked up his copy of Test::Harness to do this. Andy Lester took the idea and applied it to his development version of Test::Harness (see the "New Modules" section below).

Abe Timmerman updated the test smoke kit, in order to get all this new goodness into the hands of the smokers.

  Much ado about todo
  http://xrl.us/k2yg 
  Up in smoke
  rsync -avz source.test-smoke.org::ts-current .

Coverity coverage of CPAN modules

After having read the traffic on p5p concerning the errors that Coverity uncovered, Alan Olsen asked what the possibilities were for having the tests extended to cover CPAN modules with XS components.

Johnathon Stowe realised that it was the output of xsubpp that needs to be tested, rather than the .xs files themselves, and wondered whether all the constructs that xsubpp can emit were in fact being covered, and whether one ought to create a dummy XS module that simply causes xsubpp to emit everything it knows how to.

Tim Jenness thought that that issue should be covered by XS::Typemap. Johnathon did some quick coverage calculations and was surprised to learn that it wasn't too shabby.

Andy Lester volunteered to liaise with the Coverity people to have XS-based modules analysed, should the authors in question care to know the results. Randy W. Sims was concerned that some authors might think of it as a ratings system. Be that as it may, a couple of authors asked for analysis to be applied to their modules.

  http://xrl.us/k2yh 


Patches of Interest

This week, Andy Lester performed some more op_type shrinking in sv.c and dump.c,

  http://xrl.us/k2yi 

and hauled some variables down into tighter scopes in util.c.

  http://xrl.us/k2yj 

212 warnings emitted by gcc-4.2

Marcus Holland-Moritz grew tired of watching an endless list of warnings spew from compiling perl with a recent copy of gcc, so he patched things to get rid of the problems that gave rise to them.

Andy Lester was pleased to hear of the work, since it had been something of an annoyance for him too. He asked for a slightly less monolithic patch, so that different classes of errors could be fixed a bit at a time. Rafael eventually applied all the changes.

  Understanding error messages
  http://xrl.us/k2yk 


Watching the smoke signals

Nicholas Clark looked at a NetBSD smoke report, and wondered what it was that was being tested in ext/B/t/bytecode.t that was failing. Whatever it was, he fixed it with change #27874.

  Smoke [5.9.4] 27855 FAIL(F) netbsd 3.0 (i386/1 cpu)
  http://xrl.us/k2ym 

Steve Peters wondered why a test run was failing simply because TEST was seeing test results being delivered out of order, whereas harness didn't care.

  Smoke [5.9.4] 27939 FAIL(F) MSWin32 WinXP/.Net SP2 (x86/2 cpu)
  http://xrl.us/k2yn 


New and old bugs from RT

op/cmp.t and lib/bigfltpm failures (#5708)

Steve Peters and Johnathon Stowe kicked this bug around, but as neither of them has access to the platform in question it shall have to remain open for the time being.

  OpenServer anyone?
  http://xrl.us/k2yo 

Regex replace loses characters (#24704)

Rafael fixed this bug by accident while working on something else. No-one minded.

  http://xrl.us/k2yp 

In fact, Steve Peters continued his thankless task of trawling through old, open tickets and noticed that a certain number of bugs had been solved by changes committed recently and not so recently.

  Fixed in previous millennium
  http://xrl.us/k2yq 

Sys::Syslog requires \0 terminator in syslog messages (#28019)

Julian Mehnle called in from Debian-land to see what the status on this bug was, explaining that some comments or documentation would help avoid bugs being filed in the future.

  http://xrl.us/k2yr 

threads and require IO causes segmentation fault (#37076)

Nicholas Clark jotted down a couple of notes on how to fix this problem.

  Add it to the TODO
  http://xrl.us/k2ys 

Oxymoronic example in perlvar (#38743)

Steve Peters wondered why Dave's excellent example shouldn't be used to close this ticket.

  http://xrl.us/k2yt 

Text::ParseWords doesn't always handle backslashes correctly (#38904)

John Vromans argued that the following equivalency was incorrect:

  is_deeply([shellwords("aa bb cc\\ ")], ["aa", "bb", "cc "])

Alexey Toptygin delved into the code to find out why and offered a patch to make the behaviour a little more intuitive. Applied by Rafael.

  http://xrl.us/k2yu 
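
To see what your own perl makes of the disputed input, a small sketch using the core Text::ParseWords module. No expected output is given here, since the handling of the trailing backslash-space is exactly what the ticket is about and may differ between versions.

```perl
use strict;
use warnings;
use Text::ParseWords qw(shellwords);

# Same input as in the ticket: the Perl string ends in a
# backslash followed by a space.
my $input = "aa bb cc\\ ";
my @words = shellwords($input);

printf "%d word(s) returned\n", scalar @words;

# Brackets make any trailing whitespace in a word visible.
print "[$_]\n" for grep { defined } @words;
```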

map sometimes uses only the last mapped value (#38935)

Someone on Perlmonks posted an innocuous question about some strange behaviour with map, which turned out to be caused by a change that was applied in 1998. People were surprised that such a bug had remained unnoticed for so long.

  http://xrl.us/k2yv 
  The original thread
  http://www.perlmonks.org/index.pl?node_id=543989 

Configure won't handle versions 5.10.0 or 5.8.10. (#38945)

Andy Dougherty filed a bug on this problem so that people remember to do something about it in time.

  http://xrl.us/k2yw 

Memory leak when calling system 1 foo repeatedly (#38946)

An interesting discussion arose from this report. It turns out that system 1, ... does something interesting under Windows.

  http://xrl.us/k2yx 

Tests fail in 5.8.8 if $TMP is not writable (#38947)

Gabor Szabo noted that certain tests (lib/Memoize/t/tie_ndbm.t) fail if the directory pointed to by $TMP is not writable. He felt that a diagnostic should explain more clearly what the problem is rather than failing out of hand.

  http://xrl.us/k2yy 

Migration Problem from Dynix to Aix (#38951)

Karuppiah Subramaniam has a migration problem. If you have any advice to offer, I'm sure it will be appreciated.

  http://xrl.us/k2yz 

exists error message on wrong argument type is incorrect (#38955)

Jeremy Hetzler wished to clarify the error message received when exists is used incorrectly, and bring it into line with the documentation.

  http://xrl.us/k2y2 

File::Find documentation - is "Don't modify these variables" still valid? (#38965)

Steve Peters tweaked the documentation for File::Find to specify more clearly what happens to $_ in the callback routine.

  http://xrl.us/k2y3 

Perl5 Bug Summary

  9 created and 4 closed = 1543
  http://xrl.us/kw9y 
  Steady as she goes
  http://rt.perl.org/rt3/NoAuth/perl5/Overview.html 


New Core Modules

  • Test-Harness version 2.57_06, by Andy Lester. This enhances the summary result to indicate clearly the number of TODO tests that have unexpectedly begun to succeed (usually due to underlying bugs being fixed).

      http://xrl.us/k2y4 


In Brief

Nicholas Clark carried out his threat to document code references in @INC and source filters and also added a new feature at the same time.

  http://xrl.us/k2y5 
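
For those who have never used the feature, a minimal sketch of a code reference in @INC. The module name My::Virtual and its contents are invented for this illustration, and the sketch shows only the long-standing basic behaviour (return a filehandle to supply the source), not the new feature Nicholas added.

```perl
use strict;
use warnings;

# Hypothetical in-memory "module", keyed by the path that require looks for.
my %virtual = (
    'My/Virtual.pm' => "package My::Virtual; sub answer { 42 } 1;\n",
);

# A code reference in @INC is called with itself and the file name
# being searched for. Returning a filehandle supplies the source;
# returning nothing lets perl carry on down the rest of @INC.
unshift @INC, sub {
    my ($self, $file) = @_;
    my $source = $virtual{$file};
    return unless defined $source;
    open my $fh, '<', \$source or die "cannot open in-memory file: $!";
    return $fh;
};

require My::Virtual;
print "the answer is ", My::Virtual::answer(), "\n";
```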

Paul Johnson read about the is_list_assignment speedup patch from Andy Lester, and pointed the porters to a two-year-old thread on a similar issue.

  http://xrl.us/k2y6 

Nick Ing-Simmons followed up on the issue of leaking file handles in XS code.

  http://xrl.us/k2y7 

Jan Dubois removed some cruft from makedef.pl.

  http://xrl.us/k2y8 

Jarkko Hietaniemi tried a patch to regcomp.c to see if it would silence an error from Coverity. It didn't. This led Jarkko to conclude that if Coverity was too clever, or too stupid, to figure out what was really happening, then maybe it's Red-flag-for-Refactoring time.

  It would help us, frail humans
  http://xrl.us/k2y9 

He then nailed another leak that Coverity found in doop.c.

  http://xrl.us/k2za 

Nicholas Clark saw that Coverity dislikes PerlIO_findFILE. The logic seems a bit tortuous, so maybe that's not so surprising.

  http://xrl.us/k2zb 

Nicholas looked at the last two unreviewed Coverity issues, in regexec.c and wondered whether Coverity was getting confused. Dave Mitchell explained that both issues were false positives.

  http://xrl.us/k2zc 

Alex Waugh provided the required information to support compiling perl on RISC OS.

  http://xrl.us/k2zd 

Andy Lester posted a short script to prune Jarkko's cpd output, to show more clearly where Cut-And-Paste code was happening in areas that interested him.

  http://xrl.us/k2ze 

Yitzchak Scott-Thoennes fixed building perl on Cygwin.

  http://xrl.us/k2zf 

Joshua Juran uploaded an experimental release of Lamp on SourceForge.

  Lamp Ain't Mac POSIX
  http://xrl.us/k2zg 

Andy Lester refactored the excessive use of PM_GETRE() in pp_ctl.c.

  http://xrl.us/k2zh 

Jan Dubois and Steve Hay coordinated the ActiveState changes to win32/Makefile in blead, clearing up an issue concerning 64-bit environments at the same time.

  http://xrl.us/k2zi 

Nicholas Clark explained what he understood Larry's MAD patch to be doing.

  http://xrl.us/k2zj 

The UTF-8 caching code that Nicholas Clark worked on a few months back wound up being exposed on the command-line via the -Ca switch.

  Unless someone has a better idea
  http://xrl.us/k2zk 

Nicholas Clark unearthed what is in hindsight a blindingly obvious memory leak on unthreaded builds between Perl_newCONSTSUB and cv_undef.

  Nobody else knew what to do about it, either.
  http://xrl.us/k2zm 

Andy Lester thought that GvUNIQUE() and its ilk could be removed from the source. Rafael commented that the macros had to remain, since at least Data::Alias on CPAN refers to them.

  http://xrl.us/k2zn 

Ashish Agarwal was having problems with weird characters displayed in the debugger. Joe McGuire thought it was probably one of the thirteen so-called variant characters in EBCDIC.

  \ [ ] { } ^ ~ ! # | $ @ `
  http://xrl.us/k2zo 

Andy Lester cleaned up regexec.c following on from the recent changes.

  http://xrl.us/k2zp 

Rick Delaney discovered that fields.pm had lost its compile-time benefit, dating back to when pseudo-hashes were removed from blead.

  http://xrl.us/k2zq 

Ken Williams asked for advice on some proposed File::Spec changes for VMS, and John E. Malmberg supplied what information he could. Ken lamented how difficult it was to test VMS code if you didn't have access to a VMS box.

  http://xrl.us/k2zr 

Joshua ben Jore thought that the terribly cryptic select((select(OUTPUT_HANDLE), $| = 1)[0]) idiom should be banished from the documentation. Rafael bowed to reason.

  Just because you can
  http://xrl.us/k2zs 
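
For comparison, a sketch of the idiom next to two more readable ways of turning on autoflush. STDERR and STDOUT stand in for OUTPUT_HANDLE here, and the summary does not say which replacement the documentation ended up with, so take these as illustrations only.

```perl
use strict;
use warnings;
use IO::Handle;

# The cryptic one-liner: select the handle, set $| on it, then
# reselect whatever handle was selected before.
select((select(STDERR), $| = 1)[0]);

# The same three steps spelled out.
my $previous = select(STDERR);
$| = 1;
select($previous);

# The IO::Handle method, which says what it means.
STDOUT->autoflush(1);

print "autoflush is on for the selected handle\n" if $|;
```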

The previous summary

The cynics scoffed at the effort expended to clear the Coverity issues, and Rafael pointed out that state variables are almost but not quite yet in blead.

  http://xrl.us/k2zt 

About this summary

This summary was written by David Landgren.

If you want a bookmarklet approach to viewing bugs and change reports, there are a couple of bookmarklets that you might find useful on my page of Perl stuff:

  http://www.landgren.net/perl/ 

Weekly summaries are published on http://use.perl.org/ and posted on a mailing list (subscription: perl5-summary-subscribe@perl.org). The archive is at http://dev.perl.org/perl5/list-summaries/ . Corrections and comments are welcome.

If you found this summary useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl.