This Week on perl5-porters - 15-21 May 2006

grinder on 2006-05-26T09:19:00

"Obviously, that's not supposed to happen. And just to make matters worse, it's deleted all the evidence" -- Andy Dougherty

Topics of Interest

The king is dead

After the 28220th change to the Perforce source repository, Nicholas Clark announced a snapshot for maint, whose main feature is the support for relocatable @INC paths. He mentioned that he had some 1200 patches queued up in his Inbox since October to examine for suitability for merging into maint. This would take several weeks, and then a few more weeks of release candidates, and then 5.8.9 would be released.

When that day comes, Nicholas said he would step down as pumpking, and that Dave Mitchell has volunteered to take over.

  Vive le pompe-roi
  http://xrl.us/mra5 

Building DynaLoader deletes the source tree

Joshua ben Jore was rather alarmed to discover that a recent change caused DynaLoader to delete the source tree in the process of being built, which puts a definite damper on trying to test things afterwards.

Dominic Dunlop suspected that something was amiss with Joshua's source tree, since other smoke reports at the same patch level were not showing anything out of the ordinary. Andy Dougherty isolated a couple of suspect passages in the configuration run that deserved further attention.

Andy's analysis was correct. Joshua found that on Solaris (the platform in question), DynaLoader builds correctly with no threads, or with threads and the gcc compiler mentioned explicitly. Configure a build with threads but let Configure figure out implicitly that gcc should be used... and making DynaLoader will delete the source tree. Nice party trick.

Andy then determined the exact chain of events, and offered a course of action to those of great Configure-fu to stop this from occurring in the future.

  Dynaloader ate my homework
  http://xrl.us/mra6 

All this made Sébastien Aperghis-Tramoni notice that the test suite lacks a specific test script for DynaLoader, so he remedied the situation.

A couple of Sébastien's tests were marked TODO, since can_ok() seemed to have a bit of trouble with AutoLoader's autoloaded functions. chromatic briefly explained how to fix it, but Rafael Garcia-Suarez wasn't sure whether it was the right way, so chromatic elaborated on the concept, after which Rafael was convinced.

  Schwern loses a nickel
  http://xrl.us/mra7 
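
For readers wondering why can_ok() trips over autoloaded functions: can() only reports subs that already exist in the symbol table, and an autoloaded sub does not exist until it is first called. The sketch below illustrates this with a hand-rolled AUTOLOAD in a made-up package called Lazy. Both the package and its %known table are assumptions for the sake of illustration, standing in for AutoLoader; this is neither the code from the thread nor chromatic's fix.

```perl
use strict;
use warnings;

package Lazy;    # hypothetical class, standing in for an AutoLoader user

our $AUTOLOAD;

# Hypothetical table of subs that are only installed on demand.
my %known = ( greet => sub { 'hello' } );

sub new { return bless {}, shift }

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;                  # strip the package qualifier
    return if $name eq 'DESTROY';       # never autoload the destructor
    my $code = $known{$name} or die "no such method: $name";
    no strict 'refs';
    *{"Lazy::$name"} = $code;           # install it, so later calls skip AUTOLOAD
    goto &$code;
}

package main;

my $obj    = Lazy->new;
my $before = Lazy->can('greet') ? 'yes' : 'no';   # not in the symbol table yet
my $result = $obj->greet;                         # first call goes through AUTOLOAD
my $after  = Lazy->can('greet') ? 'yes' : 'no';   # now it has been installed

print "can('greet') before first call: $before\n";
print "result of the call: $result\n";
print "can('greet') after first call: $after\n";
```

A test that calls can_ok() before anything has triggered the autoload will therefore see a method that works but that can() denies all knowledge of.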

chromatic submitted a patch that fixed it all up. Rafael was about to commit it when he realised that the patch used Scalar::Util's blessed() function, but in the context of building the core, it probably hasn't yet been, or may never be, built. So in the end the expedient measure of using ref() was adopted instead.

  Easier than rearranging the build
  http://xrl.us/mra8 

The question of Scalar::Util not being built in turn reminded Randy W. Sims that he had discovered that the latest version of Ubuntu Linux ships without the XS version of Scalar::Util, which has the unfortunate side-effect of breaking svk.

Dave Rolsky thought that having an XS version and a pure-Perl version of the same module but with different feature sets was madness. The fact that weaken() only comes with the XS version is a pain.

  Now you get it, then you don't
  http://xrl.us/mra9 

The right hints for Configure

After having mulled over bug #39149, Dominic Dunlop thought that the message that Configure prints out to explain what hints to use was probably a bit confusing. To confuse the summariser, H.Merijn Brand explained that he had a single Policy.sh file that he uses on all sorts of platforms, from HP-UX to AIX to Cygwin.

After a bout of archaeological prospecting, Dominic discovered one hints file, greenhill.sh, that looks as if it is to be used in conjunction with another primary hints file, and commented that it is probably thoroughly unused as well. This caused Andy Dougherty to reminisce about the old days. He also gave a clear explanation about the purpose, and usefulness, of hints. They are a hack to give Configure a sharp poke in the eye to do something quick and dirty, and this saves you considerable time, since you don't have to delve into its guts to make it do the right thing in a nice cross-platform manner.

At the end of the day, a couple of documentation patches made Configure's intent clearer.

  http://xrl.us/mrba 

Implementing improvements to improve implementations

Randal L. Schwartz thought he was confused about Attribute::Handlers, when in fact he was confused by CHECK and INIT blocks not firing on require statements, and said that he thought the implementation had a couple of holes in its feature matrix.

This led Nicholas to conclude, and it bears repeating in full here:

  • I infer that this is because the people/organisations that need the functionality don't have the time/skills to provide the patch in house, and the people who do have the skills to create such a patch don't have the time or the personal need. This seems to be a general problem with Perl 5 development - there are a lot of firms using Perl to make money (that's fine - that's the idea) but no effective way of pooling resources from those firms back into supporting core development, with the upshot that core development and support is purely done by volunteers on a "best-effort" basis.

At least this time there was a bit more of a discussion. TPF got a mention, and there was a bit of grumbling about how Perl 6 seems to be grabbing the spotlight even though it's still just a research project, whereas Perl 5 is here and now, not dead, no, definitely alive and kicking.

And Merlyn, John Peacock and Joshua ben Jore discussed the problem of require, CHECK and INIT blocks.

  Need to get 5.10 out the door
  http://xrl.us/mrbb 
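
The behaviour that caught Randal out can be reproduced in a few lines. CHECK blocks run once, when the main program has finished compiling, and INIT blocks run once, just before the main program starts executing, so any such block that is compiled later never fires. The sketch below uses a string eval as a stand-in for a file pulled in by require at run time (an assumption made to keep the example in one file); the @fired array is purely illustrative.

```perl
use strict;
use warnings;

our @fired;

BEGIN { push @fired, 'BEGIN (main)' }
CHECK { push @fired, 'CHECK (main)' }
INIT  { push @fired, 'INIT (main)' }

# Code compiled at run time, standing in for a file loaded by require.
{
    # Hide the "Too late to run CHECK/INIT block" warnings perl emits here.
    local $SIG{__WARN__} = sub { };
    eval q{
        BEGIN { push @fired, 'BEGIN (late)' }
        CHECK { push @fired, 'CHECK (late)' }
        INIT  { push @fired, 'INIT (late)' }
        1;
    } or die $@;
}

print "$_\n" for @fired;
```

Only the late BEGIN block makes it onto the list. A module that relies on CHECK or INIT to do its set-up will thus work when loaded with use, and silently do nothing when loaded with a run-time require.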

documenting %^H and lexical pragmas

Rafael Garcia-Suarez had thought that the %^H section in perlvar would be a suitable place to deal with documenting the new user-level lexical pragmata. Nicholas Clark looked at the existing text and concluded that the best thing to do would be to start again with a clean slate.

Yitzchak Scott-Thoennes side-stepped the issue, and suggested that perlpragma.pod would be an even better place to document all this. Since no-one else came up with anything suitable to get the ball rolling, Nicholas Clark landed a first cut.

  use reason;
  http://xrl.us/mrbc 

perlapio and PerlIO_binmode()

Matthew Byng-Maddick was having trouble marrying the output from a truss/strace-type program with Devel::DProf. He wanted to be able to see exactly where system calls were coming from. His attempts to observe were interfering with what he was trying to measure.

In the process of trying to get the thing to work in a reasonable manner he discovered some inconsistencies in the documentation and asked for advice.

  Warnocked ye were, and Warnocked ye be
  http://xrl.us/mrbd 

Perl_PerlIO_context_layers() and PerlIO_apply_layers()

In other PerlIO news, Yves Orton said that he was having trouble with building recent bleads, and poked and prodded at the code, and managed to get it into a reasonably sane state, albeit with some odd failures in the test suite.

Rafael and Steve Hay twiddled a few dials on the big machine and eventually all the errors went away.

  http://xrl.us/mrbe 

Performance in regular expressions

H.Merijn Brand said that at the Dutch Perl Workshop, Juerd and he talked about the fact that /[x]/ is not optimised to /x/, but that sometimes the character class matches faster than the literal, which seems counter-intuitive.

Yves Orton explained why things were the way they were, and in this particular case, it was apparently blind luck as much as anything else. Yves was interested in adding a single character class to literal conversion in the compiler, since character classes cause the new trie code to be skipped, and that would give more patterns a chance to be trie'd.

  Promise of a classless society
  http://xrl.us/mrbf 
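
Anyone wanting to check the claim on their own perl can do so with the core Benchmark module. This is only a rough sketch: the subject string is made up, the result will vary with the perl version and the data, and on a perl that already folds /[x]/ into /x/ (see the next item) no meaningful difference is to be expected.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical subject string: the target character sits at the very end.
my $haystack = ( 'a' x 10_000 ) . 'x';

my $literal = qr/x/;
my $class   = qr/[x]/;

# The two forms must agree on what they match before timing means anything.
die "patterns disagree"
    unless $haystack =~ $literal and $haystack =~ $class;

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    literal => sub { $haystack =~ $literal },
    class   => sub { $haystack =~ $class },
} );
```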

optimize /[x]/ to /x/

So Yves figured out how to get the compiler to do just that and bundled it up into a shiny patch and tests which were applied by Dave. The good thing about this is that it appears that there is now a third person, along with Dave and Hugo van der Sanden, who can do battle with the C code of the regexp engine... and emerge victorious.

  Hairy C code 0, Yves 1
  http://xrl.us/mrbg 

Exploring userelocatableinc

Nicholas wrapped up the support for relocating @INC.

  http://xrl.us/mrbh 

Later on, Marcus Holland-Moritz discovered that $Config{startperl} is wrong if userelocatableinc is undefined. Nicholas was thrilled, as it meant that all seven people who had downloaded the latest maint snapshot (see above) had not tested it. But he fixed it anyway.

H.Merijn wondered if it should be included in his smoke configuration.

  http://xrl.us/mrbi 

The continuing threads saga

David Nicol mapped out a mechanism for linked-list stacks and queues, in the context of last week's "delivering signals to threads" thread. This received no discussion, I think because the point that Dave was trying to make initially was that no one would want to have to have this sort of machinery in the first place.

  http://xrl.us/mrbj 

Jerry found time to craft a patch to bring blead up to threads version 1.28, and this was applied by Rafael.

  http://xrl.us/mrbk 

Jerry wondered if threads in BEGIN blocks were safe to use. The documentation says <blink>Don't Do That</blink>, but apparently it seems to work just fine.

  http://xrl.us/mrbm 

Jerry then discovered why creating threads in BEGIN leads to "Attempt to free unreferenced scalar" warnings, and suggested a one-line fix that would solve the problem, but wanted to know whether this would produce any unwanted side effects, playing around with reference counting as it does.

  Mmmm... dunno
  http://xrl.us/mrbn 

Since no one could think of any possible harm that Jerry's suggestion could cause, he crafted another patch to fix the problem, and Dave applied it.

  Gentlemen, begin your threads
  http://xrl.us/mrbo 

Jerry then finished up adding an explicit thread context mechanism, which Rafael also applied.

  http://xrl.us/mrbp 

Dual-lifed modules that give CPAN grief

Peter Scott remarked that Devel::Peek's version number was higher in core than CPAN, and that this caused problems when upgrading CPAN. Nicholas suggested Peter contact Ilya Zakharevich, the author, directly. Rafael wondered whether it made any sense to dual-life the module at all, since it tends to be tied quite intimately with the internals. Peter said that Ilya said that the problem was with CPAN.pm.

  http://xrl.us/mrbq 

Peter also found that Data::Dumper, Devel::DProf and Filter do not configure themselves correctly, which causes them to be installed under site_perl instead of the core directories.

  http://xrl.us/mrbr 


Patches of Interest

my_snprintf

Following on from the discussion last week, where Nicholas Clark opined that it would be good to probe for variadic macro support and to use such macros if available, it just so happens that Configure was tweaked to do just that.

So Jarkko Hietaniemi redid his patch to take this into account and threw in a number of safety checks at the same time.

  Better and better
  http://xrl.us/mrbs 

Strange encodings upsets pp_chr

The subject of this item should be in the past tense, since Sadahiro Tomoyuki worked on the matter, and sent in a patch to make pp_chr happy. As a bonus, associated test scripts were made EBCDIC-friendly. This in turn made Rafael happy.

  http://xrl.us/mrbt 

sv_pos_b2u dislikes the extended UTF-8

Tomoyuki also fixed up sv_pos_b2u_forwards to behave more responsibly in the face of characters residing in Perl's UTF-8 extension space (by avoiding an expensive function call merely to figure out a length). He then noticed that S_sv_pos_b2u_forwards looks as if it does the same thing as the public Perl_utf8_length function, and wondered if the latter should not be used instead.

Carrying on in this one-person thread Tomoyuki decided the current approach was a complete mess (indeed the C comments scream out about needing to be fixed). So he fixed it.

  But not yet applied
  http://xrl.us/mrbu 

Andy Lester looked at S_bytes_to_uni and noticed that it could be made context-free and tidied up an unused variable in Perl_refcounted_he_fetch. Applied by Rafael.

  http://xrl.us/mrbv 

S_reguni should return its length

Elsewhere, Andy thought that it was rather silly of S_reguni to return its length via a pointer to an integer, and that returning the value on the stack would make the intent a lot clearer. Agreed to and applied by Rafael.

  http://xrl.us/mrbw 

Signature change of SvVOK()

John Peacock sat up in surprise after stumbling across a patch committed by Nicholas back in January, that changed the signature of SvVOK(). The idea was to change from returning 0 or 1, to 0 or valid-pointer, which in turn cuts down on needless mg_find calls.

As John has to mimic this behaviour in version.pm, he was hoping for a little moral support on the issue. Support was freely given, and appeared to consist of an inordinate number of tweaks to header files and Devel::PPPort to get just right.

  Asleep at the wheel
  http://xrl.us/mrbx 

No more S_regoptail

Andy Lester noticed that S_regoptail is called but once in regcomp.c, so he inlined it, which in turn meant that the code that called it was also able to be simplified further. Applied.

  Cascading goodness
  http://xrl.us/mrby 

Andy then undertook some refactoring of reghops, but this was not applied, despite the fact that the patch featured genuine parameter consting.

  Not enough goodness
  http://xrl.us/mrbz 

He finally attempted a pp_sys cleanup, but following the discovery that there are no tests in the test suite that actually exercise the code paths in question, Andy pulled it back onto the bench to take another look.

  http://xrl.us/mrb2 

After a revision, the second time around things looked much better.

  http://xrl.us/mrb3 

Jarkko was horrified when he realised that his recent strlcat work was bogus, goofy and overkill, although probably not exactly dangerous. Steve Peters admitted that some of the blame was his own.

  http://xrl.us/mrb4 


Watching the smoke signals

Smoke [5.8.8] 28211 FAIL(XM) MSWin32 WinXP/.Net SP2 (x86/2 cpu)

Something went wrong during configuration, so Nicholas fixed that. Other things were going wrong too, but appeared to fix themselves autonomously.

  Just one of those things, I guess.
  http://xrl.us/mrb5 


New and old bugs from RT

What Steve Peters did this week

Noted that the desire that CGI multipart should support nph parameters (#24542) had been met with CGI version 3.05.

  http://xrl.us/mrb6 

Realised that the fact that submit() of CGI.pm generates warning if -sticky used (#24760) was no longer true, at least as of CGI version 3.20.

  http://xrl.us/mrb7 

Pointed out that no longer does CGI.pm autoloading lose $@ (#30325), thereby closing a third CGI issue.

  http://xrl.us/mrb8 

Renamed a file because a test case name was too long (#38645), which should make Stratus VOS users happy.

  Shorter is better
  http://xrl.us/mrb9 

SEGV with complicated regexp and long string (#32041)

was resolved by Dave Mitchell, who fixed up an integer overflow negative wrap-around bug.

  http://xrl.us/mrca 

Perl segfaults; test case available (#32332)

was also resolved by Dave Mitchell, this time adding the required make-work code to keep reference counting happy.

  http://xrl.us/mrcb 

many threads leads to various crashes (#37652)

Jerry D. Hedden remarked that the biggest problem with the example code in this bug report was that it spawned threads so fast and furiously, that perl never had a chance to catch its breath and do the required housekeeping, so it was little wonder that it ran out of memory.

Adding a brief sleep to the script seemed to help it considerably, but even then there's still a bit of a resource leak on Windows that will eventually take out the program, after some two million threads have been created.

  Take a short nap
  http://xrl.us/mrcc 

Problems building on Solaris 8 (#38664)

Andy Dougherty followed up on this bug, offering some tips on a healthy configuration specification.

  Get it in writing
  http://xrl.us/mrcd 

SvPOK breaks scalar magic in 5.8.x (#38707)

Dave Mitchell could not figure out how mere bit-testing macros could interfere with magic, and asked for more code, guessing that the problem was really elsewhere. Craig DeForest said he'd try and come up with a small test case.

  http://xrl.us/mrce 

Threads calling LWP causes exception (#38712)

Dave Mitchell suggested taking this up with the LWP team, since LWP isn't in the core.

  Unsafe unless proven otherwise
  http://xrl.us/mrcf 

Regexp optimizer loses its hopes too soon (#39096)

Dave Mitchell and Mike Guy followed up on this thread, which shows how two out of three seemingly identical regular expressions are dispatched by the engine with utmost speed, but the third gets dragged down into a mess of exponential back-tracking.

It would appear that there is scope within the engine to identify the third expression as equivalent. However, Dave didn't wish to commit to a date as to when that might occur.

  Nested parens bad, m'kay?
  http://xrl.us/mrcg 

sprintf with UTF-8 format string and ISO-8859-1 variables redux (#39126)

Sadahiro Tomoyuki took a closer look at this problem. Firstly, he managed to produce a small test case that provoked the bug. Secondly, this allowed him to narrow the offending code down to a section in Perl_sv_vcatpvfn.

Unfortunately, the solution wasn't obvious, apparently one more problem relating to the disconnect between bytes and characters. Fortunately, he was able to cook up an appropriate patch, and as an added bonus, provided a test that exercises the problem in both ASCII and EBCDIC character sets.

  http://xrl.us/mrch 

failure not always detected in IPC::Open2::open2 (#39127)

A lengthy thread developed on this, as Steve Peters tried to explain how things work from Unix's and Perl's point of view and Vincent Lefevre tried to explain how things were not working from his point of view. At the end of the week, no agreement had been reached.

  You just have to wait
  http://xrl.us/mrci 

h2ph generates incorrect code for #if defined A|| defined B (#39130)

Rafael applied the suggested patch to blead and suggested that Nicholas do as much for maint.

The thread then segued into the observation that you can actually stuff just about anything into a perl AV array slot. Jan Dubois confirmed that this was true, but worked only as long as you accessed the contents within XS. Try to do as much in Perl code and the hammer comes down, smashing your program into tiny pieces.

  Just because you can, doesn't mean you can
  http://xrl.us/mrcj 

Lots of warnings with diagnostics and (warn or die) (#39141)

Fitz Elliott noted that a bare warn "\n" spews large amounts of Use of uninitialized value in substitution warnings, and suggested a fix. Dave Mitchell used a slightly different technique than Fitz's to patch diagnostics.pm.

  You MUST believe the error message
  http://xrl.us/mrck 

Unable to make Perl 5.8.8 on HP-UX 11.11 (#39143)

Jim Duffield continued to make little progress in getting 5.8.8 to work to his satisfaction on HP-UX. As the goal was to be able to use perlcc, Joshua ben Jore suggested using PAR instead, which is probably the best solution.

  http://xrl.us/mrcm 

Win32, @_ and fork crashing in dounwind (#39145)

Brad Bowman showed that sub { @_ = 3; fork ? die 5 : die 6 }->(2) gives Win32 considerable pain. Steve Hay was able to reproduce it on Win32 in blead, but wondered if anyone in Unix-land was able to do the same.

It boils down to a problem with the way fork is emulated on Win32, through a lot of code that simply never gets exercised on Unix. Jan Dubois pointed to a little-known PERL_SYNC_FORK trick that could be used to serialise the fork executions, although it probably hasn't been used in the past five years, and may have suffered bitrot.

Dave Mitchell took a wild shot in the dark, Steve Hay tried the suggestion, and as usual, Dave had called the play correctly.

  http://xrl.us/mrcn 

Perl 5.8.8 configure failure (#39149)

Scott McAskill was having trouble configuring Perl on an aging Tru64 machine. Andy Dougherty, despite knowing next to nothing about that platform, was nonetheless able to provide enough information to help Scott get up and running. Ideally there's something that should be tweaked in the hints file, but for the time being it looks like the problem was solved.

  http://xrl.us/mrco 

diagnostics.pm: -traceonly vs -trace (#39152)

Julian Mehnle was puzzled by a discrepancy in the documentation, and had to read the source to figure out what was really going on. He thought that the best thing to do was to correct the documentation, so that someone else would not fall into the same trap.

James Mastros suggested that the optimal solution would be to align the code with the documentation, in a way that was both backwards and forwards compatible. Fergal Daly admitted to being the guilty party responsible for the problem in the first place, and cooked up a patch that followed James's suggestion. Applied by Rafael.

  http://xrl.us/mrcp 

Segmentation fault on simple regexp with string larger than 29kB (#39167)

Krzysztof Leszczynski isolated an innocuous regular expression in the YAML distribution that blows the stack on a sufficiently long string. Dave Mitchell and Dominic Dunlop explained the story of Perl's recursive-but-now-iterative regular expression engine.

  One more reason
  http://xrl.us/mrcq 

Do not recommend Switch.pm in perlfaq (#39170)

Slaven Rezic thought that the FAQ entry concerning how to write a switch statement à la C should not mention the Switch module (due to the weird syntax errors it can introduce into otherwise sane code, because of its source filter nature). He wanted to point out that in 5.10 one will be able to use the perl6-ish given/when construct.

Abigail thought it was pretty silly to recommend this latter point, since there is no firm date available as to when 5.10 will ship.

  Hopefully sooner rather than later
  http://xrl.us/mrcr 

Perl5 Bug Summary

  http://rt.perl.org/rt3/NoAuth/perl5/Overview.html 


New Core Modules

  • IO::Compress::* version 2.000_12 proposed by Paul Marquess and accepted by Steve Peters.

      f y cn rd ths, y nd t gt lf
      http://xrl.us/mrcs 
  • version version 0.60 from John Peacock syncs CPAN with blead.

      http://xrl.us/mrct 

    And gets it working even betterer than before.

      http://xrl.us/mrcu 


In Brief

Nick Ing-Simmons provided a thoughtful follow-up to the question of whether a FileHandle is IO::Seekable.

  http://xrl.us/mrcv 

Nicholas Clark confirmed, following on from the internal error in Bytecode.pm bug report (#39110), that Bytecode is indeed unsupported, since none of the (volunteer) core developers use this experimental module in the normal course of events. It is thus unlikely to receive any attention in the near future.

  Any itchiness will remain unscratched
  http://xrl.us/mrcw 

Jerry D. Hedden reported that he had been using the reordered SV flags for a few months now, with no ill effect.

  But they ain't maint compatible
  http://xrl.us/mrcx 

Joshua ben Jore landed a large set of shiny B::Lint changes, saying they were good enough for blead.

  Believed to be maint compatible
  http://xrl.us/mrcy 

Dave Mitchell thought that change #28183 had broken 64-bit builds. Jarkko Hietaniemi managed to flog off a patch on the cheap to fix it up, but the after-sales service nearly drove him round the bend.

  http://xrl.us/mrcz 

Scott Carroll wanted to know more about Storable's license and copyright status.

  This program is free software
  http://xrl.us/mrc2 

Jakob Bjeggaard had a question about Data::Dumper not dumping a blessed object correctly. Yves Orton explained that it cannot really hope to be able to dump an inside-out or an XS-defined object correctly. Such objects need to provide their own freeze/thaw methods to do this properly.

  http://xrl.us/mrc3 
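
Yves's point about inside-out objects is easy to see in miniature. In the sketch below, Counter is a made-up inside-out class (an assumption for illustration, not code from the thread): the blessed reference is an empty scalar, and the object's data lives in a lexical hash inside the class, keyed by the reference's address. Data::Dumper only walks the reference it is handed, so it has no way of finding that data. The example skips the DESTROY clean-up a real inside-out class would need.

```perl
use strict;
use warnings;
use Data::Dumper;
use Scalar::Util ();

package Counter;    # hypothetical inside-out class

my %count;          # per-object state lives here, keyed by address

sub new {
    my ( $class, $start ) = @_;
    my $self = bless \( my $anon ), $class;    # the object itself holds nothing
    $count{ Scalar::Util::refaddr($self) } = $start;
    return $self;
}

sub count { return $count{ Scalar::Util::refaddr( $_[0] ) } }

package main;

my $c    = Counter->new(42);
my $dump = Dumper($c);

print "count via the accessor: ", $c->count, "\n";
print "what Data::Dumper sees:\n$dump";
```

The accessor reports the stored value, while the dump shows only an empty blessed scalar, which is why such classes have to supply their own serialisation hooks.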

The Perforce server downtime should always be arranged to coincide with London Perl Monger meetings.

  http://xrl.us/mrc4 

Dave Mitchell made Devel::Peek dump LVs and GVs, following on from the big SV internals restructuring a while back.

  http://xrl.us/mrc5 

He also saw that assigning whole (hash|array) to a tied (hash|array) doesn't mangle SvTYPE, at least, not in blead.

  http://xrl.us/mrc6 

And explained what exactly DEBUG_LEAKING_SCALARS does, and why you might want to use it.

  http://xrl.us/mrc7 

Daniel Frederick Crisman suggested a way to restructure the quote-like operators section in perlop. The patch appeared to move a lot of stuff around, which may explain why people's eyes glazed over.

  The curse of Warnock
  http://xrl.us/mrc8 

Yves fiddled with win32/buildext.pl to handle inclusions and not just exclusions, in order to minimise the number of extensions that were built needlessly while he was performing open heart surgery on the core. He wasn't particularly insistent about having it applied, but Steve Peters did so anyway.

  http://xrl.us/mrc9 

Last week's summary

I got the part about chromatic's sv_derived_from blues wrong. It is code that calls UNIVERSAL::isa() and UNIVERSAL::can() directly as functions that breaks things.

  http://xrl.us/mrda 

About this summary

This summary was written by David Landgren.

If you want a bookmarklet approach to viewing bugs and change reports, there are a couple of bookmarklets that you might find useful on my page of Perl stuff:

  http://www.landgren.net/perl/ 

Weekly summaries are published on http://use.perl.org/ and posted on a mailing list (subscription: perl5-summary-subscribe@perl.org). The archive is at http://dev.perl.org/perl5/list-summaries/ . Corrections and comments are welcome.

If you found this summary useful or enjoyable, please consider offering Nicholas Clark a job. A nice one, with a swivel chair.