This Week on perl5-porters - 1-7 May 2006

grinder on 2006-05-12T12:35:00

Let's change perl's internal hash function, shall we?

Topics of Interest

The Win32/Wince leveraged buy-out merger

Vadim reminded the patch-meisters (and clearly the man has been revising his classics) to apply the large Win32 patch that Yves Orton had posted, so that they could continue the process.

  Smelly socks!
  http://xrl.us/mekd 

Which must have been applied, for a little while later Vadim posted further tweaks to the Wince side of things.

  One less source file!
  http://xrl.us/meke 

As well as some Makefile.ce goodness.

  http://xrl.us/mekf 

state variables featured in blead

Rafael Garcia-Suarez ironed out some of the bigger show-stoppers in his state variable work, and was thus able to commit the first draft to blead. Arrays and hashes aren't done yet, but as of now, we have an elegant solution to the perennial my $var if 0 problem.

Nicholas Clark was impressed, and tried out a snippet of code that Rafael turned into a test for a brand new t/op/state.t.

  Coup d'état
  http://xrl.us/mekg 
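
For readers who never met the my $var if 0 trick: it was used to get a lexical that kept its value between calls to a sub. The following is only a minimal sketch of what replaces it, assuming a perl that ships the state feature. The sub names are made up for illustration and do not come from Rafael's patch or from the new test file.

```perl
use strict;
use warnings;
use feature 'state';   # assumption: a perl recent enough to have this feature

# Before state variables, the clean workaround for "my $n if 0;" was a
# closure over a lexical declared in an enclosing block.
{
    my $n = 0;
    sub next_id_closure { return ++$n }
}

# With a state variable the initialisation runs once, on the first call,
# and the value then persists across later calls.
sub next_id_state {
    state $n = 0;
    return ++$n;
}
```

Calling either sub three times in a row should hand back 1, 2 and 3. The state version keeps the counter inside the sub that uses it, which is the elegance the summary refers to.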

Is socketpair available everywhere?

Jan Dubois was looking at the section in perlport where it says that socketpair is not available on Win32 and a number of other platforms. He thought that it was currently emulated on all platforms that did not provide a native implementation. If so, it would mean that the section could be quietly dropped.

Alex Waugh mentioned that it's available natively on RISC OS, so if nothing else, that could be removed from the list.

  Going native
  http://xrl.us/mekh 
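
As a reminder of what is at stake, here is a small sketch of socketpair in use. It assumes a platform where AF_UNIX socket pairs work (natively or through emulation); the helper name echo_through_socketpair is invented for this example.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use IO::Handle;   # for the autoflush method on older perls

# Create a connected pair of sockets, write a line into one end and
# read it back from the other, all within a single process.
sub echo_through_socketpair {
    my ($message) = @_;

    socketpair(my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair failed: $!";

    $left->autoflush(1);            # push the data out immediately
    print {$left} "$message\n";

    my $line = <$right>;            # the same line arrives at the other end
    close $left;
    close $right;

    chomp $line;
    return $line;
}
```

In real code the two ends usually live on either side of a fork, giving parent and child a bidirectional channel.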

Support for inside-out classes

Anno Siegel delivered a two part patch to improve the implementation of inside-out classes. Part one deals with perl itself, and the second part reaches out into the core library and expands Hash::Util. Anno also provided a demo to show how it worked.

Yves Orton was quite enthusiastic; Randy W. Sims markedly less so. David Golden had a close look at the patch and explained clearly what he thought was going on, and pointed out some issues where the implementation was a bit rough around the edges.

Yuval Kogman thought the patch was maybe not general enough, and suggested some avenues to explore, other than just Inside-Out objects. David Golden and Abigail discussed issues such as backwards compatibility at length.

Rafael announced that he was happy with the intent of the patch. Nicholas Clark caught an error that Anno set about correcting.

  http://xrl.us/meki 

Obligingly, Anno summarised where he was going from here with the patch. In the end he retired the patch and set about doing something better.

  http://xrl.us/mekj 
  "U" magic in February
  http://xrl.us/j63s 
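
For context, this is roughly what a hand-rolled inside-out class looks like today. It is a sketch of the classic technique only, not of Anno's patch, and the Counter class and its method names are invented for the example.

```perl
use strict;
use warnings;

{
    package Counter;
    use Scalar::Util qw(refaddr);

    # Attribute storage lives in the class, not in the object: one hash
    # per attribute, keyed by the address of each object.
    my %count_of;

    sub new {
        my ($class, $start) = @_;
        # The object itself is just an empty blessed scalar.
        my $self = bless \(my $anonymous_scalar), $class;
        $count_of{ refaddr $self } = defined $start ? $start : 0;
        return $self;
    }

    sub increment { my ($self) = @_; return ++$count_of{ refaddr $self } }
    sub value     { my ($self) = @_; return $count_of{ refaddr $self } }

    # Manual cleanup: without it the entry would outlive the object.
    sub DESTROY {
        my ($self) = @_;
        delete $count_of{ refaddr $self };
        return;
    }

    # Helper for the example: how many objects still have stored data.
    sub live_objects { return scalar keys %count_of }
}
```

The bookkeeping by address and the mandatory DESTROY are the parts every author has to get right by hand, which is presumably why support from the core is attractive.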

Hashing algorithms

Jarkko Hietaniemi was looking at hash functions, and stumbled across a page full of them. Perl uses the "One-at-a-Time" algorithm. Except that we don't, although the comments say we do. Except, if you have enough coffee and look again, we really do.

Yves Orton benchmarked a series of hashing algorithms to see what they did for perl. At a first glance, it looked like "One-at-a-Time" was rather slow, especially in comparison to another hashing function named (surprise!) SuperFastHash.

Much of the improvement is dependent on the length of the hash keys. Short keys show little difference, long keys can show spectacular improvements.

Michael Schroeder weighed in with Buzhash, which, at the expense of a table of 256 ints, boils down to a rotate and xor per character.

  Total Hash Controversy
  http://xrl.us/mekk 
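
To give an idea of the work done per character, here is Bob Jenkins's "One-at-a-Time" function transcribed into Perl. This is a sketch for illustration: perl's own version is a C macro and is not reproduced here, and the code assumes a perl built with 64-bit integers so that the intermediate sums can be masked back to 32 bits safely.

```perl
use strict;
use warnings;

# One-at-a-Time hash over the bytes of $key, returning a 32-bit value.
# Assumption: 64-bit integers, so sums cannot overflow before masking.
sub one_at_a_time {
    my ($key) = @_;
    my $hash = 0;

    for my $byte (unpack 'C*', $key) {
        $hash = ($hash + $byte)         & 0xFFFFFFFF;
        $hash = ($hash + ($hash << 10)) & 0xFFFFFFFF;
        $hash ^= $hash >> 6;
    }

    # Final avalanche, performed once per key.
    $hash = ($hash + ($hash << 3))  & 0xFFFFFFFF;
    $hash ^= $hash >> 11;
    $hash = ($hash + ($hash << 15)) & 0xFFFFFFFF;

    return $hash;
}
```

For the one-byte key 'a' this yields 0xca2e9442. The loop body runs once per byte, which fits the observation that key length governs how much a faster function can gain.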

Don't kill select((select(OUTPUT_HANDLE), $| = 1)[0])

Abigail made an impassioned plea to keep this tricky construct in the documentation. The main point was that it would be of use to maintenance programmers, not necessarily to people starting out. A few people argued the issue back and forth, and Abigail maintained that a three-line snippet would be preferable, since people could pick it apart and thus learn more easily about the particular parts they are unsure of.

  http://xrl.us/mekm 
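
For anyone who has to pick the construct apart, here are the two spellings side by side. This is a sketch with invented sub names; the third helper exists only so the effect can be checked.

```perl
use strict;
use warnings;

# Long form: one step per line.
sub unbuffer_long {
    my ($fh) = @_;
    my $previous = select($fh);   # make $fh the default, remember the old one
    $| = 1;                       # $| acts on the currently selected handle
    select($previous);            # put the old default back
    return;
}

# Compact form: the inner select returns the old handle, $| = 1 runs
# while $fh is selected, and [0] hands the old handle to the outer select.
sub unbuffer_short {
    my ($fh) = @_;
    select((select($fh), $| = 1)[0]);
    return;
}

# Report whether autoflush is switched on for a given handle.
sub autoflush_is_on {
    my ($fh) = @_;
    my $previous = select($fh);
    my $state = $|;
    select($previous);
    return $state ? 1 : 0;
}
```

Both subs leave the default output handle as they found it. Code that already loads IO::Handle can get the same effect with $fh->autoflush(1).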


Patches of Interest

threads patches

Jerry D. Hedden sent in a new version of a patch to threads, this time with much less drastic whitespace reformatting. This time around it was applied.

  http://xrl.us/mekn 

... and another to add stack size support, which was the main reason why Jerry started to work on all of this in the first place. Applied as well.

  http://xrl.us/meko 

What Andy Lester did this week

Andy Lester forwarded a patch last week to tidy up one-line loops, specifically, bringing the lone semicolon down to the next line and using a NOOP macro. Said macro can then be used to squirrel away the various comments that use secret handshakes to keep source analysis tools happy.

Rafael hated it. Nicholas Clark proposed a semantically equivalent alternative that met with the Andy seal of approval, and so Andy went back to refactor using the new approach.

  It's overkill of course
  http://xrl.us/mekp 

He came back after a while with new versions of doio.c and dump.c and wanted to know what people thought of them. Rafael must have liked it, since he applied the patch.

  http://xrl.us/mekq 

Along with a nice micro-optimisation for S_find_array_subscript.

  http://xrl.us/mekr 

Andy then thought he could remove a goto in a section of code in op.c, since the only thing it does is jump over a single if block. The concept was perhaps sufficiently scary for no-one to dare apply it to see what would happen.

  http://xrl.us/meks 

What Jarkko Hietaniemi did this week

Jarkko tweaked hv.c to use a safer approach to performing the task of zeroing out memory.

  Know your macros
  http://xrl.us/mekt 

And cast his eyes upon pp_sys.c.

  http://xrl.us/meku 

And a tweak to shave some memory off the size of a microperl.

  Smaller is better
  http://xrl.us/mekv 

Jarkko also added some enhancements to the PERL_MEM_LOG infrastructure, which allows very detailed logging of memory allocations. Jarkko hoped that someone would feel sufficiently inspired to write some code to munge the output and produce some pretty pictures to help gather a better understanding of allocation patterns.

  Your name in lights
  http://xrl.us/mekw 

Various t/op/* tests are reviewed

...by yours truly, and patches applied by Rafael.

Historians may find it of interest that the patch to context.t was the first ever since its inclusion, following a bug report from François Désarmenien nearly six years ago.

  t/op/context.t
  http://xrl.us/mekx 
  t/op/grep.t
  http://xrl.us/meky 
  t/op/list.t
  http://xrl.us/mekz 

And in other news, your summariser ran into a minor problem of resource acquisition (not enough semaphores available) to run a couple of tests in blead's suite, and sent patches to fix that up as well.

  ext/IPC/SysV/t/ipcsysv.t
  http://xrl.us/mek2 
  ext/IPC/SysV/t/sem.t
  http://xrl.us/mek3 

In doing so, another minor problem arose with fold_constants JMPENV_PUSH panics. Dave Mitchell sorted this out, but you'll have to wait for next week's exciting episode to find out how (this is due to a severe warp in the space-time continuum, in that the summary is really late this week).

  That's not supposed to happen
  http://xrl.us/mek4 


Watching the smoke signals

Smoke [5.9.4] 28069 FAIL(M) MSWin32 WinXP/.Net SP2 (x86/2 cpu)

The recent work done by Yves and Vadim blew away a couple of assumptions. Sadahiro Tomoyuki identified the source of the problem. A brief flurry of patches had it fixed up quickly.

  http://xrl.us/mek5 

Smoke [5.9.4] 28108 FAIL(XM) MSWin32 WinXP/.Net SP2 (x86/2 cpu)

Another problem showed up, this time relating to threads.

  http://xrl.us/mek6 

Smoke [5.8.8] 28115 FAIL(M) MSWin32 WinXP/.Net SP2 (x86/2 cpu)

And a final one, where perl was built and yet managed to forget to link in a Perl_* function, because the export list for the linker had become messed up. Sorted out by Steve Hay and Nicholas Clark.

  http://xrl.us/mek7 


New and old bugs from RT

Change 22258 causes test failures on AFS (#38698)

A recent change to ext/IO/t/io_unix.t to permit a fall-back to /tmp if socket creation fails in the current directory causes grief on AFS file systems. People tried to figure out how to work around this added twist. Mike Guy wondered if it wouldn't be easier to figure out why the socket was failing in the current directory in the first place; then maybe people wouldn't have to deal with this new case.

  http://xrl.us/mek8 

rcatline doesn't stringify references (#39037)

  and maybe it should
  http://xrl.us/mek9 

5.8.8 lib/ExtUtils/t failures (#39055)

pal@hp sent in some patches to make ExtUtils more robust in the face of super-fast hardware.

  http://xrl.us/mema 

readline the last line with no newline (#39060)

Mark Martinec observed that when the last line of a file lacked a newline, he would get Bad file descriptor errors, and supplied some code to show how to reproduce the problem.

  http://xrl.us/memb 

Data::Dumper and numeric scalars (#39062)

Brad Baxter noticed that Data::Dumper would sometimes quote a scalar that would otherwise not need quotes, as it was numeric. Unfortunately, he wasn't able to pin the change in behaviour down any more accurately than between 5.6.1 and 5.8.7.

  http://xrl.us/memc 
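
Anyone wanting to narrow this down can compare how their own perl dumps a number and a string that merely looks like one. The helper below is a hypothetical sketch, not Brad's test case, and what comes out for the string case depends on the Data::Dumper version in use.

```perl
use strict;
use warnings;
use Data::Dumper;

# Dump a single scalar on one line, without the "$VAR1 = " prefix, so
# that quoting differences are easy to spot.
sub dump_scalar {
    my ($value) = @_;
    local $Data::Dumper::Terse  = 1;   # drop the variable name
    local $Data::Dumper::Indent = 0;   # keep everything on one line
    return Dumper($value);
}
```

Running dump_scalar(42), dump_scalar("42") and dump_scalar on a number that has previously been used as a string, under each perl at hand, is one way to see where the quoting changes.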

non-portable static libraries on AMD64 Linux (#39068)

This was a problem related to linking with -fpic as opposed to -fPIC. Debate raged over how to get it sorted out in Configure.

  http://xrl.us/memd 

Choosing vendor install paths (#39069)

A similar problem, for which the build infrastructure offers a couple of alternatives.

  http://xrl.us/memf 

Perl locale disagrees with Linux sort (#39087)

When locales are in use, Perl doesn't sort the same way as the GNU sort program in Linux, nor PostgreSQL. Sadahiro Tomoyuki explained that perl isn't making this stuff up as it goes along, merely using whatever C's strxfrm function cares to return.

  So what are the others using?
  http://xrl.us/memg 
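
A quick way to see the effect on a given machine is to sort the same list with and without the locale pragma. This is a sketch with invented sub names; which orderings differ depends entirely on the locales installed and on what the C library returns for them.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_COLLATE);

# Sort using the collation rules of the current LC_COLLATE locale.
sub sort_with_locale {
    my @words = @_;
    use locale;                       # cmp now defers to the C library
    my @sorted = sort { $a cmp $b } @words;
    return @sorted;
}

# Sort by plain character code, ignoring the locale.
sub sort_by_codepoint {
    my @words = @_;
    no locale;
    my @sorted = sort { $a cmp $b } @words;
    return @sorted;
}
```

After a call such as setlocale(LC_COLLATE, "") to pick up the environment's locale, comparing the two results on a list with mixed case or accented words shows where the locale's ordering parts ways with the default one.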

Overloading Regexp and infinite recursion => SEGV (#39090)

Andreas Koenig showed how overloading "" and adding a dash of qr// into the mix can produce a core dump.

  http://xrl.us/memh 

deprecated $# treats 0 specially (#39097)

Ruud Affijn showed what strange and wonderful things can happen when you embed a newline in the $# variable. Which you shouldn't be doing anyway, as its usage is deprecated after all. Andy Dougherty traced the behaviour all the way back to perl 1.010.

  Good novelty value
  http://xrl.us/memi 

Pod::Html error stops CPAN install/test of Pod::Readme (#39098)

This bug highlighted the issue of what to do when installing a POD file as a man page, and the POD is malformed. Should the file be skipped? Should the installation be aborted?

  Halt and Catch Fire
  http://xrl.us/memj 

Perl5 Bug Summary

  14 opened + 18 closed = 1523
  http://xrl.us/memk 


New Core Modules


In Brief

Peter Dintelmann followed up on the thread last week about length specifiers in unpack's mini-language and came up with a documentation patch, first for maint, and then for blead.

  http://xrl.us/memn 

Since ${^WIN32_SLOPPY_STAT} is in blead, Rafael called for some documentation to explain what it does. Jan Dubois complied.

  http://xrl.us/memo 

By the same token, Jan also clarified hard link support on Windows.

  http://xrl.us/memp 

Mohammad Yaseen had some problems building Perl modules in non-standard locations. There was the usual chorus of advice about PREFIX and PERL5LIB but the really interesting news from the thread was that Randy W. Sims mentioned that Module::Build now (as in 0.28) supports ExtUtils::MakeMaker's PREFIX functionality.

  http://xrl.us/memq 

Things seemed to be moving along for Mohammad. After having successfully built perl on z/OS, he was having trouble dealing with the semantics of environment variables. Dominic Dunlop guided him through the minefield.

  http://xrl.us/memr 

Dominic requested more information about Perl on Windows 98. Even the original poster thought that it probably wasn't worth the effort to pursue any further.

  http://xrl.us/mems 

A few more data points on the issue of taint and fork on Win32, courtesy Perlmonks.

  http://xrl.us/memt 

Paul Johnson returned to Adam Kennedy's question about how to tell whether a file handle is seekable, as he had run into pretty much the same issue with IO::Compress::Zip. In the end he simply documented that trying to do X with Y just does not work.

  http://xrl.us/memu 

Shlomi Fish found a Heisenbug with perl -d. Dave Mitchell confirmed that it is fixed in blead. Andreas Koenig confirmed that it was fixed with a Heisenpatch.

  Now you see it. Umm...
  http://xrl.us/memv 

Ash Berlin spotted a couple of places in perlop.pod with incorrect L<...> markup.

  Try writing the above sentence in POD yourself
  http://xrl.us/memw 

Nicholas Clark discovered that GCC does not conform to the C standard, section 6.8.6.4. Marcus Holland-Moritz comforted him with the idea that the Intel compiler gets it wrong too.

  A void hate
  http://xrl.us/memx 

About this summary

This summary was written by David Landgren. My humble apologies for the tardiness, Real Life kept me busy this week.

If you want a bookmarklet approach to viewing bugs and change reports, there are a couple of bookmarklets that you might find useful on my page of Perl stuff:

  http://www.landgren.net/perl/ 

Weekly summaries are published on http://use.perl.org/ and posted on a mailing list (subscription: perl5-summary-subscribe@perl.org). The archive is at http://dev.perl.org/perl5/list-summaries/ . Corrections and comments are welcome.

If you found this summary useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl.