This week's p5p summary is going to be a bit unusual : a few very long threads will be summarized (logically) in longer paragraphs. Read about hashing algorithm vulnerabilities, new proposed syntax, CHECK and INIT blocks, and other unlittle things.
The discussion that started last week about the possibility of algorithmic attacks on perl hashes (bug #22371) found its conclusion. Jarkko Hietaniemi proposed several successive iterations of a patch.
Here's the final state : unless the -DUSE_HASH_SEED_EXPLICIT
or
-DNO_HASH_SEED
preprocessor symbols are given to the Configure script,
perl's internal hashing algorithm will use a random start value (called
the hash seed). This value can be overriden via the PERL_HASH_SEED
environment variable. That means that if you have a program that uses a
hash, with a constant set of keys, inserted in the same order, the
ordering of the output of keys()
will different between different
executions of your program.
At some point Nick Ing-Simmons suggested to introduce a new attribute to
declare ``repeatable'' hashes
my %hash : predictable;
or
my %hash : lock_order;
Jarkko made a failed attempt at implementing it, but as anyway people seemed to dislike it, or to not like it enough (finding it overkill), it wasn't kept.
The thread starts at http://xrl.us/j7m -- and it counts more than 100 messages.
Now what would be interesting is to have some benchmark results, on different platforms. Using SpamAssassin (per Nicholas Clark's suggestion) seems to be a good idea.
Jarkko finally applied his randomized hash seed patch, and two failures appeared randomly in the smoke test reports. The first one was caused by a piece of code that used to iterate wrongly over the keys of readonly hashes. The second one, in a test for Encode::Alias, was more difficult to hunt down. It was caused by an inconsistent definition of an encoding name alias, making the test effectively dependent on hash key ordering.
Jarkko proposed a new syntax : the ability to declare a hash and
initialize a slice of it at the same time
my @hash{@keys} = (values...);
Mark-Jason Dominus approves, but wonders why my($h{$k})
shouldn't be
accepted either. Rafael Garcia-Suarez proposes a one-line proof-of-concept
patch to allow the syntax my(@hash{qw/keys/})
(with parentheses --
otherwise modifying the parser is necessary). Hugo van der Sanden worries
that this may mislead users in expecting that local @hash{@keys}
works
analogously, by emptying %hash first, and thus pronounces himself against
this idea.
http://xrl.us/ks4
Marc Lehmann reports (bug filed under number 22826) that neither CHECK
nor INIT blocks are executed when require() is used to load a module.
Rafael provides the explanation (and sheds some light on the unclear
points of the documentation) : CHECK and INIT blocks are executed during
the transition between the compilation phase of the main program and its
execution. Thus, they're executed at only one point during the lifetime
of the perl interpreter, and once the main program has been compiled,
they're never executed again. (This explains the existence of the warning
Too late to run CHECK/INIT block
.)
Rafael points out that the B compiler suite actually relies on the current behaviour, and that it seems to be well adapted to other profiling or analysis tools, such as Devel::Cover. Gurusamy Sarathy recalls that Larry wished the CHECK/INIT blocks be run once per optree. Apparently that would be a job for a new kind of special block.
The state of the equivalents of CHECK, INIT, BEGIN and END blocks in Perl 6 seems to be mostly unclear to the P5P crowd, although Arthur Bergman crafted a module that provides PRE and POST to Perl 5 (Hook::Scope).
http://xrl.us/ks5 http://xrl.us/ks6
Gerrit P. Haase sent a patch for Pod::Perldoc on Cygwin, to disable the
default use of man(1)
to format output, because man is apparently broken.
Perhaps something to do with groff. Merijn applied the patch. John Peacock
found a workaround : setting the environment variable LESS=-R
(or
--RAW-CONTROL-CHARS
) instead.
http://xrl.us/ks7
Johan Vromans posted a RFC to choose a default way to handle multiple valued options in Getopt::Long. Nobody commented.
http://xrl.us/ks8
Alan Burlison had a look at the build process (on Solaris), and he notices
that perl
is linked against libraries it doesn't actually use. He's
investigating it.
http://xrl.us/ks9
Gurusamy Sarathy and Ed Avis talked about releasing Data::Dumper on CPAN.
This raises the question of the test suite, that should work over all perl
versions, which implies not relying on hash key order. But Jarkko did just
fix that test suite to enable the Sortkeys
option to Data::Dumper
(related to his hash seed patches). Brian Ingerson points out that YAML
might be a better tool than Data::Dumper for a large number of uses of
Data::Dumper.
http://xrl.us/kta
A new Perl Repository Browser has been put on line on ActiveState's side. The URL hasn't changed.
This summary was brought to you by Rafael Garcia-Suarez. Weekly summaries are available on http://use.perl.org/ and/or via a mailing list, which subscription address is perl5-summary-subscribe@perl.org . Comments, additions, multiplications, corrections, and suggestions are welcome.
anyway people seemed to dislike it, or to not like it enough
Maybe I'm biased, but I thought that the most important reason for dropping it was that it simply didn't work
Re:my %hash : predictable
rafael on 2003-06-30T22:45:03
Things that don't work might, eventually, be made to work. Michael Schwern had stronger arguments than you.Re:my %hash : predictable
nicholas on 2003-06-30T22:54:40
True. I was more thinking about the particular implemenation, and I did suggest ways to kludge it into working. And Schwern's arguments were independant of the implementation, and unanswerable.