Perl 5 turns 18 and is given some lovely birthday presents, such as Perl 6's switch, smart match and say. Constants get smaller, faster and cheaper. So with all that and more, it's probably time for 5.9.3 to escape from the lab.
sprintf
, all the time, everywhere The sprintf
patch effort proceeded smoothly this week. Andy Dougherty
phoned in to give the patching of Solaris 8 a clean bill of health. But
Nicholas Clark wanted to get the other segmentation violations nailed
down as well, and suggested a new set of patches. Which didn't work on
Windows, and so the set was refined.
Dominic Dunlop reported perls 5.8.0 through 5.8.7 on Mac OS/X was looking good, albeit with minor unrelated warnings. Steve Hay reported success on Windows for 5.8.0, .2, .3 and .7. Nicholas published the patches on CPAN.
http://xrl.us/jbwc
Gisle Aas modified the way sprintf
obtains indexes and widths from the
format string to check for overflows. As a result numbers larger than
IV_MAX+1
(already ridiculously large on 32 bit hardware) can no longer
be used directly to specify a width, but as a) such values are probably
not used in practice and b) there are still indirect means of specifying
Very Large Widths (such as %*d
), it's not much of a loss.
http://xrl.us/jbwd
Last week, Craig Berry pointed out an annoyance with Pod::Simple
: its
test suite contains directories with names such as lib/Pod/Simple/t/test^lib.
This is a hassle on VMS, because circumflex is not a legal character for a
filename, and he hoped that the directories could be renamed by changing the
circumflex to an underscore.
This week, John E. Malmberg pointed out other problems of filename
portability. Nicholas Clark reported that the directories had been renamed
to something sensible, but was now struggling with trying to get Perforce
to delete the now-empty troublesomely^named^directories
. Rafael came to
the rescue with the correct incantation.
http://xrl.us/jbwe
strict 'refs'
doesn't apply inside defined
Nicholas Clark noticed that the following snippet produced no error:
use strict; $a = "Not documented"; print defined @$a; # prints nothing, no error message $a = "Not documented"; print defined %$a; # prints nothing, no error message
... which is certainly a bug, and was worried that production code
might be relying on this misbehaviour. Since applying defined
to arrays or hashes is vaguely deprecated, Rafael Garcia-Suarez
thought that it wasn't very important, and probably some odd side-effect
of the optimiser. Robin Houston came to the heart of the problem by
showing that that defined()
doesn't autovivify stash entries when
used via a symbolic reference:
% perl -le '$n="z"; %$n; print(exists $::{$n} ? "yes" : "no")' yes
% perl -le '$n="z"; defined%$n; print(exists $::{$n} ? "yes" : "no")' no
... and also showed where in the source this was happening.
http://xrl.us/jbwf
Rafael Garcia-Suarez noticed that all the internal datatypes exhibit
similar behaviour, but when he patched the necessary functions to
fix it up an odd assortment of tests began to fail. One of the more
icky cases was DBM_Filter
, with the snippet:
C<if (!defined %{"${class}::"})>
Rafael want to at least fix up defined($$foo)
, but Robin was doubtful,
thinking that there's probably code Out There that relies on this
behaviour. And we got back to the comment in Perl_ck_defined()
that
has a warning comment about not breaking Tk
(This issue has been
hovering in the fringes of a number of threads in the past few months).
Nicholas then posted a particularly brilliant analysis of the problem. It's well worth reading if you want to get a feel for what goes on in perl's guts
http://xrl.us/jbwg
It comes down to the fact that people are not so much trying to see
whether hashes (or. more precisely, stashes -- symbol tables hashes)
are defined per se, but rather they are trying to determine what
packages have been loaded below the current package (that is, A::B
wants to know if A::B::C
or A::B::D
are loaded. In the current
state of affairs, just thinking about A::B::C
makes it spring
into life. So Rafael applied change #26370 to tweak the lexer to make
it all work again.
http://xrl.us/jbwh
sprintf
of version objects Gisle Aas filed bug #37897 regarding how sprintf
deals with version
objects, as part of the fallout from the webmin
/syslog
debacle.
John Peacock was at a bit of a loss as to where to start looking, and
said that sv_vcatpvfn()
was an example of overloaded operation gone
wrong. After mulling over the problem for a while, John came up with
a simpler approach that Gisle liked. So John went ahead an implemented
it. Gisle came back with a couple of suggestions to tighten it up.
H.Merijn Brand came up with a good test case:
# all on one line ($_=sprintf"%vx\n",$^V)=~s^\d\d.3\d.2^Just \ Anoth^;s$.3..2$r P$;s;.3.;rl Hacker,;;print
At the end of the thread, John reminded us about far more than we ever wanted to know about version objects.
http://xrl.us/jbwi
Pod::Man
creates empty man pages Randal L. Schwartz wrote in to say that podlators-2.0
broke an interface
assumption (a parser object could not be reused) which caused a number of
things (heavy lifters like pod2man
, ExtUtils::MakeMaker:MM
and
Module::Build
) to fail. Russ Allbery said that pod2man
was fixed in
the podlators
distribution, but the main reason is Pod::Simple
objects
are single use once only. The new maintainer is interested in resolving the
issue, but the solution is not simple.
Ken Williams was ready to patch Module::Build
, but pointed out that if
someone was using M::B
or EU::MM
and after upgrading Pod::Man
noticed that their man pages stopped working, then their first reflex
would be to see about seeing what was wrong with Pod::Man
. It might
not be obvious that the correct fix would be to upgrade M::B
or
EU::MM
. So at some point in time Pod::Man
does need to do The
Right Thing.
http://xrl.us/jbwj
Every once in a while, someone pops into p5p with a very (on the surface) simple question that turns out to be fiendishly complex. Mark Glines won this week's award for the best question.
It turns out that Mark has written a Perl XS wrapper for the Fuse project.
http://fuse.sf.net/
Fuse, as I understand it from Perl's perspective, is that you just
use Fuse
and you create a filesystem whose entries can be mapped
to Perl routines. The Perl interface, until now, has been a
single-threaded implementation (mainly because it's been around
since 5.6.1). Mark recently looked at 5.8's thread support and
decided it looked much better, and set about investigating how to
make a multiple-threaded version of Fuse. He had a number of
questions.
Nick Ing-Simmons dived in with a number of detailed responses to Mark's questions, explaining how perl's threads behave. In fact, with a bit of help from Nick and Yitzchak Scott-Thoennes, Mark was well on his way to getting it all work quite well.
http://xrl.us/jbwk
Pump(?:king|queen)
chair is up for grabs Two and a half years ago, Fotango began sponsoring the Perl 6
development effort, in the shape of Ponie, or an implementation of
Perl 5 on Parrot. Nicholas worked on the project during his stay
at Fotango, but he's now moved on, and Ponie is looking for a new
manager. If you're interested, drop Nick and/or Jesse a line at
ponie-pumpking@perl.org
.
http://xrl.us/jbwm
Nicholas wondered whether it would be worth reducing the use of macros
by changing things like Nullsv
to (SV*)0
, the reason being that
it's three things that people have to remember, and three things that
newcomers have to learn.
Andy Lester thought it might be better to replace the Null[a-z]+
variants with NULL
. Jan Dubois remembered a case where it does
matter: when using vararg
-style functions, H.Merijn pointed out
that the explicit casts gives the compiler something bare NULL
s
don't: hints about alignment. Specifically, on some architectures
(ok, HP) the following is safe:
long l; *((char *)&l) = 'a';
but
char a[160]; *((long *)a) = 100L;
may result in phase-of-the-moon errors. Nicholas applied a couple of experimental patches to see what it would look like.
http://xrl.us/jbwn
PeekMessage()
call in win32\win32.c win32_async_check fixed Following on from last week's thread about alarm()
on Win32,
Jan checked in some code to address the problems. Rafael had a bit
of trouble with the associated test suite, and it was all sorted out
in the end
http://xrl.us/jbwo
Steve Hay observed that somewhere between changes 26150 and 26185, changes to sv.c caused the above test file to fail. After doing a binary search among those change, he pinned the problem on change 26165, but couldn't see what in the patch was causing the problem.
Nicholas and Steve went over the parts of 26165 and began to apply them individually, looking for mistakes. It turned out to be a trailing comma at the end of a very heavily-casted subtraction.
After a lot of patient untangling of the initial patch, Nicholas got it all sorted out. A fascinating case of debugging by mail. All fixed up in 26378.
http://xrl.us/jbwp
Brendan O'Dea told of how Debian now builds a threaded perl by default, and one of the biggest complaints is the slowdown that that induces, and worse, segfaults. A couple of ideas where discussed, but no firm conclusions were made.
http://xrl.us/jbwq
Robin Houston wrapped up his work on the new smart match operator,
the ~~
operator from Perl 6. The patch was over 100 kilobytes.
Nicholas carefully applied two parts of it, that corrected a
couple of spelling errors in comments. which tickled Robin no end.
After having looked more carefully at the patch, Nicholas
decomposed into six discrete components. Rafael
was a bit dubious about the use feature
approach to implementing
smart match (and switch, and say)... all of which are provided
by this patch. He thought it a pity that a pragma was required
to get at recent coolness added to the language, but backwards
compatibility being what it was, that it was the best way forward.
Yves Orton ironed a few kinks out the patch to get it running cleanly under threads and Windows.
http://xrl.us/jbwr
Robin later followed up with another version of the patch incorporating Yves's changes and a tweaked version of smart match semantics (subject to more possible semantic changes when Larry and/or Damian touch base). Applied by Rafael as change 26400 and a tweak in 26416.
http://xrl.us/jbws
constant
s Nicholas Clark noted that, despite optimisations and inlining, constants take up a lot of memory:
$ perl -MDevel::Size=size -le 'sub c () {3}; print size(\&c)' 434
... and wondered whether there was a viable way of reducing their
size. One approach would be to store the scalar value directly,
but that would require some collusion between the optimiser and
the pp_entersub
opcode. Over seven years ago, Ilya Zakharevich
produced a patch to bring the cost of a subroutine declaration
from 500 bytes down to 100 (change 965), one possibly non-documented
side effect of which is:
$ perl -lwe 'sub foo {}; sub bar; sub baz ($); \ print "($_)" foreach $::{foo}, $::{bar}, $::{baz}' (*main::foo) (-1) ($)
Nicholas wondered whether it would be worth replacing this mechanism
by another whereby something else than a typeglob could appear in
symbol tables and could be used for storing constants. For modules
sych as Socket
and Fcntl
that implement constants internally,
it would become possible to generate true inlineable constants at
require time, rather than at run time.
The main stumbling block is that as a constant is secretly prototyped
as ()
behind the scenes, the trick is figure out how to distinguish
between a true prototyped routine and a constant value.
After some more legwork, Nicholas discovered that pp_entersub
can be fed just about any old thing, which would therefore require
a non-trivial amount of cooperation from higher-level ops. And the
worst of it is that as a general rule, most constant subs are never
actually called in any particular program.
The new scheme that Nicholas came up with is quite clever. Now, when a reference to a constant value is stored in the symbol table, like
$foo::{bar} = \3;
It stays that way (occupying only as much space as a reference) until
it gets called upon. At that point, it springs into sub foo::bar () {3}
and can be inlined as needed. What is more, any constant-building
mechanism that looks like this get the optimisation for free. But
wait! there's more! The first time Nicholas took the new code for
a test drive, he picked up an error (a missing semi-colon in
Tie::File
).
Benchmarking the new technique revealed that 1% less memory is
consumed when Fcntl is require
d and a massive 12% less when
use
d. And all its constants are now inlined. The POSIX
shows
even better improvements (which will put to rest the meme about
writing use POSIX ()
to avoid importing the universe). Bonus!
Even better still, the memory saved is that much less memory needed to be cloned when threads are created. A truly momentous change.
http://xrl.us/jbwt
Later on, Nicholas figured that it should be possible to teach
Devel::Peek
how to dump out the value of constant subroutines,
and committed a patch (26404) to do just that. Dave Mitchell
liked the output.
http://xrl.us/jbwu
//g
loops infinitely on tainted data Steve Peters revived an old bug (#8262) where Kirrelly "Skud" Roberts noticed that the follow snippet went into an infinite loop when tainting is active
while($ENV{LANG} =~ m/(.)/g ) { print "saw '$1'\n"; }
Sadahiro Tomoyuki noticed that it occurs with certain, but not
all, tainted values. For instance $ARGV[0]
loops, but its
copy $copy = shift
does not. Mike Guy reasoned that bad magic
was causing the trouble, no doubt some interaction between
magic and taint causing pos()
to be reset.
This last clue was sufficient for the light-bulb to go off over Dave Mitchell's head, and create a patch that fixes the problem.
http://xrl.us/jbwv
Dave Mitchell noted that Perl is now old enough to drink hard liquor,
vote and join the army. Randal Schwartz remarked that that's a long time to
be printing "Hello, world". Joshua Juran recommended that Randal purchase
a book called, um, "Learning Perl", currently in its third edition.
brian d foy
suggested that Joshua buy a new copy, since it's now
in its fourth edition.
http://xrl.us/jbww
Joshua ben Jore sent in a patch against blead to teach B::Lint
to accept plugins. The particular scratch he wanted to itch was the
ability to check for unchecked system calls and non-constant sprintf
formats. Rafael wondered whether Joshua planned to backport the
current B::Lint
checks as plugins in the new regime. Joshua
wasn't convinced of the value in doing so, to which Hugo van der
Sanden pointed out that it would provide a good set of examples to
show how future plugin authors could write their own. But as the
plugin mechanism has a certain overhead, the cost would be too great
to convert everything over.
http://xrl.us/jbwx
sort
pragma lexically scoped Robin Houston thought that the use sort
pragma (as in
use sort '_quicksort'
, use sort _'mergesort'
) should be
lexically scoped. And having a couple of moments to spare after
having written switch
and other goodies, whipped up a patch
to do just that. As a result he had to rewrite part of the test
suite which was relying on the previous behaviour, but that no-one
minded that.
Nicholas did raise a concern about pragmata namespaces, which Robin addressed in a followup patch. All applied by Rafael.
http://xrl.us/jbwy
Robin also supplied a patch to test for %^H
(the hints hash)
being propagated into eval
.
http://xrl.us/jbwz
perlivp
too fussy? Gisle Aas doesn't like the way perlivp
(Perl Installation
Verification Procedure) moans about incorrectly installed *.ph
files, which isn't really a problem under ActiveState Perl,
and supplied a patch to silence these warnings.
Rafael noted that perlivp
doesn't play nicely in this day and
age of package-managed operating systems, and that Mandriva doesn't
bother distributing it. Gisle thought that another solution would
be to get rid of it altogether. Michael Cummings (who does Perl
on Gentoo) voted for either removing it, or at least updating it
so that it at least prints out a description of each test as it
runs.
http://xrl.us/jbw2
err
be a feature? Up until now, err
has been implemented as a weak keyword, which
in a nutshell means that it operates as a keyword only if a subroutine
of the same name has not been declared previously. Now that Robin
has introduced the feature
pragma, he wondered whether it would
not make more sense to recast err
as a feature.
In case you haven't been following Perl 6 too closely, err
is to
//
(defined or) as and
is to &&
. Rafael quite rightly pointed
out that since //
isn't a feature, (because it doesn't clash with
anything apart from an obscure regexp construct) it seems odd to
require err
to be featured. The crux of the matter is the trade-off
between backwards compatibility and using new, ah, features.
Also proposed was the idea of use feature ':5.9.3'
, use feature ':5.10'
and use feature ':all'
as ways of getting everything new that the
interpreter has to offer, with possibly various degrees of pain inflicted
on getting your code to compile.
Sébastien Asperghis-Tramoni fired up his own private gonzui search
tool, and discovered that 69 distributions on CPAN currently define
sub err
.
Hot on the heels was the need for
a command line switch that would have the same effect, in order to
err
, say
or smart match in a one-liner. Since the best candidate
(-f
) has recently been used for another purpose, Robin posted the
list of unused command-line switches
b E g G H j J k K L N o O q Q r R y Y z Z
... and wondered which one would be the best for this purpose. Abigail
suggested -6
only not really. More interesting was Abigail's idea
to just enable the whole lot when -e
is used, since in a one-liner it
would be highly doubtful that err
ever needs to be diambiguated, or
else -e
uses all features, and -E
uses none.
Then it started to get really silly, with Tels suggesting to use æ (ae ligature).
http://xrl.us/jbw3
In a followup post, Robin produced a patch that does the reverse of
Abigail's sensible idea: that is, -E
is like -e
with the
addition of all features.
After the discussion featuring err
, Rafael thought that with the
addition of all these new shiny, erm, features, it must be time to
roll out a new version of bleadperl to give it some wider exposure.
Some things that need to be done are: get feedback from Larry and
Damian on smart match, better (read: more) tests for modules like
DBI
and Tk
and implementing feature 'err'
.
Robin promised to wrap up the documentation for the work he's done so far, on the understanding that semantics may change from here to 5.10, and Rafael said that it might be nice to try for 1/1/2006 as a release date. Robin checked in some documentation for switch and smart match a few days later.
http://xrl.us/jbw4
A long discussion followed revolving around err
, and whether it
could be possible to promote it to a full keyword. Rafael disliked
the idea, because it would introduce source-level incompatibilities,
Gisle proposed cutting the Gordian knot by just simply removing
the err
keyword altogether. H.Merijn said that this would
remove a certain amount of orthogonality from ||
/or
and
&&
/and
. Yves thought that err
was a pretty dumb name, and
put forward dor
as an alternative. Randy W. Sims concurred,
and thought that default
would be better than err
. But as
it comes from Perl 6, it appears we don't have that luxury.
Yves's objection is that err
sounds too much like error, as in
Danger! Danger!, when it really means "I half expect this to not
work, but that's ok. Another valid argument was that the err
object in Visual Basic is the way you throw errors, so VB refugees
are going to have a hard time unlearning it in Perl.
To summarise: just about everyone thinks that err
sucks, but
no-one can think of anything better.
The summariser thinks that since the purpose of the operator is to let you try out something, and if that doesn't work, get something else, then it should be called the Jagger-Richards operator, because you can't always get what you want, but you might find you get what you need.
Rafael wondered whether, with Robin's implementation of switch in
the interpreter, whether it would be possible to fast-track the
deprecation of Switch.pm
. Robin pointed out that the semantics
of his switch are not quite the same as Damians source filter
switch. In the end, though, since 21 CPAN modules (again thanks
to a search from Sébastien's) already use Switch, it'll stay.
http://xrl.us/jbw5
err
After having done six Impossible Things before breakfast, Robin finished up with a patch to implement feature bundles, as discussed above.
http://xrl.us/jbw6
I32
be replaced with IV
or STRLEN
? After having started to look at the array index issue, Jan Dubois noted that there are many other cases where signed/unsigned types are used inconsistently.
Worse, in the case of 64-bit builds, some temporary copies of things get shoved into hard-coded 32-bit variables, which admits the possibility of nasty wrap-around errors.
Rafael wondered what impact the proposed change would have on the recompilation of XS modules, and other checks that might cause slowdowns.
http://xrl.us/jbw7
Ken Williams uploaded a beta PathTools
(3.14_02). This incorporates
the "stop at first directory in PATH" fix that was giving Nick
Ing-Simmons grief a few weeks ago. If all goes well, a 3.15 will be out
shortly.
Steve Peters closed out at least two bugs that I was paying attention to, 7567 and 23907. And 23357 and 27319 as well.
Robert Spier noted that in the past seven weeks, all bugs have been replied to. 1511 open tickets as of December 19.
http://xrl.us/jbw8
The list http://rt.perl.org/rt3/NoAuth/perl5/Overview.html
Sorting within a sort does not set $a
, $b
was filed as bug #36430
back in June 2005. Robin (Houston, I suppose, the RT interface isn't clear
on the matter) poked at it in late October. Steve Peters applied a change
to fix it, only to realise that it had already been fixed by change #25953.
http://xrl.us/jbw9
Jarkko Hietaniemi followed up on his valgrind
findings from last
month and October with a short snippet that demonstrates ampersandy
leakage:
s/a//e; print eval '$&';
http://xrl.us/jbxa
A problem using local
with threads::shared
(bug 37671) inched
forward a bit closer to resolution when Steve Peters realised that a
more restrictive approach was needed when dealing with magic and
Yitzchak agreed with the idea.
http://xrl.us/jbxb
Jim Cromie refined his work on arenaroot
s.
http://xrl.us/jbxc
Thanks to last week's summary, Andy Lester realised that he had let a message from Jim slip, and answered Jim's questions.
http://xrl.us/jbxd
Vadim Konovalov noted that s{}{goto uncool;uncool:;}e
produces
the error Can't find label uncool at -e line 1
, but that
s{}{do{goto uncool if 1;uncool:;}}e
(wrapping the inner
statement in a do
block) works. After a moment of astonished
silence, Chip Salzenberg said that this was perhaps the most
evil thing he'd ever seen.
http://xrl.us/jbxe
Rafael discovered that the INSTALLSCRIPT
doesn't follow
changes to INSTALLDIR and that it's probably a bug. No-one
complained about possible backwards compatibility breakage,
but for the time being things are staying as they are.
http://xrl.us/jbxf
Andy Lester announced the Sys::Syslog
and sprintf
fixes
on the Perl foundation blog:
http://xrl.us/jbxg http://news.perlfoundation.org/2005/12/updated_perl_modules_alleviate.html
Jerry D. Hedden posted bug #37946 concerning refaddr
s of threads::shared
variables. Rather than staying constant, it flip-flops between two addresses.
Nicholas Clark explained why it was so. Dave Mitchell scuffed his toe, looking
for tuits.
http://xrl.us/jbxh
See also his other bug, #37919
http://xrl.us/jbxi
Yves Orton supplied a patch to fix the script embedded in patchlevel.h in order to make it work on Windows.
http://xrl.us/jbxj
Andrew Savige started tracking down perl memory leaks with valgrind
and noticed a leak with constant subs with prototypes and asked
for some tips on how to proceed. Andrew, meet Jarkko. Jarkko, meet Andrew.
http://xrl.us/jbxk
Nicholas looked at how Perl_yylex
tokenises version strings
and couldn't see how a certain piece of code should be reached.
Robin Houston supplied a one-liner to show how it could, and Nicholas
thanked him for the insight, noting that there now remained only
3000-odd lines of similarly incomprehensible code.
http://xrl.us/jbxm
This summary was written by David Landgren. Sorry about the delay, Santa ate all my tuits. No doubt Adriano will do a much better job next week. This summary took me far too long to write, and in the interest of matrimonial harmony, hasn't had the care it deserves applied, so the quotient of typos, wordos and other oddities will probably be higher than usual.
Information concerning bugs referenced in this summary (as #nnnnn)
may be viewed at http://rt.perl.org/rt3/Ticket/Display.html?id=nnnnn
Information concerning patches to maint or blead referenced in
this summary (as #nnnnn) may be viewed at
http://public.activestate.com/cgi-bin/perlbrowse?patch=nnnnn
If you want a bookmarklet approach to viewing bugs and change reports, there are a couple of bookmarklets that you might find useful on my page of Perl stuff:
http://www.landgren.net/perl/
Weekly summaries are published on http://use.perl.org/ and posted on a mailing list, (subscription: perl5-summary-subscribe@perl.org ). The archive is at http://dev.perl.org/perl5/list-summaries/ . Corrections and comments are welcome.
If you found this summary useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl.