"Ah, more details about filenames. Well, this sounds positively weird. Octet strings are not particularly user-friendly if you can't interpret them as characters reliably.
From what you say, and what I think I've heard elsewhere, Unix filename interpretation is a mess. Seems like the only bigger mess I've heard about is VMS file handling, where they seem to have a choice of several messes." -- Glenn Linderman, deep in the heart of Unicode, case conversion, filenames, encodings, character sets, ß and other exciting issues.
Tom Christiansen commented on Gisle Aas's perldoc shortcut (that perldoc ipc would redirect to perlipc, assuming no ipc.{pod,pm} existed), saying that in pre-5.8 times he had been working on a technique to make perlipc itself, run from the command-line, do the same thing. Somewhere along the line, things went astray and the work never made it into the core.
not bitter, not really http://xrl.us/bk9mc
File::Path::mkpath() incompatibility in perl-5.10
I had expected to make some progress on this issue this week, but Real Life is eating my tuits like popcorn at the moment.
next week, cross my heart http://xrl.us/bk9me
I might preface this thread "on the almost impossibility to write a correct summary of a complex subject". Marc Lehmann had written a few weeks ago that a bare char * through an XS API is fraught with peril, because there is no metadata available to tell you if it's Latin-1, KOI8-R, UTF-8 or something else.
The thread blossomed this week, with a long-running debate about what is broken (and when, and how). One point that was made is that Win32 encodes filenames in a particular way that doesn't really jibe with the rest of the internals. Unfortunately, it is only with hindsight that the problem really became apparent, hence the dilemma is that fixing it would break everything that has tried, with various degrees of success, to work around it.
The utf8 flag on SVs was again singled out as being responsible for world hunger and other assorted ills, with a number of examples demonstrating the problems.
Rafael Garcia-Suarez outlined an approach that just may be a way forward out of the mess. After listening to Juerd Waalboer, he thought that marking an SV as "binary" and thereby disqualified from being upgraded to Unicode would be quite useful.
Glenn Linderman invented "blorf" as an opaque token for discussing the issues without people getting sidetracked over definitions of bytes, strings, characters, numbers and codepoints.
hard core http://xrl.us/bk9mg
David Nicol's tiny patch to document the empty pattern (m//) more clearly sparked a fairly intense technical debate over how to get rid of the latter.
One point of particular interest was when Aristotle Pagaltzis suggested an s///R modifier which would return a modified copy of the original string, instead of modifying the contents and returning the number of substitutions made.
As it turns out, this would solve a number of problems very nicely, not the least being the elegantly succinct
my @changed = map { s/$this/$that/R } @list;
so let's have it already http://xrl.us/bk9mi
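Without such a modifier, the same result needs an explicit copy, because s/// edits its target in place and returns a count. A minimal sketch of that idiom follows; the values of $this, $that and @list are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Illustrative values only; not taken from the thread.
my $this = 'colour';
my $that = 'color';
my @list = ('colour chart', 'no match here', 'colourful');

# s/// modifies its target in place and returns the number of
# substitutions, so each element is copied before being changed.
my @changed = map { (my $copy = $_) =~ s/$this/$that/; $copy } @list;

print "$_\n" for @changed;
```

The original @list is left untouched, which is the behaviour the proposed modifier would provide without the temporary variable.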
Not content with compiling perl with old gcc compilers, Bram took a very new one for a spin to see how things worked out.
It did of course go boom (otherwise you probably wouldn't be reading about it). Bram traced the problems down to typedefs and enums in system headers, and wondered how in Configure this could be sorted out.
duty now for the future http://xrl.us/bk9mk
Getopt::Long, + options, installperl and +v
Nicholas Clark was looking at how to factor out the common code in installman and installperl and noticed that the main sticking point regarding installperl was that it admitted a +v switch (which does something different from -v), using hand-rolled @ARGV processing.
This precludes it from using Getopt::Long because, while Getopt::Long can be taught to accept -x and +x, it offers no way of discriminating between the two.
Johan Vromans said that as it turns out, with a bit of hand-holding, it is possible to coax the information out as things stand, and he plans to improve support for - and + switches in a future release.
Nicholas thought that a middle path might be to keep the hand-rolled code, but adjust it to dump its results into an %opts hash, which would allow a drop-in replacement when Getopt::Long gets updated with the needed functionality.
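As a rough sketch of that middle path (not the actual installperl code), a hand-rolled loop can record each switch in a hash while keeping the leading sign in the key, so that -v and +v remain distinct. The helper name parse_switches, the key format and the sample arguments are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: peel leading -x / +x switches off an argument
# list and record them in a hash. The sign is kept as part of the key
# so that -v and +v can be told apart later.
sub parse_switches {
    my @args = @_;
    my %opts;
    while (@args && $args[0] =~ /^([-+])(\w+)$/) {
        my ($sign, $name) = ($1, $2);
        shift @args;
        $opts{"$sign$name"} = 1;
    }
    return (\%opts, \@args);
}

# Sample arguments, made up for this example.
my ($opts, $rest) = parse_switches(qw(+v -n destdir));
print "switches: ", join(' ', sort keys %$opts), "\n";
print "remaining: @$rest\n";
```

Code that only consults the hash would not need to change if the parsing were later handed over to a module.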
This brought forth a long discourse from Tom Christiansen, who admitted to the wrong kind of laziness regarding command-line switches by resorting to hand-rolling code to deal with a solitary switch when in fact it would have been better to rely on a module. When he quizzed Larry Wall about it during the first decade of Perl's development, Larry admitted to rolling his own frequently, since it seemed a bit of a waste in his eyes to pull in a module for just one or two switches for a program little more than a one-liner. As a peace offering for his own hand-rolling sins, Tom offered the list the ultimate file renaming Perl program.
bespoke options http://xrl.us/bk9mn
Marc Lehmann wrote a long response to Jan Dubois as a spin-off from the "On the impossibility of writing XS correctly" thread, stating that Perl's Unicode handling is broken because some parts of the core deal with Unicode one way, and other parts another way. This leads to annoying bugs, in that they are hard to identify, and hard to fix.
Tom Christiansen called him out for excessive use of rhetoric and asked him to clarify a couple of points. Several messages later Yves Orton offered a nice summary of the situation that showed where things break down. Then people started to speak about encodings, bytes, characters and character sets and as usual my eyes began to acquire that dead fish look.
see also http://xrl.us/bk9mp
On the subject of subjects on the problem of things, Yves Orton broke out into a new thread to discuss the schizophrenic attitude that Perl has when dealing with strings. He put forward a proposal for identifying and processing Unicode strings and asked people to point out where he was wrong. Rafael Garcia-Suarez made a decent effort at doing just that.
Juerd Waalboer provided a contrarian argument, suggesting that Unicode works pretty well in Perl, insofar as one can have strings containing Unicode, and other strings containing binary data, because in a correct program, one usually doesn't have the two appearing in the same string (such as having the Thai-encoded name of a Thai person concatenated with the slurped contents of a PNG file representing his signature in the same Perl scalar). In Juerd's eyes, the main problems come about when dealing with pure binary data and hoping that it doesn't wind up being treated as Unicode when it shouldn't.
more recommended reading http://xrl.us/bk9mr
As a followup to the above discussion, Juerd announced that he had released BLOB to CPAN.
http://xrl.us/bk9mt
English.pm alias for %+
Amir Elisha Aharoni ventured for the first time into the waters of p5p, suggesting that %NAMED_CAPTURE would be a nice English name for the new 5.10 %+ variable. Yves Orton thought the idea was worthy of consideration, but one also needed to deal with %- at the same time, which could be named %NAMED_CAPTURE_LIST.
updating the babelfish http://xrl.us/bk9mv
_strptime('2001-2-29 12:34:56','%Y-%m-%d %H:%M:%S')
2001 was not a leap year, so February 29, 2001 does not exist and trying to parse it is an error. Apparently there is a test in Time::Piece to ensure it fails in the correct manner. Unfortunately, on some of the more exotic platforms like VMS and OS/X, the call also correctly fails, but does so in a way that fools the test suite.
at the third stroke it will be the 32nd of february http://xrl.us/bk9mx
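For readers wondering why that date has to be rejected, here is a small sketch of a validity check. It does not use Time::Piece or its _strptime at all; it assumes the core Time::Local module, whose timegm croaks on an out-of-range day, and the helper name date_is_valid is hypothetical.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper. timegm() takes (sec, min, hour, mday, month, year)
# with the month counted from 0, and croaks if the day does not exist.
sub date_is_valid {
    my ($year, $month, $day) = @_;
    return eval { timegm(0, 0, 0, $day, $month - 1, $year); 1 } ? 1 : 0;
}

for my $date ([2001, 2, 29], [2004, 2, 29]) {
    my $verdict = date_is_valid(@$date) ? 'valid' : 'invalid';
    printf "%04d-%02d-%02d is %s\n", @$date, $verdict;
}
```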
Gisle Aas gave some additional background regarding Time-Piece-1.13 test failures on HP-UX, by forwarding a message he sent to Matt Sergeant, the author of Time::Piece.
http://xrl.us/bk9mz
H.Merijn Brand delved into HP-UX smoke reports to figure out what was going wrong. Time::Piece was already under control (see above), but Math::Trig was failing (and the only recent change has been an upgrade to Math::Complex). Tests for readdir were also turning black, which suggested subtler problems.
Halfway through the conversation, Craig Berry announced the integration of Gisle Aas's fix for Time::Piece which addressed the VMS problems, and H.Merijn reported that it did the trick for HP-UX as well. Using the power of CPAN, H.Merijn was able to go through previous Math::Complex versions, and this allowed him to resolve that problem.
I think the readdir problem was solved by upgrading the smoke harness.
The remaining failure appeared to be caused by use blib hoisting an errant directory into @INC. Bram showed him how to fix that, which should nail down the last error.
going for O O O O http://xrl.us/bk9m3
But then H.Merijn reported a problem with a failing blib test, and everyone pretended to pay attention to the character encoding debates.
war knocked http://xrl.us/bk9m5
Use Devel::Cover to ascertain the core modules' test coverage, then add tests that are currently missing.
Just to help budding testers along, here is a non-exhaustive list of suggestions to get you going (suggested by sorting out the biggest .pm files in lib/):
AutoLoader
AutoSplit
Benchmark
Cwd
DB
Dumpvalue
Exporter
Memoize
NEXT
SelfLoader
charnames
diagnostics
overload
warnings
Even concentrating on a single module would be helpful.
ExtUtils::ParseXS - Error reporting problem with INTERFACE and ALIAS keywords
About a year ago, Ken Williams explained that, while he was the maintainer of this module, he didn't know the best way to address the problem that Robert May had brought up regarding error reporting.
then http://xrl.us/bk9m7
Of the two approaches supplied by Robert as a solution, Ken liked the second one back then, and Nicholas Clark, reviving the conversation, agreed that it seemed to make more sense.
He had a look at how things work currently, and realised that with a new function, he could effect a small saving of space. As a result, both the core and EU::PXS could rely on the function.
Nicholas wrote the function, and felt that it would make it into 5.8.9 and 5.10.1. For older releases, ExtUtils::ParseXS would need to bundle the function, and emit it as required if the core didn't supply it.
Robert thought that this sounded reasonable, except that if ever a bug is found in the function that Nicholas just wrote, it would need to be fixed both in the core and EU::PXS. Since this would be less than desirable, Robert said that he would try to come up with an alternate patch at some point.
now http://xrl.us/bk9m9
lib.pm should not warn about loading .par files
Paul Fenwick noted that a use lib 'Foo.par' will issue a warning, but load the damned thing anyway. Since someone pulling in a library in this way probably has a pretty good idea what they're doing anyway, Paul thought it would be a good idea to suppress the warning, just for .par files.
Rafael Garcia-Suarez felt that this made sense, so he applied the patch. Steffen Müller wanted to know if this meant that lib.pm would be dual-lifed, so that 5.8.8 could benefit from the improvements.
dual-life pragma on par http://xrl.us/bk9nb
Jerry D. Hedden noticed that some preprocessor defines in sv.c were not flush left, and thought that some compilers would choke on them. H.Merijn Brand explained that it was perfectly legal according to ANSI, although he admitted that some older compilers, such as on AIX, would likely get into trouble over this.
Both Robin May and Andy Dougherty explained that something that does work is to leave the # in the first column, and then indent the macro preprocessor directive as appropriate.
hash hard left http://xrl.us/bk9nd
*x{IO} bizarre copying (#3314)
Steve Peters discovered that some bizarre code that used to emit a bizarre error message now emits a more prosaic error message. He noticed that the change occurred way back in change #27179 and asked if anyone had objections to backporting it to 5.8.
a leap into the unknown http://xrl.us/bk9nf
exists(): error message on wrong argument type is incorrect (#38955)
A couple of years ago, Jeremy Hetzler noted that exists may be applied to a HASH, an ARRAY and also a subroutine name. The documentation even admits as much.
On the other hand, for incorrect use, such as applying it to a scalar, the error message makes mention of only HASH and ARRAY, not of subroutines.
Bram patched the source to bring the error message into line with the documentation and implementation, and Rafael Garcia-Suarez applied it.
language lawyers rejoice http://xrl.us/bk9nh
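A short sketch of the three documented uses, plus the compile-time failure on a plain scalar. The variable and subroutine names are illustrative, and the string eval is only there so that the error can be captured rather than aborting the script; the exact wording of the message depends on the perl version.

```perl
use strict;
use warnings;

my %hash  = (key => 1);
my @array = (10, 20);
sub helper { return 1 }

my $on_hash  = (exists $hash{key}) ? 1 : 0;   # hash element
my $on_array = (exists $array[1])  ? 1 : 0;   # array element
my $on_sub   = (exists &helper)    ? 1 : 0;   # subroutine name

# Applying exists to a plain scalar is rejected when the code is compiled,
# so the attempt is wrapped in a string eval to capture the message.
my $ok  = eval 'my $scalar = 1; exists $scalar; 1';
my $err = $ok ? '' : $@;

print "hash: $on_hash, array: $on_array, sub: $on_sub\n";
print "scalar: $err";
```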
Rafael Garcia-Suarez supplied a fix for the print Does::Not::Exist, '' problem, so that the bareword is correctly identified as such, and not stringified. Despite all the magic surrounding print's first argument, all that Rafael needed to do was to hoist a goto label four lines higher in the source.
H.Merijn Brand applied the correction, along with Bram's tests.
http://xrl.us/bk9nj
pod2man loses =head2 starting ' or . (#53910)
Bram correctly identified Pod::Man as a dual-life module. This means that the best place to fix this particular problem is in the CPAN distribution, which can then be synched with blead when the problem is fixed.
SEP http://xrl.us/bk9nm
IO::Seekable + POSIX = constant subroutines redefined (#54186)
Part of the fallout from Nicholas Clark's corrections for this bug is that calls with the wrong number of arguments cause the program to croak. Rafael Garcia-Suarez felt it was safe enough to inflict on the world. As a point of confirmation, Sébastien Aperghis-Tramoni ran a code search and didn't find any examples of such usage.
safe to break http://xrl.us/bk9no
perlipc problems
Andrew at Sundale noted a problem in the documentation in perlipc concerning the signalling of negative process IDs. Steve Peters tweaked the example to show more clearly what was happening.
perlipc and negative pids (#54412) http://xrl.us/bk9nq
Andrew found another problem with setsid, in that the documentation suggests a setsid or die idiom, except that, if one reads the manpage for setsid, one learns that it returns -1 on error (as do many other system calls). As such, if the setsid call fails, the die won't be triggered.
perlipc and negative truth (#54422) http://xrl.us/bk9ns
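The trap is that -1 is a true value in Perl. A minimal sketch of the difference follows, assuming (as the report states) that failure is signalled by -1. The sample return value and the helper name start_new_session are hypothetical, and the helper is defined but deliberately not called, so that running the sketch does not detach it from its session.

```perl
use strict;
use warnings;
use POSIX qw(setsid);

# Stand-in for a failed call, assuming failure is reported as -1.
my $ret = -1;

# "or die" only fires on a false value, and -1 is true, so this survives.
my $survived = eval { $ret or die "never reached\n"; 1 } ? 1 : 0;
print "'or die' triggered: ", ($survived ? "no" : "yes"), "\n";

# Hypothetical helper: an explicit comparison catches the failure instead.
sub start_new_session {
    my $sid = setsid();
    die "Can't start a new session: $!\n" if !defined $sid || $sid == -1;
    return $sid;
}
```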
While we're on the subject, Andrew found one final problem concerning the documentation for safe pipe opens.
perlipc unclear on the concept (#54424) http://xrl.us/bk9nu
select() in Activestate perl (#54544)
Marc Lehmann noted that select returns "Unknown Error (10022)" instead of simply timing out.
just no it http://xrl.us/bk9nw
@ISA (#54566)
Niko Tyni discovered a way of abusing @ISA that would result in an assertion failure. Rafael Garcia-Suarez figured out what was going wrong in mg.c and provided a patch that H.Merijn Brand applied.
out through the smtp tunnel http://xrl.us/bk9ny
Lourdes Peña Castillo reported that on some versions of perl, but not others, the number 2.5e-310 gets rounded down to 0, and the log of 0 is negative infinity.
Various porters reported similar behaviour on a variety of perls, platforms and Configure options, but no clear reasons why.
now you see it, now you don't http://xrl.us/bk9n2
PerlIO::via free unrefed scalar on certain dodgy code (#54686)
Kevin Ryde wrote some slightly broken code that managed to make the perl interpreter complain about memory problems. He wasn't especially worried about a fix any time soon, but wondered if it was a symptom of an underlying problem that needed to be addressed.
need to know http://xrl.us/bk9n4
Ed Avis filed a feature enhancement request, to allow the /n flag on a regular expression to indicate that no interpolation should be performed.
Currently, only m'300 $US' (with single quotes as a pattern delimiter) does no interpolation. Ed thought that /300 $US/n might be clearer.
we'll get the whole alphabet in some day http://xrl.us/bk9n6
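A small sketch of the existing behaviour with single-quote delimiters. It uses an array rather than the $US of Ed's example, because a bare $ left in the pattern would still be read as an anchor by the regex engine even when Perl itself does not interpolate it; the array name and the sample text are illustrative.

```perl
use strict;
use warnings;

my @home = ('example.org');     # illustrative array
my $text = 'send 300 @home';    # single quotes: a literal "@home"

# With ordinary delimiters @home is interpolated into the pattern.
my $plain = ($text =~ /300 @home/) ? 1 : 0;

# With single-quote delimiters the pattern is used as written.
my $quoted = ($text =~ m'300 @home') ? 1 : 0;

print "interpolating match: $plain\n";
print "non-interpolating match: $quoted\n";
```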
PathTools-3.27 triggers a bug in Perl (#54728)
Jan Dubois isolated a problem in File::Spec::Win32's catfile function. The fix from the client side is to stringify a $1 passed as a parameter (a variation on the "better to be paranoid than sorry" theme), since catfile appears to clobber it with some other action before getting around to using it. Ideally, catfile should stringify its arguments itself, although Jan wondered if there was a more general way of solving the problem.
match point http://xrl.us/bk9n8
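The client-side workaround amounts to quoting the capture variable, which hands over a plain copy of its current value. A sketch, with an illustrative pattern and file names that do not come from the bug report:

```perl
use strict;
use warnings;
use File::Spec;

my $path;
if ('lib/Foo.pm' =~ m{^(\w+)/}) {
    # "$1" passes catfile a copy of the captured text rather than $1
    # itself, so later pattern matches cannot change what was passed in.
    $path = File::Spec->catfile("$1", 'Bar.pm');
}
print "$path\n";
```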
278 new + 1345 open = 1623 (+13 -43) http://xrl.us/bk9oa http://rt.perl.org/rt3/NoAuth/perl5/Overview.html
Thread::Semaphore
Jerry D. Hedden released 2.08, which adds a few checks for undefined parameters.
http://xrl.us/bk9oc
Ricardo Signes wondered why delete local $hash{elem} didn't work when local $hash{elem}; delete $hash{elem} did. After boggling briefly over the syntax, Rafael Garcia-Suarez thought it wouldn't be too hard to make it work.
http://xrl.us/bk9oe
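For reference, a minimal sketch of the two-statement form that Ricardo reported as working. The hash contents and the subroutine name are made up for illustration.

```perl
use strict;
use warnings;

our %hash = (elem => 1, other => 2);

sub keys_without_elem {
    local $hash{elem};     # save the element until the sub returns
    delete $hash{elem};    # remove it for the duration of the scope
    return join ',', sort keys %hash;
}

my $inside = keys_without_elem();
my $after  = join ',', sort keys %hash;

print "inside the scope: $inside\n";
print "after the scope:  $after\n";
```

The element is absent while the subroutine runs and is restored, with its old value, once the scope is left.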
Ricardo Signes looked at the documentation in perlobj and corrected errors and omissions in DOES. He hinted that he would take the axe to the documentation for UNIVERSAL.
less is more http://xrl.us/bk9og
Jerry D. Hedden corrected a typo in perlop.pod that H.Merijn Brand estimated as being a difference of about 3 pixels, thus possibly qualifying for the smallest patch ever.
http://xrl.us/bk9oi
He also silenced build warnings in universal.c.
http://xrl.us/bk9ok
Nicholas Clark discovered what he thought was a usage error in XS subs with the ALIAS keyword. This reminded Robert May that he had written about a similar problem with INTERFACE last year, and that the message had gone nowhere.
http://xrl.us/bk9on
Florian Ragwitz also managed what was roughly a seven pixel change to fix a documentation typo in Attribute::Handlers.
http://xrl.us/bk9op
Artur Bergman handed over maintenance of Attribute::Handlers to Rafael Garcia-Suarez.
http://xrl.us/bk9or
Memoize.pm
refers to old title of "Higher Order Perl"
and changed the wording. There was some discussion as to whether the full
text of HOP was available on the web, and if so, where? http://xrl.us/bk9ot
After Steve Peters performed an upgrade to AutoLoader to bring it to 5.66, Nicholas Clark bumped it up to 5.66_01 to be on the safe side.
for the record http://xrl.us/bk9ov
Craig Berry returned to the File::Copy & permission bits issue, saying that changes were unlikely to fly on VMS. Aristotle Pagaltzis pointed out that on Windows, files tend to inherit their permission bits from the directory in which they reside, and that the only important bit to honour on Unix systems is the execute bit.
http://xrl.us/bk9ox
Renée Bäcker was Warnocked over a patch to add more documentation to attributes.pm.
http://xrl.us/bk9oz
This summary was written by David Landgren.
Weekly summaries are published on http://use.perl.org/ and posted on a mailing list (subscription: perl5-summary-subscribe@perl.org). The archive is at http://dev.perl.org/perl5/list-summaries/ . Corrections and comments are welcome.
If you found this summary useful, please consider contributing to the Perl Foundation or attending a YAPC to help support the development of Perl.