This Week on perl5-porters - 24-29 February 2008

grinder on 2008-03-06T22:52:00

"Is this a bug? Or why is this the expected behaviour?" -- Steffen Ullrich, playing with signal handlers.

Topics of Interest

use encoding 'utf8' bug for Latin-1 range

The thread about use encoding continued this week. Juerd Waalboer gave one of the best concise explanations as to why the current model Perl uses for dealing with Unicode is broken: the \x hex escape is overloaded to mean both bytes and characters (\x2b versus \x{d0b2}), and its interpretation takes place too early, while the source is being read.

The result is that a source code file written in an Asian encoding cannot embed a Latin-1 character like an e-acute.

Much discussion of remarkable civility followed, regarding what to do about the matter. Glenn Linderman put forward the following ideas:

  • Deprecate use encoding.

  • Deprecate non-ASCII characters in 5.12 source code, unless a source encoding has been specified.

  • Allow Unicode semantics to be applied to all character operations on strings (case conversion, caseless comparisons and so on), regardless of their internal representations.

  • Sort out the timing of when \x, \x{} and \N take effect.

No-one appeared to lament the idea of letting encoding go.
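The third idea in the list above is easy to see from Perl itself. What follows is a minimal sketch, assuming a perl on an ASCII platform with no locale and no unicode_strings-style feature in effect; the variable names are made up for illustration. Two strings that compare equal can still change case differently, depending on how they happen to be stored internally:

```perl
use strict;
use warnings;

# U+00E9 (e-acute) lives in the Latin-1 range, so it fits in a single byte.
my $as_bytes = "\xE9";
my $as_chars = "\xE9";
utf8::upgrade($as_chars);   # same character, now stored internally as UTF-8

# The two strings are equal as far as eq is concerned...
my $same = $as_bytes eq $as_chars;

# ...but uc() consults the internal representation to pick its rules.
my $uc_bytes = uc $as_bytes;
my $uc_chars = uc $as_chars;

printf "equal: %s\n", $same ? 'yes' : 'no';
printf "uc on byte storage:  U+%04X\n", ord $uc_bytes;
printf "uc on UTF-8 storage: U+%04X\n", ord $uc_chars;
```

Under those assumptions the first uc leaves the character alone while the second yields U+00C9, which is the kind of inconsistency the proposal wants to remove.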

Yves Orton pointed out that Microsoft managed to get their Unicode handling more or less right, albeit at a certain cost to their API, and regretted that Unix-like operating systems supplied the absolute strict minimum, pushing all the work onto each and every client program. Which meant that nothing really worked at all, not even the so-called shebang line.

Juerd and Nicholas put forward that there is a case to be made for perl to figure out by itself whether a given source file is in ASCII, Latin-1 or UTF-8. It turns out that it's just about impossible to construct a sensible Latin-1 file that is also valid UTF-8. The idea is to start out in 7-bit ASCII and carry on until a byte with the high bit set is encountered.

If this byte introduces a valid UTF-8 character, the rest of the file must be, too. Any invalid byte sequences thereafter trigger a fatal compile-time error. Otherwise it means it must be Latin-1, in which case similar but different rules apply which also cause the compilation to halt if encodings change mid-stream. The key issue is to determine that the encoding does indeed change.
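The detection rule described above can be sketched in a few lines of Perl. This is only an illustration of the idea as summarised here, not anything that was proposed as a patch: the function name guess_source_encoding and the $UTF8_CHAR pattern are hypothetical, and the Latin-1 branch does not attempt the "similar but different" mid-stream checks, which the thread left unspecified.

```perl
use strict;
use warnings;

# One well-formed multi-byte UTF-8 sequence, matched against raw bytes.
my $UTF8_CHAR = qr/
      [\xC2-\xDF][\x80-\xBF]
    | \xE0[\xA0-\xBF][\x80-\xBF]
    | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}
    | \xED[\x80-\x9F][\x80-\xBF]
    | \xF0[\x90-\xBF][\x80-\xBF]{2}
    | [\xF1-\xF3][\x80-\xBF]{3}
    | \xF4[\x80-\x8F][\x80-\xBF]{2}
/x;

# Hypothetical helper: classify a string of raw source bytes.
sub guess_source_encoding {
    my ($bytes) = @_;

    # Start out in 7-bit ASCII and carry on until a high-bit byte shows up.
    $bytes =~ /\G[\x00-\x7F]*/gc;
    return 'ascii' if pos($bytes) == length $bytes;

    # The first high-bit byte decides: valid UTF-8 sequence, or Latin-1.
    return 'latin1' unless $bytes =~ /\G$UTF8_CHAR/gc;

    # Once committed to UTF-8, the rest of the file must be UTF-8 too.
    $bytes =~ /\G(?:[\x00-\x7F]|$UTF8_CHAR)*/gc;
    die sprintf("Malformed UTF-8 at byte offset %d\n", pos $bytes)
        unless pos($bytes) == length $bytes;
    return 'utf8';
}

for my $sample ("plain text", "caf\xC3\xA9", "caf\xE9") {
    printf "%-7s for %d bytes\n", guess_source_encoding($sample), length $sample;
}
```

A file that starts out as UTF-8 and later slips in a lone Latin-1 byte makes the helper die, which stands in for the fatal compile-time error mentioned above.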

EBCDIC was also mentioned in passing. Sadly, Perl no longer runs on EBCDIC due to a general lack of nurturing. Then again, if it was important, Nicholas felt that someone from IBM would have been in touch at some point.

  for some reason I now have a splitting headache
  http://xrl.us/bg932 

Interrupting system() with signal depends on signal handler

Steffen Ullrich noticed that an alarm signal handler that does a syswrite behaves differently from one that does a print. After diving into pp_sys.c, he found that he could make the print version (which was working correctly) behave in the same incorrect way by setting $! to undef.

He produced a one-line patch that fixed the behaviour (hmm, did we get a test?) and Rafael applied it as change #33408.
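For the curious, here is a rough sketch of the kind of script that exercises this code path. It is not Steffen's test case: the run_with_alarm helper and the two handlers are made up for illustration, and the sketch makes no claim about what a particular perl will print. It assumes a platform where alarm works and uses $^X to start a child perl that sleeps for two seconds.

```perl
use strict;
use warnings;

# Hypothetical harness: run a short-lived child while an alarm is pending,
# then report what system() returned and how often the handler ran.
sub run_with_alarm {
    my ($handler) = @_;
    my $fired = 0;
    local $SIG{ALRM} = sub { $fired++; $handler->() };

    alarm 1;                                   # due while the child is asleep
    my $status = system $^X, '-e', 'sleep 2';  # $^X is the running perl
    alarm 0;                                   # cancel anything still pending

    return ($status, $fired);
}

my %handlers = (
    print    => sub { print STDERR "alarm (print)\n" },
    syswrite => sub { syswrite STDERR, "alarm (syswrite)\n" },
);

for my $name (sort keys %handlers) {
    my ($status, $fired) = run_with_alarm($handlers{$name});
    print "$name handler: system() returned $status, handler ran $fired time(s)\n";
}
```

Comparing the two lines of output on an old and a new perl is one way to see whether the handler's choice of output function still matters.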

  handle with care
  http://xrl.us/bg98g 

CPAN NetBIOS broadcasts

Linda W was scratching her head wondering why CPAN installations on cygwin were glacially slow. After running a network trace, she discovered that what had been a path /var/cache/cpan was being interpreted as a UNC path (/cache/cpan on host //var).

This caused the local host to send out plaintive calls for host //var to please call home. Michael G. Schwern thought that this sounded like the same problem described in CPAN bug #32813, as did Linda.

Yves Orton, current maintainer of ExtUtils::Install, which is where the problem originated, pushed out a new version and Linda confirmed that it solved the problem.

Ken Williams was not around to comment on how hard it is to use File::Spec correctly.
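The summary does not say how the doubled slash came about. One plausible way, assumed here purely for illustration, is gluing a root prefix onto a path that is already absolute. File::Spec::Unix is used explicitly so that the result does not depend on the platform running the sketch (it assumes an ordinary Unix-like $^O, not one where a leading // is kept as a node name):

```perl
use strict;
use warnings;
use File::Spec::Unix;

my $root = '/';                 # hypothetical install prefix
my $dir  = '/var/cache/cpan';   # the path from Linda's report

# Naive concatenation doubles the leading slash; on Cygwin a path that
# starts with // is read as //host/share, hence the hunt for host "var".
my $naive = $root . $dir;

# Unix-flavoured canonpath collapses the repeated slashes.
my $clean = File::Spec::Unix->canonpath($naive);

print "naive: $naive\n";
print "clean: $clean\n";
```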

  not quite Unix, not quite Windows
  http://xrl.us/bg934 

Google summer of code

Eric Wilhelm got the ball rolling on Perl's participation in Google's Summer of Code project. But you've probably heard about this in other venues. All hail Eric.

The Perl 5 Wiki is the place to go for the latest information.

  summertime fun
  http://xrl.us/bg936
  http://xrl.us/bg938 


Patches of Interest

sv.c consting goodness

Steven Schubiger's consting patch number 4 from the beginning of the month was applied. This led to patches 5, 6, 7, 8 and 9 from Steven, each applying ever more consting to sv.c, which in turn were all applied by various porters.

  http://xrl.us/bg94a 

no archlib in otherlibdirs

After some long, hard thought, Andy Dougherty remembered why Reini Urban's plan for organising site and vendor libraries on Cygwin wouldn't work in the general case. So Reini withdrew his patch but would continue to use it locally.

  http://xrl.us/bg94c 

On the other hand, his enhancements to B::Debug made it in.

  win some, lose some
  http://xrl.us/bg94e 

warning message for -M:Foo, extended and revised

Robin Barker finally settled on "Invalid module name :Foo with -M option: contains single ':'", which was good enough for Rafael.

  colonphun
  http://xrl.us/bg94g 

More diagnostics for Fatal.pm

Slaven Rezic enhanced Fatal to name the builtin that could not be overridden in its dying message.
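A small sketch of how to provoke that message, assuming a perl that ships Fatal.pm. The package name Sketch::FatalDemo is made up, and print is used because it is one of the builtins that cannot be overridden. The exact wording of the error is left to whatever Fatal version is installed:

```perl
use strict;
use warnings;

# Compile the "use Fatal" inside a string eval so the failure can be caught
# instead of aborting the whole script at compile time.
my $ok  = eval q{ package Sketch::FatalDemo; use Fatal qw(print); 1 };
my $err = $ok ? '' : $@;

print $ok ? "Fatal accepted print\n" : "Fatal refused: $err";
```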

  if I told you I would have to kill you
  http://xrl.us/bg94i 


Thread patches

Jerry D. Hedden is doing so much work on threads at the moment, he deserves his own section.

First off, the patch to not install threads on non-thread builds was reverted (Michael G. Schwern's killer argument being that at least that way you get a nice error message).

  http://xrl.us/bg94k 

Then the CPAN 1.69 version of threads was synch'ed with blead.

  http://xrl.us/bg94n 

As was threads::shared 1.17.

  http://xrl.us/bg94p 

At the end of the week, he also delivered version 1.18, which added some diagnostics to help track down what's going wrong when t/stress.t decides to go belly up.

  http://xrl.us/bg94r 

Moving along, Thread::Semaphore 2.07 was checked in.

  http://xrl.us/bg94t 

And last but not least, so was Thread::Queue 2.06.

  http://xrl.us/bg94v 


Watching the smoke signals

It looked like t/stress.t in the threads module failed, and so Jerry asked if there was any chance of seeing what the new diagnostics had to say. Steve Hay discovered that the problem was in fact a TODO test that had started to pass, and Test::Smoke got confused and recorded it as a failure.
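For anyone who has not met a TODO test before, here is a minimal sketch written with Test::More. It is not the code from t/stress.t (the test name and the TODO reason are invented); it only shows the situation Steve described, a test marked as an expected failure that passes anyway:

```perl
use strict;
use warnings;
use Test::More tests => 1;

TODO: {
    # Tests in this block are expected to fail for now.
    local $TODO = 'hypothetical known failure';

    # This one passes anyway: an unexpected success, not a failure.
    ok(1, 'stress test completed');
}
```

A harness is supposed to treat such a result as a pleasant surprise. Recording it as a failure is the confusion described above.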

  Smoke [5.11.0] 33390 FAIL(F) MSWin32 WinXP/.Net SP2 (x86/2 cpu)
  http://xrl.us/bg94x 


New and old bugs from RT

Segfault when calling ->next::method on non-existing package (#51092)

David Landgren thought that the test that Rafael Garcia-Suarez added as part of the fix for this bug should have had the RT bug number embedded in it somewhere. In other news, we discovered that there are 485 subscribers to perl5-porters.

  http://xrl.us/bg94z 

Perl5 Bug Summary

  288 new + 1500 open = 1788 (+3 -2)
  http://xrl.us/bg943
  http://rt.perl.org/rt3/NoAuth/perl5/Overview.html 


New Core Modules

ExtUtils::Install version 1.45

This was the fix for the //var problem noted by Linda W. (But stay tuned next week for exciting new developments).

  http://xrl.us/bg945 

ExtUtils::MakeMaker 6.44

Michael G. Schwern rolled out 6.34_01 plus Yves's EU::I 1.45 as version 6.44. Other assorted bugfixes made it in, but Michael announced that he had declined to put in the fixes required to make paths with whitespace work correctly, saying that he wanted to think about a better solution.

  http://xrl.us/bg947 


In Brief

Last week, Jim Cromie had the newfound ability to hook XML analysis to a test suite (via the PERL_XMLDUMP environment variable). This week, Jim wrote a patch to test -Dmad's PERL_XMLDUMP= output. It was not applied.

  truly madly
  http://xrl.us/bg949 

On the other hand, Rafael did apply his optimisation of the OP_IS_(FILETEST|SOCKET) macros, with some OP */int fuzz.

  http://xrl.us/bg95b 

The exact recipe for signalling a non-met prerequisite (such that a perl build without threads should not attempt to require threads) was nailed down and codified on the CPAN Testers wiki.

  http://cpantest.grango.org/
  http://xrl.us/bg95d 

Salvador Fandiño found that the documentation made no mention of av_delete calling sv_2mortal on the returned SV. Since av_pop and av_shift don't do this, the documentation should probably point out the difference.

  quirk quirk
  http://xrl.us/bg95f 

Craig Berry reported that maint-5.8 was not compiling on VMS, largely due to incorrect prototypes in re.xs. Nicholas Clark determined that a subsequent integration fixed the problem.

  a matter of time
  http://xrl.us/bg95h 

Steve Peters wanted to know why quad words on Win32 weren't configured, since all the pieces were in place to allow them to be. Jan Dubois thought that it wasn't much of a problem since you really need to have IVSIZE defined to be 8 to take any advantage of them.

  mmm, bignums
  http://xrl.us/bg95j 

Nicholas Clark hacked perlbug to allow it to send thank-you messages back to the porters.

  send more money
  http://xrl.us/bg95m 

Nicholas also got his languages mixed up trying to write else if in C macros. Fortunately there are only four or five distinct syntaxes to master for writing else if constructs in all computer languages.

  as if
  http://xrl.us/bg95o 

About this summary

This summary was written by David Landgren. I chopped a day off this week; it makes it easy to start next week on the first of the month.

  17-23 February 2008
  http://xrl.us/bg95q 

Weekly summaries are published on http://use.perl.org/ and posted on a mailing list (subscription: perl5-summary-subscribe@perl.org). The archive is at http://dev.perl.org/perl5/list-summaries/ . Corrections and comments are welcome.

If you found this summary useful, please consider contributing to the Perl Foundation to help support the development of Perl.


he's not dead, he's, he's restin'!

nicholas on 2008-03-07T21:11:39

EBCDIC was also mentioned in passing. Sadly, Perl no longer runs on EBCDIC due to a general lack of nurturing. Then again, if it was important, Nicholas felt that someone from IBM would have been in touch at some point.

To the best of my knowledge that statement is not true. Specifically, it asserts that it is known that Perl does not run on EBCDIC.

All that can be said is that we know that we have no idea of the state of Perl on EBCDIC, because no-one using EBCDIC sends any feedback whatsoever, positive or negative.

Is there anybody out there? ...

Re:he's not dead, he's, he's restin'!

speters on 2008-04-16T18:09:45

We did hear from the MPE/iX folks a while ago, so, obviously it works there. It might be more correct to say "legacy IBM operating system" since support on z/OS is suspect and support for i5/OS (the OS formerly known as OS/400) is unknown since we've not heard from them in ages.

The problem here is not so much a lack of trying as a lack of access. I'm sure there is someone knowledgeable enough to help get Perl working well on EBCDIC operating systems, but such systems are not easy to come by on a desktop. Access to existing servers running those OSes is what we need to be able to support them.