UTF8 Goodness!

pudge on 2002-04-19T11:17:01

#!/usr/bin/perl -lw
use Digest::MD5 2.16 qw(md5_hex);

$a = "foo\x{100}";
chop($a);

print " : ", md5_hex("");
print "$a: ", md5_hex($a);
print "foo: ", md5_hex("foo");
__END__


Results in:

: d41d8cd98f00b204e9800998ecf8427e
Use of uninitialized value in subroutine entry at Untitled #6 line 9.
foo: d41d8cd98f00b204e9800998ecf8427e
foo: acbd18db4cc2f85cedef654fccc4a4d8

Digest::MD5 changed to use SvPVbyte instead of SvPV to get the value of the passed SV, and that totally broke it in perl 5.6. Note that the digest printed for $a is the digest of the empty string, not of "foo". For some reason, the fact that $a has the UTF8 flag on makes it blow up. Fun fun fun! Gisle's looking into it; at the very least, maybe it can be changed to use SvPV for perl < 5.7, but still.
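For anyone who wants to poke at the flag itself, here is a minimal sketch. It assumes perl 5.8.1 or later, where the utf8::is_utf8 and utf8::downgrade functions are available (they are not in 5.6, so this is not a fix for the breakage above). The variable names are made up for illustration. It checks whether the flag survives the chop, then downgrades a copy so the string is plain bytes before it goes to md5_hex.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Same trick as the test case: append a wide character, then chop it off.
# The contents are "foo" again, but the UTF8 flag stays on.
my $flagged = "foo\x{100}";
chop($flagged);

# utf8::is_utf8 reports the internal UTF8 flag (assumes perl >= 5.8.1).
print "flag before downgrade: ", (utf8::is_utf8($flagged) ? "on" : "off"), "\n";

# Downgrade a copy in place. This succeeds only when every character
# fits in a single byte, which is the case here.
my $copy = $flagged;
utf8::downgrade($copy);
print "flag after downgrade:  ", (utf8::is_utf8($copy) ? "on" : "off"), "\n";

# With the flag off, the digest should match the one for a plain "foo".
print "digests match\n" if md5_hex($copy) eq md5_hex("foo");
```

On a perl where SvPVbyte behaves, md5_hex($flagged) should give the same digest without the explicit downgrade; the downgrade just makes the byte-string assumption visible.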

The original test case was using the result of XML::RSS / XML::Parser. Should XML::Parser be flagging SVs as UTF8 in the first place, especially if there are no high-bit characters in them?