Kevin Falcone and I have been working on encoding problems with Mechanize. We'd appreciate as many people as possible testing the darn thing.
Suggestions on getting rid of the warnings in the test suite are welcome as well.
Thanks,
xoxo,
Andy
(and Kevin)
Download at $CPAN/authors/id/P/PE/PETDANCE/WWW-Mechanize-1.29_01.tar.gz
Here's what goes wrong:
t/live/wikipedia.........ok 5/15Parsing of undecoded UTF-8 will give garbage when decoding entities at /Library/Perl/5.8.6/darwin-thread-multi-2level/HTML/PullParser.pm line 83.
Parsing of undecoded UTF-8 will give garbage when decoding entities at /Library/Perl/5.8.6/darwin-thread-multi-2level/HTML/PullParser.pm line 83.
Parsing of undecoded UTF-8 will give garbage when decoding entities at /Library/Perl/5.8.6/darwin-thread-multi-2level/HTML/PullParser.pm line 83.
Parsing of undecoded UTF-8 will give garbage when decoding entities at /Library/Perl/5.8.6/darwin-thread-multi-2level/HTML/PullParser.pm line 83.
Parsing of undecoded UTF-8 will give garbage when decoding entities at /Library/Perl/5.8.6/darwin-thread-multi-2level/HTML/PullParser.pm line 83.
Parsing of undecoded UTF-8 will give garbage when decoding entities at /Library/Perl/5.8.6/darwin-thread-multi-2level/HTML/PullParser.pm line 83.
t/live/wikipedia.........ok 10/15Parsing of undecoded UTF-8 will give garbage when decoding entities at /Library/Perl/5.8.6/darwin-thread-multi-2level/HTML/PullParser.pm line 83.
Parsing of undecoded UTF-8 will give garbage when decoding entities at /Library/Perl/5.8.6/darwin-thread-multi-2level/HTML/PullParser.pm line 83.
Parsing of undecoded UTF-8 will give garbage when decoding entities at /Library/Perl/5.8.6/darwin-thread-multi-2level/HTML/PullParser.pm line 83.
t/live/wikipedia.........ok
So far, that's worked great for me, but then I have the luxury of working almost exclusively in UTF-8 with trustworthy content.

    if ($response->header('Content-Type') &&
        $response->header('Content-Type') =~ m/charset=(\S+)/xms) {
        my $encoding = $1;
        $response->content(Encode::decode($encoding, $response->content));
    }

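
For content that is less trustworthy, the header check above could be hardened a little. The following is only a sketch under a few assumptions: `decode_by_content_type` is a hypothetical helper name (not part of Mechanize or LWP), it tolerates a quoted charset or trailing parameters, and it falls back to the raw bytes when Encode does not recognise the charset name.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: decode $bytes using the charset named in $content_type.
# Returns the bytes unchanged when no usable charset is found.
sub decode_by_content_type {
    my ($content_type, $bytes) = @_;
    return $bytes unless defined $content_type;

    # Accept charset=UTF-8, charset="UTF-8", and trailing ";" parameters.
    my ($charset) = $content_type =~ m/charset\s*=\s*"?([^\s";]+)/i
        or return $bytes;

    # find_encoding returns undef for names Encode does not know.
    my $encoding = Encode::find_encoding($charset)
        or return $bytes;

    return $encoding->decode($bytes);
}

# Five UTF-8 bytes ("cafe" with an acute accent on the e) in, characters out.
my $text = decode_by_content_type(
    'text/html; charset="utf-8"; foo=bar',
    "caf\xC3\xA9",
);
print length($text), "\n";
```

Falling back to the undecoded bytes is a design choice for the sketch; a caller might prefer to warn or die instead.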
but then I do run pretty aggressive firewall settings on my Mac, so it might be me...

t/local/back.............NOK 28/38
# Failed test '404 check'
# at t/local/back.t line 149.
# got: '500'
# expected: '404'
[... snip...]
t/local/overload.........skipped
all skipped: Mysteriously stopped passing, and I don't know why.
[... snip...]
Failed Test    Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/local/back.t    1   256    38    1  28
1 test skipped.
Re:My solution
grantm on 2007-05-23T10:10:55
Wouldn't it be better to just use the decoded_content method of the HTTP::Message (Response) object? Or is there some problem with that?
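
For anyone who hasn't met it, here is a minimal sketch of what `decoded_content` does. The response is built by hand (so no network is needed) and the body and header values are made up for illustration; it assumes HTTP::Response from libwww-perl is installed.

```perl
use strict;
use warnings;
use HTTP::Response;

# Build a response by hand so the example needs no network access.
# The body is five UTF-8 bytes: "cafe" with an acute accent on the e.
my $response = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html; charset=UTF-8' ],
    "caf\xC3\xA9",
);

my $raw     = $response->content;          # undecoded bytes
my $decoded = $response->decoded_content;  # character string, undef on failure

die "could not decode response body" unless defined $decoded;

printf "raw length: %d, decoded length: %d\n", length $raw, length $decoded;
```

The raw body is counted in bytes and the decoded one in characters, which is why the two lengths differ.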
Re:My solution
ChrisDolan on 2007-05-23T12:36:35
Oh, uh, yeah... The significant problem with the function is my ignorance of its existence. :-/
Re:My solution
jibsheet on 2007-05-23T17:04:44
Unfortunately, that makes t/local/nonascii.t break, and that test has worked with Mech for years. (It is derived from a number of tests in the RT test suite.)