Just when you thought it was (type) safe

brian_d_foy on 2003-12-01T00:52:33

I am writing a little spider application, using LWP and all of that good stuff. For my particular application I need to set the referer header, and along the way I collect the right URLs to put in that.

Since I am using LWP, URLs tend to show up as objects, but when I try to put them back into an HTTP request, things blow up:

use HTTP::Request;
use URI;

my $url = URI->new( 'http://www.example.com' );

my $request = HTTP::Request->new( GET => 'http://www2.example.com' );
$request->referer( $url );  # croaks with "Unexpected field value" under HTTP::Headers 1.43


The referer() method comes from HTTP::Headers, and all it does is pass its arguments to the _header() method. Inside _header(), that $url ends up in $val, and then it has to run the gauntlet:

[HTTP::Headers, 1.43, sub _header]
    if (defined($val)) {
	my @new = ($op eq 'PUSH') ? @old : ();
	if (!ref($val)) {
	    push(@new, $val);
	} elsif (ref($val) eq 'ARRAY') {
	    push(@new, @$val);
	} else {
	    Carp::croak("Unexpected field value $val");
	}
	$self->{$lc_field} = @new > 1 ? \@new : $new[0];
    }


The thing in $val is defined, so it makes it into the block. It is a reference, though, and not an ARRAY reference, so it falls through to the else{} and croaks. That is fine for most headers, because _header is a generic method, but referer could be a bit smarter.

[HTTP::Headers, 1.43, referer()]
sub referer           { (shift->_header('Referer',          @_))[0] }
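
To see that three-way test in isolation, here is a minimal sketch. It assumes a made-up class, My::Url, standing in for URI (a blessed hash with overloaded stringification), and a made-up classify() function that only mirrors the branch order of the excerpt. It is not the real _header code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for URI: a blessed hash that stringifies itself.
{
    package My::Url;
    use overload '""' => sub { $_[0]{url} }, fallback => 1;
    sub new { my ( $class, $url ) = @_; return bless { url => $url }, $class }
}

# Mirrors only the branch order of the excerpt; this is not the real _header.
sub classify {
    my ($val) = @_;
    return 'undefined'    if !defined $val;
    return 'plain string' if !ref($val);
    return 'array ref'    if ref($val) eq 'ARRAY';
    return 'rejected';    # the real code calls Carp::croak here
}

my $url = My::Url->new('http://www.example.com');

printf "%s\n", classify($url);          # rejected
printf "%s\n", classify("$url");        # plain string
printf "%s\n", classify( ["$url"] );    # array ref
```

The object itself lands in the last branch, while the interpolated copy passes as a plain string.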


Debugging this was a pain. The URI objects automatically stringify, so printing them just shows the string form, rather than something like "URI=HASH(0xfb748)". My usual debugger, print(), fails to pick this up.
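
One way to make print() useful here is to ask about the value before printing it. The sketch below assumes a made-up describe() helper and, again, a hypothetical My::Url class standing in for URI. The calls it relies on, ref(), Scalar::Util::blessed(), and overload::StrVal(), are all in core Perl.

```perl
use strict;
use warnings;
use overload ();
use Scalar::Util qw(blessed);

# Hypothetical stand-in for URI: a blessed hash that stringifies itself.
{
    package My::Url;
    use overload '""' => sub { $_[0]{url} }, fallback => 1;
    sub new { my ( $class, $url ) = @_; return bless { url => $url }, $class }
}

# Report what a value really is, bypassing any overloaded stringification.
sub describe {
    my ($thing) = @_;
    return 'undef'                if !defined $thing;
    return "plain scalar: $thing" if !ref($thing);
    my $kind = blessed($thing) ? 'object' : 'reference';
    return "$kind: " . overload::StrVal($thing);
}

my $url = My::Url->new('http://www.example.com');

print "$url\n";                      # looks like an ordinary string
printf "%s\n", describe($url);       # object: My::Url=HASH(0x...)
printf "%s\n", describe("$url");     # plain scalar: http://www.example.com
```

overload::StrVal() gives the string the object would have had without the overloading, which is exactly the form that plain print() hides.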

There are a couple of ways around this, none of them satisfying:

  • Interpolate into new strings for each use, i.e. "$url".
  • Check to see if the $url is a reference, then call the as_string method if it is.
  • Always turn things into strings, losing the ability to call methods.
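
The second workaround can be wrapped in a small helper. This is only a sketch: url_string() is a made-up name, and My::Url is a hypothetical stand-in for URI that offers the same as_string method.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical stand-in for URI, with an as_string method like URI has.
{
    package My::Url;
    use overload '""' => sub { $_[0]{url} }, fallback => 1;
    sub new { my ( $class, $url ) = @_; return bless { url => $url }, $class }
    sub as_string { return $_[0]{url} }
}

# Return a plain string for objects; pass anything else through untouched.
sub url_string {
    my ($url) = @_;
    return $url unless blessed($url);
    return $url->can('as_string') ? $url->as_string : "$url";
}

my $url = My::Url->new('http://www.example.com');
my $str = url_string($url);

printf "%s (ref: '%s')\n", $str, ref($str);    # http://www.example.com (ref: '')
```

With something like this, the call in the failing example would become $request->referer( url_string($url) ), and $url itself stays an object for later method calls.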


Oh well, now you know. Do not pull your hair out over this one, because I already did.


print and Data::Dumper

mary.poppins on 2003-12-01T13:16:01

I find print much more useful when used with Data::Dumper. Whenever I program in something other than Perl (my job is mostly C++ coding), I find myself missing Data::Dumper, and recreating it in limited ways.

I much prefer visually scanning through Dumper($foo) to clicking through some elaborate tree view in a GUI debugger.

Re:print and Data::Dumper

brian_d_foy on 2003-12-01T18:12:29

For some reason ptkdb was failing with weird errors when it got to the point of the problem, so I was not using that.

I was using Data::Dumper in a lot of places, but by the time I thought to see what was in the scalar variable (usually not a candidate for a Dumper() call), I knew what the problem was.

Indeed, there were all sorts of signs of what was happening, but everything got clouded because my starting assumption was wrong: I believed URI objects would always do the right thing with LWP, and that was not the case. :)

Threads doesn't like URI's object stuff

petdance on 2003-12-01T16:42:47

It seems that the autostringification gets hung up on threaded Perls. I had to go through all Mech use of URI and make sure I was explicitly calling ->as_string().

Re:Threads doesn't like URI's object stuff

brian_d_foy on 2003-12-01T18:15:40

Was it the overloading that was the problem, or the things trying to use the objects? I do not use a threaded perl, so I have not paid much attention to its gotchas.

This is fixed in HTTP::Headers 1.47

brian_d_foy on 2003-12-01T19:59:52

Gisle tells me that this was fixed in LWP-5.66, and so it was, at least for my problem.

I thought I had updated LWP when I got home, but that is what I get for thinking. :)