After Mark's post yesterday on how Atom aggregators should handle HTTP response codes, I decided to bring our (currently in-house-only) RSS aggregator up to speed. I think it's safe to assume that RSS and Atom aggregators should handle HTTP in the same manner.
The structure of the aggregator is a little weird because it's a server-based "what's new?" type app and not a desktop/personal news reader.
There are two main parts to the app that read a feed.
So I really only had to deal with the update_cache.pl script, since it's the only part that actually talks to external sites.
I was able to check a few of the requirements off right away.
Adding gzip and deflate support was a snap. Along with sending the right header with my request (the encodings are comma-separated)
$request->header( Accept_encoding => 'gzip, deflate' );
handling the returned data was almost too easy:
if ( my $encoding = $response->header( 'Content-Encoding' ) ) {
    require Compress::Zlib;
    $data = Compress::Zlib::memGunzip( $data ) if $encoding =~ /gzip/i;
    $data = Compress::Zlib::uncompress( $data ) if $encoding =~ /deflate/i;
}
LWP was handling redirects automatically, which was fine, except that there needs to be a distinction between temporary redirects (300, 302, 307) and permanent ones (301). I had to add a loop and use simple_request() so I could see what was returned after each request. After a permanent redirect, the URL is permanently updated in the config file (temporary redirects do not affect the config).
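A minimal sketch of what such a loop might look like. This is not the actual update_cache.pl code: the function name, the hop limit, and the rule of only trusting a 301 when no temporary redirect came before it are all assumptions made for illustration.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Status qw(is_redirect);
use URI;

# Follow redirects by hand so the status code of every hop is visible.
# $ua is anything with a simple_request() method, e.g. an LWP::UserAgent.
# Returns the last response and the URL that belongs in the config file.
sub fetch_following_redirects {
    my ( $ua, $url, $max_hops ) = @_;
    $max_hops = 5 unless defined $max_hops;    # hypothetical limit

    my $config_url     = $url;    # only changes on permanent redirects
    my $only_permanent = 1;       # false once a temporary redirect is seen
    my $response;

    for ( 0 .. $max_hops ) {
        $response = $ua->simple_request( HTTP::Request->new( GET => $url ) );

        last unless is_redirect( $response->code );
        my $location = $response->header( 'Location' );
        last unless defined $location;    # e.g. 304 carries no Location

        # Location may be relative, so resolve it against the current URL.
        $url = URI->new_abs( $location, $url )->as_string;

        if ( $response->code == 301 and $only_permanent ) {
            $config_url = $url;           # safe to remember permanently
        }
        else {
            $only_permanent = 0;          # temporary hop: leave config alone
        }
    }
    return ( $response, $config_url );
}
```

The caller would compare the returned config URL with the one it started from and rewrite the config entry only if they differ.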
Speaking of redirects, code 304 Not Modified is listed under is_redirect(), since that function is true for any 3xx code -- which is a little misleading (but true to the specs).
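Because of that, a 304 has to be tested for before falling back on is_redirect(). A small sketch of one way to order the checks; classify_status() and its return labels are made up for this example, while the is_* functions come from HTTP::Status.

```perl
use strict;
use warnings;
use HTTP::Status qw(is_success is_redirect is_error);

# Map a status code to a coarse category, treating 304 as its own case.
sub classify_status {
    my ($code) = @_;
    return 'not_modified' if $code == 304;    # must come before is_redirect()
    return 'success'      if is_success($code);
    return 'redirect'     if is_redirect($code);
    return 'error'        if is_error($code);
    return 'other';                           # 1xx and anything unexpected
}
```

With this ordering, a 304 means "keep using the cached copy" rather than "go look somewhere else".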
I didn't make any special rules for codes that fall under is_error(). The script currently only reports that an error occurred -- I haven't decided how much of Mark's recommendation I want to follow for this app.
Other than that, I still need to look at authentication and proxy support. Basic auth is currently handled in the URL only, but that poses a security risk since I use the URL as a key, stored in a user cookie for customization purposes.
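One possible direction, sketched here purely as an assumption about how this could be done: strip the credentials out of the URL before it is used as a key, and send them in an Authorization header instead. The function name request_with_basic_auth() is hypothetical; userinfo() and authorization_basic() are existing URI and HTTP::Request methods.

```perl
use strict;
use warnings;
use HTTP::Request;
use URI;
use URI::Escape qw(uri_unescape);

# Build a GET request from a URL that may contain "user:pass@".
# Returns the request and the URL with the credentials removed, so the
# clean URL can serve as the key without exposing the password.
sub request_with_basic_auth {
    my ($url) = @_;
    my $uri      = URI->new($url);
    my $userinfo = $uri->userinfo;    # undef when no credentials are present
    $uri->userinfo(undef);            # strip credentials from the URL

    my $request = HTTP::Request->new( GET => $uri->as_string );
    if ( defined $userinfo ) {
        my ( $user, $pass ) = map { uri_unescape($_) } split /:/, $userinfo, 2;
        $request->authorization_basic( $user, defined $pass ? $pass : '' );
    }
    return ( $request, $uri->as_string );
}
```

The credentials would still have to be stored somewhere on the server side; this only keeps them out of the cookie.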
Mark put up some tests today. Sadly there are only Atom-based feeds, so I can't try to parse the data, but the status codes they return seem useful.