Today I've been thinking about what to talk about at YAPC::EU (and at OSCON, if they're short of Perl talks; I'm not sure), and ended up spending a few hours hacking on a web-content scraping module built around a Domain Specific Language.
With help from the guys on the IRC channel and from obra, who gave a nice talk about DSLs in Perl at YAPC::Asia, I whipped up a really small Web::Scraper module.
This is basically a Perl port of Ruby's scrapi toolkit, and its API is intended to be similar to the Ruby one. So you can write a script that parses Twitter's friend list and extracts the image URLs for your friends like this:
use URI;
use Web::Scraper;

my $nick = shift || "miyagawa";
my $uri  = URI->new("http://twitter.com/$nick");

my $twitter = scraper {
    process 'a[rel="contact"]', 'friends[]' => scraper {
        process 'a', url => '@href', name => '@title';
        process 'img', src => '@src';
    };
    result 'friends';
};

my $friends = $twitter->scrape($uri);

use YAML;
warn Dump $friends;
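To play with the same scraper without hitting the network, here is a minimal sketch that feeds it an inline HTML string instead of a URI. The markup, the names "Alice" and "Bob", and the variable names are all made up for illustration (this is not Twitter's real HTML), and it assumes a Web::Scraper whose scrape() accepts a plain HTML string as well as a URI.

```perl
use strict;
use warnings;
use Web::Scraper;

# Hypothetical markup, invented for this example; not Twitter's actual page.
my $html = <<'HTML';
<html><body>
<a rel="contact" href="/alice" title="Alice"><img src="/img/alice.png" /></a>
<a rel="contact" href="/bob" title="Bob"><img src="/img/bob.png" /></a>
</body></html>
HTML

# Same shape as the Twitter scraper: each matching link becomes
# one hash in the 'friends' array.
my $friends_scraper = scraper {
    process 'a[rel="contact"]', 'friends[]' => scraper {
        process 'a', url => '@href', name => '@title';
        process 'img', src => '@src';
    };
    result 'friends';
};

# Assumption: scrape() is given the HTML string directly, not a URI.
my $friends = $friends_scraper->scrape($html);

# $friends should be an array ref of hashes with url, name and src keys.
for my $friend (@$friends) {
    print "$friend->{name}: $friend->{url} ($friend->{src})\n";
}
```

The 'friends[]' key is what makes the outer process collect every match into an array; without the brackets only one match would be kept under that key.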
Re:Where are obra's slides
miyagawa on 2007-05-11T07:30:35
The slides are linked from http://tokyo2007.yapcasia.org/wiki/?SlidesFromTalks, particularly: http://svn.jifty.org/svn/jifty.org/jifty/trunk/doc/talks/yapc.asia.2007.txt