Web Scrapings (Spider Poop?)

chaoticset on 2003-05-12T06:37:58

Feh upon myself: In an attempt to demonstrate my Perl-fu (only to myself, mind you) I tried to write a simple screen scraper in just a few hours, as step one to something a bit more ambitious (something that would randomly rearrange panels of a webcomic I read and generate "new" comics from those panels).

Of course, I've only ever written one screen-scraper before, one for use.perl. (Actually, it took a URL and a template ID and returned all instances of that ID on that URL. Indicating no ID returned all instances. It wasn't efficient, although it was a good way to start with LWP::Simple, IMHO.)
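
A minimal sketch of that kind of lookup. The post never says what a "template ID" looks like in the markup, so this assumes it is the value of an HTML id="..." attribute. The sub name instances_of and the sample HTML are made up for illustration. A regex over HTML is fragile, but it keeps the example short.

```perl
use strict;
use warnings;

# Assumption: an "ID" is the value of an id="..." attribute on a tag.
# Returns every opening tag carrying that id; with no id given,
# returns every opening tag that has any id at all.
sub instances_of {
    my ($html, $id) = @_;
    my $want = defined($id) ? quotemeta($id) : '[^"]*';
    my @hits;
    while ($html =~ /(<[^<>]*\bid="$want"[^<>]*>)/g) {
        push @hits, $1;
    }
    return @hits;
}

# With LWP::Simple the page would come from: my $html = get($url);
# A literal string stands in here so the sketch runs offline.
my $html = '<div id="story">a</div><p id="note">b</p><div id="story">c</div>';
print scalar(instances_of($html, 'story')), " matching tags\n";
print scalar(instances_of($html)), " tags with any id\n";
```

Swapping the literal string for get($url) is the only change needed to point it at a live page, assuming the page really does mark things up with id attributes.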

Tomorrow I'll toss another hour at it somewhere. It's close to working -- most of what's working against me isn't ignorance so much as inconsistent coding on the part of the webcomic author. (For part of the strip's run the images are in a subdir for images, for part of it they're just in the main directory; for part of the run wonky formatting prevailed, etc.) It's stuff that I should learn to work around anyway, though. It probably mirrors real-world conditions to a greater degree than I'd want to think. :(
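
One way to cope with the images moving around might be to build a list of candidate URLs and take the first one that exists. This is only a sketch under assumptions: the host, the "images" subdir name, the file name, and the subs candidate_urls and first_existing are all hypothetical, and the existence check is a code ref so the example runs without touching the network.

```perl
use strict;
use warnings;

# Hypothetical layout: some strips keep images in a subdirectory,
# others in the site root. An empty string means "no subdirectory".
sub candidate_urls {
    my ($base, $file, @subdirs) = @_;
    $base =~ s{/+$}{};    # drop trailing slashes so joins stay clean
    return map { $_ eq '' ? "$base/$file" : "$base/$_/$file" } @subdirs;
}

# Return the first candidate for which $exists->($url) is true,
# or nothing if none of them pan out.
sub first_existing {
    my ($exists, @urls) = @_;
    for my $url (@urls) {
        return $url if $exists->($url);
    }
    return;
}

# Stand-in for the real site: only the root-level image "exists".
my %fake_site = ('http://example.com/comic/20030512.gif' => 1);

my @tries = candidate_urls('http://example.com/comic/', '20030512.gif', 'images', '');
my $found = first_existing(sub { $fake_site{$_[0]} }, @tries);
print defined $found ? "found: $found\n" : "not found\n";
```

Against a live site, the checker could be sub { head($_[0]) } with LWP::Simple loaded, since head returns a true value in scalar context when the request succeeds.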