It took me far longer than I thought it would to come up with this code that grabs a web page and stuffs all the page's hyperlinks into a text file.
Updated...
use strict;
use warnings;
use WWW::Mechanize;

# usage: perl linkextractor.pl http://www.example.com/ > output.txt
my $url = shift or die "Usage: perl linkextractor.pl <url> > output.txt\n";

my $mech = WWW::Mechanize->new();
$mech->get($url);
die $mech->status(), " - URL request failed\n" unless $mech->success();
print $mech->status(), " OK - URL request succeeded\n";

# print the URL of every link on the page, one per line
my @links = $mech->links;
print STDOUT ($_->url, $/) foreach @links;
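One refinement worth noting: url returns each link exactly as it appears in the page, so relative links stay relative in output.txt. WWW::Mechanize::Link also provides url_abs, which resolves each link against the page it came from. A minimal sketch of that variant:

use strict;
use warnings;
use WWW::Mechanize;

# usage: perl linkextractor.pl http://www.example.com/ > output.txt
my $url = shift;
my $mech = WWW::Mechanize->new();
$mech->get($url);
die $mech->status(), " - URL request failed\n" unless $mech->success();

# url_abs() resolves each href against the page's base URL,
# so relative links come out as absolute URLs
print $_->url_abs, "\n" foreach $mech->links;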
No need to work too hard, after all:
linktractor -f=http://www.example.com > output.txt
Re:Try linktractor
scot on 2007-01-03T17:03:06
Thank you. Can you see any obvious snafus in the following code?
use strict;
use warnings;
use HTML::SimpleLinkExtor;
use WWW::Mechanize qw( );

# usage: linkextractor -f http://www.example.com/ > output.txt
my ($url) = @ARGV;

my $mech = WWW::Mechanize->new();
my $response = $mech->get($url);
$response->is_success()
    or die($response->status_line() . "\n");

my $extor = HTML::SimpleLinkExtor->new();
$extor->parse($response);

my @all_links = $extor->links;
foreach my $elem (@all_links) {
    print STDOUT;
}

Re:Try linktractor
brian_d_foy on 2007-01-03T17:21:46
I'm not really sure where to start with that or if you're serious, considering the code doesn't work.

Re:Try linktractor
scot on 2007-01-03T22:46:54
Please disregard that last post of mine. My apologies.
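The HTML::SimpleLinkExtor attempt above fails for two reasons: parse is handed the HTTP::Response object returned by get rather than the page's HTML, and the print inside the loop prints $_, which is never set because the loop iterates with $elem. A minimal corrected sketch along the same lines (it assumes decoded_content is available on the response; the variable names are just illustrative):

use strict;
use warnings;
use WWW::Mechanize;
use HTML::SimpleLinkExtor;

# usage: perl linkextractor.pl http://www.example.com/ > output.txt
my ($url) = @ARGV;

my $mech = WWW::Mechanize->new();
my $response = $mech->get($url);
$response->is_success()
    or die($response->status_line() . "\n");

# parse() wants the HTML text itself, not the HTTP::Response object
my $extor = HTML::SimpleLinkExtor->new();
$extor->parse($response->decoded_content);

# print each extracted link, one per line
foreach my $link ($extor->links) {
    print "$link\n";
}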