This prints a list of modules recently uploaded to CPAN, with no duplicates.
#!/usr/local/bin/perl5.8.0 -- # -*- perl -*-
use warnings;
use strict;
use LWP::Simple;
use HTML::TreeBuilder;

my $date = shift; # yyyymmdd
my $url = "http://search.cpan.org/recent/";
$url .= "?d=$date" if $date;

my $recent_html = get($url);
my $recent = HTML::TreeBuilder->new_from_content($recent_html);
my $links = $recent->extract_links;

my %modules;
foreach my $link (@$links)
{
    my $linkval = $link->[0];
    next unless $linkval =~ m{^/author/.};
    $linkval =~ m|/([^/]*)/$|;
    my $dist = $1;
    my @dist = split "-", $dist;
    pop @dist while @dist and $dist[-1] =~ m/^[.\d_]*$/;
    warn unless @dist;
    my $module = join "::", @dist;
    next if $modules{$module};
    print "$module\n";
    $modules{$module} = 1;
}
Update 10:20 A.M., CST: Underscores are allowed in version numbers.
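To make the effect of that update concrete, here is a minimal sketch of the name-conversion step on its own. The helper name dist_to_module and the sample distribution names are made up for illustration; the split, pop, and join logic is the same as in the loop above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a distribution directory name into a module name.
sub dist_to_module {
    my ($dist) = @_;
    my @parts = split /-/, $dist;
    # Drop trailing pieces made up only of digits, dots, and underscores,
    # so a developer version such as 1.02_01 is stripped as well.
    pop @parts while @parts and $parts[-1] =~ m/^[.\d_]*$/;
    return join "::", @parts;
}

# Made-up distribution names, for illustration only.
my %seen;
foreach my $dist (qw(Foo-Bar-1.02 Foo-Bar-1.02_01 Baz-0.5)) {
    my $module = dist_to_module($dist);
    next if $seen{$module}++;    # print each module name only once
    print "$module\n";
}
```

With the underscore in the character class, both Foo-Bar entries reduce to the same name, so the second is skipped as a duplicate.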
Re:the underscores are only
jdavidb on 2002-11-22T18:14:03
True, and good to point out. The code I have just uniq's module names without regard to version; others might prefer to have them removed from the list.
My purpose was to answer "What's happening on CPAN?", for which alpha modules are a valid answer. If someone uses this to automatically update modules for an NYSE system or something, they might get bad results.
;)
my $recent_html = get($url) || die "Can't get $url\n";
I'm glad to see folks are using new_from_content tho! Proof that it's as handy as I thought it'd be.
Re:another way of doing it
merlyn on 2002-11-23T00:44:46
/me bangs head on desk
If you are mirroring a local mini-CPAN with my program, you can also browse that mirror locally.