A couple of weeks ago I spent some time in London after YAPC::Europe. At an emergency london.pm social meeting we had a few drinks, and then Sky started handing out projects to those of us who had nothing to do.
I got assigned the fun task of cross-referencing CPAN!
Basically, it analyses source code: it finds what other packages the code uses or requires, what subroutines it declares, what functions and methods it calls, and what packages it defines. From this information a database of references is built.
In the web interface, you can browse a distribution, and if a file is indexed you can have a look at its source. In the source listing, each connection (a use, a call, etc.) is linked, and if you click that link you'll end up on the symbol page. This is where it gets interesting.
The symbol display page shows in which file(s) the symbol is defined as a package, declared as a subroutine, or called.
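To make the idea concrete, here is a rough sketch of that kind of scan. It is written in Python purely for illustration and is not the cross-referencer's actual code; the patterns, the function name `scan_perl_source` and the sample input are all assumptions. It only catches the simplest cases (one statement per line, arrow-style method calls).

```python
import re

# Hypothetical patterns; real Perl allows many forms these will miss.
NAME = r'[A-Za-z_]\w*(?:::\w+)*'
PACKAGE_RE = re.compile(r'^\s*package\s+(' + NAME + r')\s*;', re.M)
USE_RE = re.compile(r'^\s*(?:use|require)\s+(' + NAME + r')', re.M)
SUB_RE = re.compile(r'^\s*sub\s+([A-Za-z_]\w*)', re.M)
METHOD_RE = re.compile(r'->\s*([A-Za-z_]\w*)\s*\(')


def scan_perl_source(source):
    """Return the symbols found in a string of Perl source, grouped by kind."""
    return {
        "packages": PACKAGE_RE.findall(source),   # package Foo::Bar;
        "uses": USE_RE.findall(source),           # use Foo; / require Foo;
        "subs": SUB_RE.findall(source),           # sub name { ... }
        "method_calls": METHOD_RE.findall(source) # ->name(
    }


# Made-up sample input, just to exercise the patterns.
SAMPLE = """package My::Module;
use strict;
use File::Spec;
require Carp;

sub new {
    my $class = shift;
    return bless {}, $class;
}

sub path {
    my $self = shift;
    return File::Spec->catfile('a', 'b');
}
"""

if __name__ == "__main__":
    for kind, names in scan_perl_source(SAMPLE).items():
        print(kind, names)
```

Each list could then be stored as rows in the reference database, keyed by file name.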
You can also search for symbols, which is great if you know the name of a function but don't know in what distribution or module it is declared.
Anyway, doing source analysis of Perl code is extremely difficult, and it won't handle some of the special things one can do. I wish I had a better way to do it than with regular expressions. Only perl can parse Perl!
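A symbol search like this boils down to an index from symbol name to the places it appears. The following is a minimal sketch of that lookup, again in Python for illustration only; the record layout `(file, kind, symbol)`, the function names and the sample file names are assumptions, not the real database schema.

```python
from collections import defaultdict


def build_index(records):
    """Map each symbol name to the (file, kind) pairs where it appears."""
    index = defaultdict(list)
    for file_name, kind, symbol in records:
        index[symbol].append((file_name, kind))
    return index


def find_symbol(index, name):
    """Return every known occurrence of a symbol, sorted for stable output."""
    return sorted(index.get(name, []))


# Made-up records of the kind a source scan might produce.
RECORDS = [
    ("lib/My/Module.pm", "sub", "path"),
    ("lib/My/Module.pm", "package", "My::Module"),
    ("bin/tool.pl", "call", "path"),
]

if __name__ == "__main__":
    index = build_index(RECORDS)
    print(find_symbol(index, "path"))
```

Looking up a name that was never indexed simply returns an empty list.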
I plan to have a public demo site up today or later this week. The URL will be posted in my journal.
Hopefully, you guys will find this useful.
/Claes
Adam Kennedy has written PPI, which takes a good stab at parsing Perl with perl.
Re:Why use regexes when you can use ...
claes on 2003-08-18T15:51:04
The best parser/tokenizer would be perl's. I haven't tried PPI yet, but I sure will. However, I'll release the code for the cross-referencer (and the rest of the project) to CPAN, so I might leave this to the community =)
Re:Module::Info
claes on 2003-08-18T18:41:30
Module::Info can extract some information without loading the module, but to get what I need it has to load it, so it's unfortunately not an option.
/me imagines a machine with all CPAN modules installed!
Re:Module::Info
educated_foo on 2003-08-18T20:51:18
Hence the suggestion to piggyback this on top of CPANTS. To test a module, you have to have enough of an installation to make it run and function. So after running the tests (if they pass), CPANTS could automatically run a function to gather Module::Info data to send back with the results. I'm not sure how bad the politics of such a move would be, but it seems like a natural way to distribute an otherwise expensive computation.
/s