MiniCPAN aging data in

brian_d_foy on 2008-01-31T00:00:00

Jan Dubois suggested that I upload my MiniCPAN aging data to "Many Eyes", a nifty IBM data visualization project. I upload the data, you make pretty pictures of it and embed them in websites.

I've created the MiniCPAN aging data set and created a "Perl" topic hub. I don't think I need to do any more for you to play with the data. If you have your own data about Perl things, add it to the topic hub.

To make the pretty pictures, you need some Java applet fu in your browser. That doesn't work for me right now and I'm not going to worry about it at the moment. There is a feature to "share" a visualization by embedding some special HTML if you find a picture that you like.

Good luck :)


SIMILE

stu42j on 2008-01-31T15:25:07

SIMILE Exhibit is a similar project except it doesn't require Java (and probably doesn't have as many different visualizations).

Another data set

AndyArmstrong on 2008-02-01T02:09:06

I've added a dataset that's scraped from http://search.cpan.org/recent. You can see a line graph of it here.

Naturally it exhibits roughly the same curve as brian's data but with a slightly different shape and slightly more detail.

I think we can safely say the trend is "up" :)

Re:Another data set

brian_d_foy on 2008-02-01T02:28:10

Have you mentioned anywhere that you uploaded the CPAN Testers data? I haven't been able to reach the server for a bit, but it's in the Perl topic hub. I made some bubble charts of it. Now it looks like one of those colorblind tests. I think I have the next cover for The Perl Review :)

Soon I will import that Perl Jobs data too. I was a bit disappointed to not be able to find a way to compute virtual columns in the data or completely replace a data set with all new rows, but once I can get back onto the site I'll find their feedback address :)

Re:Another data set

RGiersig on 2008-02-01T09:09:26

nice hack! now do you think you can extract "new modules" vs. "module updates" from that data? that would be even more interesting... :-)

Re:Another data set

brian_d_foy on 2008-02-01T16:17:43

Yes, I want to look at first-time distributions too. That one is a little more tricky because I have to parse the file name (no big deal), and at the same time I want to collect data by author too. :)
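
The file name parsing mentioned above might look something like the following. This is only a rough sketch in Python, not brian's actual code: it assumes the usual CPAN layout of authors/id/X/XY/AUTHOR/Dist-Name-Version.tar.gz, the function names `parse_upload` and `first_releases` are made up for illustration, and the regex is deliberately simplistic (real-world file names have enough odd cases that a dedicated parser such as the Perl module CPAN::DistnameInfo is probably the better tool).

```python
import re
from collections import defaultdict

# Simplified assumption: Dist-Name, a dash, an optional "v", a version that
# starts with a digit, then a common archive suffix.
NAME_RE = re.compile(
    r"^(?P<dist>.+?)-v?(?P<version>\d[\w.]*?)\.(?:tar\.gz|tgz|tar\.bz2|zip)$"
)


def parse_upload(path):
    """Return (author, dist, version) for a CPAN-style path, or None."""
    parts = path.split("/")
    if len(parts) < 2:
        return None
    match = NAME_RE.match(parts[-1])
    if match is None:
        return None
    # The directory holding the file is taken to be the author's ID.
    return parts[-2], match.group("dist"), match.group("version")


def first_releases(uploads):
    """Group first-time distributions by author.

    uploads is an iterable of (date, path) pairs, oldest first.
    A distribution counts as "new" the first time its name is seen.
    """
    seen = set()
    by_author = defaultdict(list)
    for date, path in uploads:
        parsed = parse_upload(path)
        if parsed is None:
            continue  # not a recognisable distribution archive
        author, dist, _version = parsed
        if dist not in seen:
            seen.add(dist)
            by_author[author].append((date, dist))
    return dict(by_author)


# Hypothetical sample data, not taken from any real upload log.
sample = [
    ("2008-01-01", "authors/id/A/AU/AUTHORA/Some-Dist-0.01.tar.gz"),
    ("2008-01-05", "authors/id/A/AU/AUTHORA/Some-Dist-0.02.tar.gz"),
    ("2008-01-06", "authors/id/A/AU/AUTHORB/Other-Thing-v1.2.3.tar.gz"),
    ("2008-01-07", "authors/id/A/AU/AUTHORB/README"),
]
new_by_author = first_releases(sample)
```

Anything already in the `seen` set is an update rather than a new module, so the same loop could count both if that split is what you're after.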

Tracking the four horsemen

AndyArmstrong on 2008-02-01T11:26:41

I've just set up a little cron job that tracks the latest updates to the CPAN, Python Cheese Shop, RubyGems and PEAR (PHP). At some point in the future we'll be able to graph daily upload stats for all four.

It'd be nice to be able to go back in time. All available RubyGems are described in a YAML file which includes their release date - so that's easy. I couldn't find a source of historical data for PEAR or Cheese Shop. If anyone can suggest sources for that data I'll go and investigate.

I think really we need a dedicated site that tracks and graphs all this stuff, right? :)

Re:Tracking the four horsemen

AndyArmstrong on 2008-02-01T20:14:05

I managed to reconstruct histories for the other languages - although I'm not certain that some of the Cheese Shop figures aren't the result of double counting.

The resulting graph is here

Re:Tracking the four horsemen

draegtun on 2008-02-04T09:19:09

Very nice.

If you're able to put these figures into rolling months, the chart may provide pretty good trend information.

/13az/
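
One way to read "rolling months" is a trailing 30-day total for each day, which smooths out the daily spikes. Here is a minimal Python sketch under that assumption. The function name `rolling_totals`, the 30-day default window, and the sample numbers are all made up for illustration; none of it comes from Andy's data or scripts.

```python
from collections import deque
from datetime import date, timedelta


def rolling_totals(daily_counts, window_days=30):
    """Return {day: total uploads in the window_days ending on that day}.

    daily_counts maps datetime.date -> upload count. Days missing from
    the mapping are treated as zero uploads.
    """
    if not daily_counts:
        return {}
    day = min(daily_counts)
    last = max(daily_counts)
    window = deque()  # counts for the days currently inside the window
    running = 0
    totals = {}
    while day <= last:
        count = daily_counts.get(day, 0)
        window.append(count)
        running += count
        if len(window) > window_days:
            # Drop the day that just fell out of the window.
            running -= window.popleft()
        totals[day] = running
        day += timedelta(days=1)
    return totals


# Hypothetical daily upload counts, with a short window to keep it readable.
daily = {date(2008, 1, 1): 3, date(2008, 1, 2): 1, date(2008, 1, 4): 2}
smoothed = rolling_totals(daily, window_days=2)
```

The first `window_days - 1` entries cover a partial window, so they will read low; you may want to drop them before graphing.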