I want to see how much of the stuff in my MiniCPAN was uploaded when:
$ perl dir_sizes.pl /MINICPAN
----------------------------------------------------------------------------------------------------
Year |  Total |    Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec
----------------------------------------------------------------------------------------------------
1995 |  285KB |      -      -      -      -    2KB      -      -   93KB   16KB   12KB   44KB  115KB
1996 |    1MB |   25KB   75KB   26KB   49KB  800KB  173KB   11KB   95KB      -   44KB   53KB  335KB
1997 |    9MB |   48KB  157KB  127KB  237KB  236KB  274KB  248KB  244KB    7MB   71KB  220KB  521KB
1998 |    8MB |  224KB    2MB  493KB  554KB   36KB  631KB  968KB    1MB  327KB    1MB  896KB  193KB
1999 |    6MB |  325KB  287KB  276KB    1MB  299KB  255KB  641KB  670KB  884KB  425KB    1MB  398KB
2000 |    8MB |    1MB  419KB  855KB  630KB  286KB    1MB  532KB  662KB  544KB  466KB  487KB  889KB
2001 |   14MB |  951KB    2MB  890KB  616KB  847KB    1MB  888KB    1MB    1MB    1MB    1MB    1MB
2002 |   56MB |    1MB    6MB    3MB    3MB    3MB    3MB    1MB    3MB    8MB    3MB    1MB   15MB
2003 |   85MB |    2MB    2MB   12MB    2MB    2MB    4MB    7MB    6MB   14MB   13MB    3MB   13MB
2004 |   74MB |    4MB    4MB    3MB   10MB    4MB    3MB    3MB    7MB    6MB    3MB   16MB    5MB
2005 |   81MB |    4MB    3MB   13MB    7MB    4MB    6MB    4MB    4MB    7MB   11MB    7MB    4MB
2006 |  117MB |    7MB    3MB    4MB    8MB    8MB   10MB    5MB   12MB   10MB   21MB   20MB    4MB
2007 |  257MB |   18MB   14MB   24MB   10MB   19MB   15MB   20MB   20MB   18MB   22MB   28MB   43MB
2008 |   64MB |   64MB      -      -      -      -      -      -      -      -      -      -      -
----------------------------------------------------------------------------------------------------
Processed 14557 files in 3 seconds, 0.00021 secs/file
$VAR1 = {
          'zip' => 108,
          'tgz' => 237,
          'gz' => 14212
        };
I'm working on doing the same thing for BackPAN and CPAN, but have a couple of data collection issues to work out. I'm also working on doing the same thing for numbers of authors each month, numbers of new authors each month, and so on. When I get this worked out I'll release the code, but it's really just File::Find and some path processing and counting.
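Until the real code is out, here is a rough sketch of what "File::Find and some path processing and counting" might look like. It is not the actual dir_sizes.pl. It assumes the upload date can be read from each file's mtime, that only .gz, .tgz, and .zip files count, and that sizes are truncated to whole KB or MB in units of 1024. The names collect_sizes, human, and report are made up for illustration.

```perl
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;

use File::Find;

# Walk $dir and total archive sizes by the year and month of each file's
# mtime (assumption: mtime stands in for the upload date).
# Returns ( \%bytes, \%ext_count ), where $bytes{$year}[$month] is a byte
# total and $month runs from 0 (Jan) to 11 (Dec).
sub collect_sizes {
    my ($dir) = @_;
    my ( %bytes, %ext_count );

    find(
        {
            no_chdir => 1,
            wanted   => sub {
                my $path = $File::Find::name;
                return unless -f $path;

                # Assumption: only these extensions are distributions.
                my ($ext) = $path =~ /\.(tgz|gz|zip)\z/ or return;

                # "_" reuses the stat buffer filled by the -f test above.
                my ( $size, $mtime ) = ( stat _ )[ 7, 9 ];
                my ( $month, $year ) = ( gmtime $mtime )[ 4, 5 ];

                $bytes{ $year + 1900 }[$month] += $size;
                $ext_count{$ext}++;
            },
        },
        $dir
    );

    return ( \%bytes, \%ext_count );
}

# Turn a byte count into a short label such as "93KB" or "7MB".
# Truncates rather than rounds; an empty month becomes "-".
sub human {
    my ($n) = @_;
    return '-' unless $n;
    return int( $n / 1024**2 ) . 'MB' if $n >= 1024**2;
    return int( $n / 1024 ) . 'KB';
}

# Print one row per year, with a total and twelve monthly columns.
sub report {
    my ($dir) = @_;
    my ( $bytes, $ext_count ) = collect_sizes($dir);
    my $format = "%4s | %6s |" . ( ' %6s' x 12 ) . "\n";

    printf $format, 'Year', 'Total',
        qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

    for my $year ( sort { $a <=> $b } keys %$bytes ) {
        my @months = map { $bytes->{$year}[$_] // 0 } 0 .. 11;
        my $total  = 0;
        $total += $_ for @months;
        printf $format, $year, human($total), map { human($_) } @months;
    }

    print "$_: $ext_count->{$_}\n" for sort keys %$ext_count;
}

report( $ARGV[0] ) if @ARGV;
```

Run it as `perl sketch.pl /MINICPAN` (the script name is arbitrary). Keeping the walk in collect_sizes, separate from the printing, should make it easier to reuse the same pass for other counts, such as authors per month.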
Re:Surface plot?
brian_d_foy on 2008-01-30T20:25:06
Well, I also want to see yearly cycles, and look at slices like "December". People can make all sorts of graphs though.
Note that your graph is really "Age of distros in MiniCPAN", not CPAN Uploads. The only things there are the latest distros, which is why January 2008 has such a big spike. :)
Re:Surface plot?
jdavidb on 2008-01-30T21:56:26
Yeah; to me it's really cool that 8.2% of CPAN by size has been released or revised in the last month. And 38.8% of it within the last year. Most modules people are running are something that is new and fairly recently looked at, especially if it has a good test suite. :)
Of course, I'm really curious what the gigantic distribution is in September 1997 that hasn't been touched in the last ten years.
Re:Surface plot?
brian_d_foy on 2008-01-30T22:20:04
Don't get into the numbers too quickly. This is just MiniCPAN, so that excludes Perl distributions and so on. CPAN has a lot more current stuff than what shows up in MiniCPAN. It's not something to measure to tenths of a percent.