By processing an ls -lR of http://backpan.cpan.org/authors/, helpfully created by Elaine, I slapped together some interesting statistics about the growth of CPAN. These are counts of archive file uploads (i.e. a .tar.gz or .zip or whatever), including new versions of existing modules.
I'm just looking by file extension, so I've picked up lots of crap that may or may not be code.
These types counted: gz tar tgz zip bz2 lsm par bin hqx patch-gz tar-gz pat-gz sit Z lzh
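For anyone wanting to reproduce the counting, here is a minimal Python sketch. It is not the script used for the numbers in this post. It assumes the usual ls -l column layout (mode, links, owner, group, size, month, day, year-or-time, name). The function name count_uploads and the assumed_year parameter (used for recent entries that show a time instead of a year) are made up for illustration.

```python
from collections import Counter

# Extensions listed in the post, matched case-sensitively as written.
ARCHIVE_EXTS = {"gz", "tar", "tgz", "zip", "bz2", "lsm", "par", "bin", "hqx",
                "patch-gz", "tar-gz", "pat-gz", "sit", "Z", "lzh"}

MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def count_uploads(lines, assumed_year=2003):
    """Count archive files per (year, month) in ls -lR style lines."""
    counts = Counter()
    for line in lines:
        fields = line.strip().split(None, 8)
        # Keep regular files only: mode starts with '-' and all nine columns exist.
        if len(fields) < 9 or not fields[0].startswith("-"):
            continue
        month, year_or_time, name = fields[5], fields[7], fields[8]
        if month not in MONTHS:
            continue
        # The extension is whatever follows the last dot in the file name.
        ext = name.rsplit(".", 1)[-1] if "." in name else ""
        if ext not in ARCHIVE_EXTS:
            continue
        if ":" in year_or_time:
            # Recent files show HH:MM instead of a year (assumption: current year).
            year = assumed_year
        elif year_or_time.isdigit():
            year = int(year_or_time)
        else:
            continue
        counts[(year, MONTHS[month])] += 1
    return counts
```

Feeding it the lines of the listing, for example count_uploads(open("ls-lR")), gives a Counter keyed by (year, month). The file name "ls-lR" here is only a placeholder.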
It's interesting to note that according to Ziggy, almost 50% of the brand new modules on CPAN were uploaded in 2003. However, if you look at the uploads per month below, there's nothing like a corresponding upload increase in 2003. It's a nice, smooth, logarithmic increase. This means over time either we're creating new modules faster than we're maintaining the existing ones OR we just got a huge influx of new authors OR Ziggy's wrong.
And if you're wondering, the first upload to CPAN appears to have been http://backpan.cpan.org/authors/id/T/TO/TOMC/scripts/boxit.gz on August 21, 1991. However, Elaine tells me CPAN didn't exist then so the dates on some of the early entries may be wrong.
Total archives on CPAN by year
--------------------------
1991
1992
1993
1994
1995
1996 **
1997 *****
1998 **********
1999 ****************
2000 ***********************
2001 **********************************
2002 *************************************************
2003 ***************************************************************
31609 archives counted
Each * == 500 archives
Total archives on CPAN by month
----------------------------
1991 07 0
1991 08 1
1991 09 1
1991 10 1
1991 11 1
1991 12 1
1992 01 1
1992 02 2
1992 03 4
1992 04 4
1992 05 4
1992 06 5
1992 07 6
1992 08 6
1992 09 6
1992 10 6
1992 11 6
1992 12 6
1993 01 6
1993 02 6
1993 03 6
1993 04 6
1993 05 8
1993 06 8
1993 07 8
1993 08 9
1993 09 9
1993 10 9
1993 11 9
1993 12 9
1994 01 9
1994 02 9
1994 03 9
1994 04 10
1994 05 10
1994 06 10
1994 07 10
1994 08 39
1994 09 39
1994 10 42
1994 11 66
1994 12 66
1995 01 68
1995 02 68
1995 03 68
1995 04 72
1995 05 118
1995 06 119
1995 07 119
1995 08 170
1995 09 191
1995 10 240
1995 11 298
1995 12 363
1996 01 400
1996 02 425
1996 03 463
1996 04 504 *
1996 05 577 *
1996 06 652 *
1996 07 710 *
1996 08 768 *
1996 09 846 *
1996 10 954 *
1996 11 1028 **
1996 12 1120 **
1997 01 1249 **
1997 02 1365 **
1997 03 1500 ***
1997 04 1604 ***
1997 05 1717 ***
1997 06 1831 ***
1997 07 1934 ***
1997 08 2091 ****
1997 09 2243 ****
1997 10 2424 ****
1997 11 2538 *****
1997 12 2689 *****
1998 01 2890 *****
1998 02 3096 ******
1998 03 3313 ******
1998 04 3445 ******
1998 05 3586 *******
1998 06 3766 *******
1998 07 4166 ********
1998 08 4368 ********
1998 09 4591 *********
1998 10 4845 *********
1998 11 5087 **********
1998 12 5257 **********
1999 01 5446 **********
1999 02 5656 ***********
1999 03 5936 ***********
1999 04 6144 ************
1999 05 6312 ************
1999 06 6547 *************
1999 07 6793 *************
1999 08 7057 **************
1999 09 7360 **************
1999 10 7583 ***************
1999 11 7830 ***************
1999 12 8030 ****************
2000 01 8280 ****************
2000 02 8561 *****************
2000 03 8867 *****************
2000 04 9165 ******************
2000 05 9488 ******************
2000 06 9756 *******************
2000 07 10040 ********************
2000 08 10347 ********************
2000 09 10674 *********************
2000 10 10965 *********************
2000 11 11262 **********************
2000 12 11546 ***********************
2001 01 11910 ***********************
2001 02 12302 ************************
2001 03 12763 *************************
2001 04 13171 **************************
2001 05 13625 ***************************
2001 06 14162 ****************************
2001 07 14635 *****************************
2001 08 15132 ******************************
2001 09 15557 *******************************
2001 10 16122 ********************************
2001 11 16752 *********************************
2001 12 17228 **********************************
2002 01 17844 ***********************************
2002 02 18355 ************************************
2002 03 18965 *************************************
2002 04 19608 ***************************************
2002 05 20206 ****************************************
2002 06 20799 *****************************************
2002 07 21458 ******************************************
2002 08 22195 ********************************************
2002 09 22932 *********************************************
2002 10 23531 ***********************************************
2002 11 24187 ************************************************
2002 12 24791 *************************************************
2003 01 25568 ***************************************************
2003 02 26258 ****************************************************
2003 03 27153 ******************************************************
2003 04 27963 *******************************************************
2003 05 28795 *********************************************************
2003 06 29704 ***********************************************************
2003 07 30630 *************************************************************
2003 08 31539 ***************************************************************
31539 archives counted
Each * == 500 archives
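The bars appear to round down: 504 archives gets one star and 1120 gets two. A small Python sketch of that rendering, assuming floor division is what was used; star_bar and render_row are hypothetical names, not taken from the original script.

```python
def star_bar(count, per_star=500):
    """Return one '*' for every full per_star archives (rounds down)."""
    return "*" * (count // per_star)


def render_row(year, month, count, per_star=500):
    """Format one table line, e.g. '1996 04 504 *'."""
    return f"{year} {month:02d} {count} {star_bar(count, per_star)}".rstrip()
```

The per-month upload table would use per_star=15 instead of 500.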
Uploads to CPAN by month
-----------------------
1991 07 0
1991 08 1
1991 09 0
1991 10 0
1991 11 0
1991 12 0
1992 01 0
1992 02 1
1992 03 2
1992 04 0
1992 05 0
1992 06 1
1992 07 1
1992 08 0
1992 09 0
1992 10 0
1992 11 0
1992 12 0
1993 01 0
1993 02 0
1993 03 0
1993 04 0
1993 05 2
1993 06 0
1993 07 0
1993 08 1
1993 09 0
1993 10 0
1993 11 0
1993 12 0
1994 01 0
1994 02 0
1994 03 0
1994 04 1
1994 05 0
1994 06 0
1994 07 0
1994 08 29 *
1994 09 0
1994 10 3
1994 11 24 *
1994 12 0
1995 01 2
1995 02 0
1995 03 0
1995 04 4
1995 05 46 ***
1995 06 1
1995 07 0
1995 08 51 ***
1995 09 21 *
1995 10 49 ***
1995 11 58 ***
1995 12 65 ****
1996 01 37 **
1996 02 25 *
1996 03 38 **
1996 04 41 **
1996 05 73 ****
1996 06 75 *****
1996 07 58 ***
1996 08 58 ***
1996 09 78 *****
1996 10 108 *******
1996 11 74 ****
1996 12 92 ******
1997 01 129 ********
1997 02 116 *******
1997 03 135 *********
1997 04 104 ******
1997 05 113 *******
1997 06 114 *******
1997 07 103 ******
1997 08 157 **********
1997 09 152 **********
1997 10 181 ************
1997 11 114 *******
1997 12 151 **********
1998 01 201 *************
1998 02 206 *************
1998 03 217 **************
1998 04 132 ********
1998 05 141 *********
1998 06 180 ************
1998 07 400 **************************
1998 08 202 *************
1998 09 223 **************
1998 10 254 ****************
1998 11 242 ****************
1998 12 170 ***********
1999 01 189 ************
1999 02 210 **************
1999 03 280 ******************
1999 04 208 *************
1999 05 168 ***********
1999 06 235 ***************
1999 07 246 ****************
1999 08 264 *****************
1999 09 303 ********************
1999 10 223 **************
1999 11 247 ****************
1999 12 200 *************
2000 01 250 ****************
2000 02 281 ******************
2000 03 306 ********************
2000 04 298 *******************
2000 05 323 *********************
2000 06 268 *****************
2000 07 284 ******************
2000 08 307 ********************
2000 09 327 *********************
2000 10 291 *******************
2000 11 297 *******************
2000 12 284 ******************
2001 01 364 ************************
2001 02 392 **************************
2001 03 461 ******************************
2001 04 408 ***************************
2001 05 454 ******************************
2001 06 537 ***********************************
2001 07 473 *******************************
2001 08 497 *********************************
2001 09 425 ****************************
2001 10 565 *************************************
2001 11 630 ******************************************
2001 12 476 *******************************
2002 01 616 *****************************************
2002 02 511 **********************************
2002 03 610 ****************************************
2002 04 643 ******************************************
2002 05 598 ***************************************
2002 06 593 ***************************************
2002 07 659 *******************************************
2002 08 737 *************************************************
2002 09 737 *************************************************
2002 10 599 ***************************************
2002 11 656 *******************************************
2002 12 604 ****************************************
2003 01 777 ***************************************************
2003 02 690 **********************************************
2003 03 895 ***********************************************************
2003 04 810 ******************************************************
2003 05 832 *******************************************************
2003 06 909 ************************************************************
2003 07 926 *************************************************************
2003 08 909 ************************************************************
31539 archives counted
Each * == 15 archives
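The running totals in the "Total archives on CPAN by month" table are simply the cumulative sum of these monthly counts. A short Python sketch of that relationship, with running_totals as a hypothetical helper name, using the first nine months (1991 07 through 1992 03) from the table above:

```python
from itertools import accumulate


def running_totals(monthly_counts):
    """Turn per-month upload counts into cumulative totals."""
    return list(accumulate(monthly_counts))


# Uploads for 1991 07 .. 1992 03, copied from the table above.
first_months = [0, 1, 0, 0, 0, 0, 0, 1, 2]
print(running_totals(first_months))  # [0, 1, 1, 1, 1, 1, 1, 2, 4]
```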
First, we need to remember that there are lies, statistics and misinterpreted statistics.
It's interesting to note that according to Ziggy, almost 50% of the brand new modules on CPAN were uploaded in 2003.
The number I came up with is from analyzing the modules list. This includes some duplication (mod_perl is listed twice; there are three distributions of AcePerl at the top of the list) and some exclusions (perl-5.8.x.tar.gz is not listed, nor is Meta).
Analyzing the modules list cannot show the growth of CPAN, but can be used as a first approximation on the freshness of CPAN. I did not say that ~50% of the new modules on CPAN were created since January 2003. I did say that 44% (quoting from memory) of CPAN modules listed on the modules list were created or modified at least once since January 2003. The modules list cannot show the former. It can give an indication of the latter.
The modules list does not (in general) reflect prior distributions of current modules (either currently on CPAN or deleted by the author) that would be found by analyzing an ls-lR of CPAN or backpan. Groveling over an ls-lR dump of either will find things that aren't modules (and arguably should not be counted), while the modules list will find too little. I find the modules list to provide a good first approximation, no more, no less.
This means over time either we're creating new modules faster than we're maintaining the existing ones OR we just got a huge influx of new authors OR Ziggy's wrong.
OR you're misinterpreting the statement "almost 50% of current module distributions on CPAN have been created or modified since January 2003".
The difference is rather significant.
My intent was to see if Perl was stagnating, or if people still care about Perl. Using uploads to CPAN as a proxy, I found that ~85% of what's current on CPAN (as listed in the modules list) was created or modified after the Perl 6 announcement.
We can argue over methodologies and precise figures, but I assert that the intent is sound, and these figures serve as a good first approximation -- no more, no less.
The conclusion that I draw from these numbers is that as a community, we have not given up on Perl 5, nor have we stopped caring about Perl 5 since the Perl 6 announcement. I didn't expect anyone to infer that people stopped supporting old modules, or that the list of CPAN authors increased geometrically.
Re:CPAN Stats
schwern on 2003-10-20T04:27:44
OR you're misinterpreting the statement "almost 50% of current module distributions on CPAN have been created or modified since January 2003".
OR I got my information second hand and it was misquoted.
:) Sorry, entirely my mistake. Teach me not to check my sources. I hope you didn't mistake the tone of my writing to imply I was trying to one up your stats. The ls -lR stats and your module list stats make a hell of a lot more sense now.
Re:CPAN Stats
hfb on 2003-10-20T07:10:57
Do that computation again, but this time leave off the Acme namespace and use the ls-lR since the modules list is incomplete and still manually maintained.
The conclusion is obvious, if not terribly important, but statistics lie too often, especially around those who like pie charts. CPAN is the only thing that is doing well in the perl world right now...you don't need a bar graph to tell anyone that.
Re:CPAN Stats
ziggy on 2003-10-20T14:19:24
Do that computation again, but this time leave off the Acme namespace
That's a good idea. Thanks.
use the ls-lR since the modules list is incomplete and still manually maintained.
The ls-lR for either CPAN or BACKPAN doesn't suit my needs at the moment. Eventually, I want to do a more detailed analysis of BACKPAN, but not this week. Thanks for making that ls-lR.
The conclusion is obvious, if not terribly important, but statistics lie too often, especially around those who like pie charts. CPAN is the only thing that is doing well in the perl world right now...you don't need a bar graph to tell anyone that.
I'm glad you think the conclusion is obvious. My intent here is to separate the lies we tell ourselves from what's really going on. Parroting received wisdom ("CPAN is the only thing that is doing well in Perl", "You can do more with Perl because it is «more expressive»", "Perl 6 is the future of Perl") doesn't do anyone any favors.
Re:CPAN Stats
hfb on 2003-10-20T14:50:17
Parroting received wisdom ("CPAN is the only thing that is doing well in Perl", "You can do more with Perl because it is «more expressive»", "Perl 6 is the future of Perl") doesn't do anyone any favors.
Parroting? You say that like we don't pay attention to CPAN. We do, but we don't publish statistics just because we run the joint. Publishing inaccurate and misleading statistics doesn't do anyone any favours, most of all. Statistics are the tool with which people lie to others as well as to themselves. An example would be download statistics from across the mirrors which would make a lot of people happy and generate quite a few exciting numbers, but the numbers would lie since they wouldn't be the whole story, would they? Just the same for using the file timestamp to determine freshness on a module within a distribution... all it means is someone uploaded a distribution, not that they updated anything more than a typo.
A sociologist would be reluctant to make such conclusions or assumptions based on the results from shoving a few files into a script which generated lots of numbers. People aren't so easily qualified by the quantity of things.
Re:CPAN Stats
ziggy on 2003-10-20T15:26:15
You say that like we don't pay attention to CPAN.
No, I say that because a lot of pro-Perl sentiments are repeated endlessly without any critical analysis. If you actually read what I said, you'd see that I'm not singling out CPAN, nor am I accusing CPAN's maintainers of not paying attention.
I also never attributed any mystical properties to an uploaded distribution, so please don't say that I did. As I said before, the only thing an upload means is that someone created or updated a file on CPAN. Period. I never said that an analysis of the modules list or ls-lR could tell the whole story of Perl or CPAN on its own.
But if you really want to have a fight about lies, damned lies and statistics, be my guest. I really don't care how many CPAN modules can fit on the head of a pin, or how many sociologists could make a career debating CPAN statistics.
Re:CPAN Stats
hfb on 2003-10-20T21:18:32
No, but you were taking the conclusion and generating numbers rather casually to prop it up which isn't critical analysis either.
And I wouldn't call my earlier conclusion positive or 'pro-perl' either.
Re:CPAN Stats
ziggy on 2003-10-20T21:26:27
Whatever. Pick a fight if you must. I'm just trying to see what's going on.
Re:CPAN Stats
hfb on 2003-10-20T21:33:34
I'm not picking a fight, just saying you shouldn't take a statement and try to make it legit by mostly unrelated numbers with a weak methodology...
You always gotta get the last word in, don't you?
Re:CPAN Stats
schwern on 2003-10-21T13:27:23
No, I do. ;P
Re:CPAN Stats
hfb on 2003-10-21T19:26:13
Schwern, you hunky bit of manmeat, don't quit your day job for a career in komedee :)