When I look at what I would call the Perl library at the moment,
CPAN, I see a whole bunch of modules (more than 14,000 if I
remember my last count correctly). Some of them are OO, some are not,
and we have several different efforts running at the moment to improve
the situation, which is not bad but strange.
Thinking about CPAN-related projects, this is what comes to mind:
- The Phalanx Project
- CPANPLUS
- CPANTS
- The Core Modules
The problem with all of this is that we lack a real structure, something
I will call the Perl Standard Library (PSL) from now on.
See, we have all these beautiful modules, which all more or less work
with some version of Perl, some OS and some architecture. That is
nice, as most of them work for me on a Linux i386 box. For other
platforms it is hard to tell; on Windows, for example, compiling
something can be really hard.
The Phalanx Project chose 100 modules (as I understand it, to serve
more or less as the testing platform for Ponie) and is trying to turn
them into better modules(TM). CPANPLUS is an effort to make the
management of installed modules easier. CPANTS and testers.cpan.org
make it easier for module developers to see their modules tested on
hardware and versions they don't have. The Core Modules (and the bad
dual-life modules) make Perl pumpkining harder (I guess) and
improve the basic functionality that Perl has. The problem is that
some useful modules are not core (and maybe some useless modules
are in core :-)). The next thing that is important in connection with
module management is my OS vendor, or my distribution as it is called
in the Linux world. They ship a couple of modules, some straight from
CPAN and just built for my version of the OS, others patched because
a Dead Camel did not move and they needed the lib (Curses.pm was like
that for a long time; at least SuSE, RedHat and Debian had their own
patches because it didn't compile with 5.8.0). This all ends up in a big
blob. I can't really tell by heart which modules are installed on my
system, I'm not sure about every version number and I can't promise
that something I wrote works on every system. I don't have anything to
depend on when writing software; I don't have a library. In my
opinion a language is only as strong as its library, but the problem
with the library of my favorite language (guess which) is that it
is maintained by over 1000 people and has nothing you can
rely on.
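
To make the "what can I depend on?" question concrete, here is a minimal sketch of a check a program could run at startup. The list of wanted modules is a made-up example and `installed_version` is a hypothetical helper name; the sketch only uses `require` and the `VERSION` method that every package inherits.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the installed version of a module, or undef if it cannot be loaded.
sub installed_version {
    my ($module) = @_;
    # Only accept plain module names, because the name goes into a string eval.
    return unless $module =~ /\A\w+(?:::\w+)*\z/;
    return unless eval "require $module; 1";
    my $version = eval { $module->VERSION };
    return defined $version ? $version : 'unknown';
}

# Hypothetical list of modules a program would like to rely on.
my @wanted = qw(File::Spec Data::Dumper Curses);

for my $module (@wanted) {
    my $version = installed_version($module);
    printf "%-15s %s\n", $module, defined $version ? $version : 'not installed';
}
```

This only tells me what is on the box I run it on, which is exactly the complaint: the answer differs from system to system.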
What I want to say is that we should take a bunch of modules (as the
Phalanx Project does), put them together, make the set stable and mark
it as the PSL for 5.8.1. I don't have a list of modules I want to see in
there, but I think it would be an advantage for Perl if we took the
material we have, took the material others contribute (like the modules
which the distro vendors choose) and made something we could depend
on for eternity (or at least for as long as Perl exists). Once we have
bundled it we could make another effort to restructure it, as I don't
believe that the CPAN structure would fit forever. Then we can
improve documentation and make everything more homogeneous. One of the
other problems with CPAN is that there is no document I can think
of that says how to do what with which module. There are many books
out there that recommend many modules, but there is no single "Using
CPAN" book that covers all of CPAN.
To make myself clear again: what I would like is a PSL project
that provides a big, stable and maybe pure OO library for Perl,
version dependent, with easy-to-access documentation and with
obvious responsibilities.
This is just an idea, maybe I have to restructure some parts and think
over others.
Another thing that struck me lately is that I'm
missing modules that are easier to install. I installed
ID3Lib.pm today only to realize that it didn't compile because I had
no ID3Lib installed. This is rather obvious, but often it's hard to
tell which libraries a Perl module depends on. Maybe this is something
for the META files. What I really didn't like was that the
lib was just one line away: on my Debian it's just 'apt-get install
libid3'. I'd like to have something like CPANPLUS::Distro::Backend
that would manage such things for me. It's not very realistic to get
something like that for every system, but being able to tell the
developer "install lib foo" says more than just a missing include
file. It would take some kind of community effort, but maybe it is
possible.
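
A rough sketch of what such a backend might boil down to, purely as an illustration: a table that maps an external library to the package name per packaging tool, and a function that turns a missing library into a hint. Nothing like this exists under that name as far as I know; `%native_package` and `install_hint` are invented, and the only real entry is the Debian one mentioned above.

```perl
use strict;
use warnings;

# Hypothetical table: external C library => package name per packaging tool.
# Only the apt-get entry for libid3 comes from the text; a real table would
# have to be filled in by the community, distro by distro.
my %native_package = (
    id3lib => { 'apt-get' => 'libid3' },
);

# Build a human-readable hint for a missing external library.
sub install_hint {
    my ($library, $tool) = @_;
    my $per_tool = $native_package{$library} || {};
    my $package  = $per_tool->{$tool};
    return "Missing library '$library': no known package for $tool"
        unless defined $package;
    return "Missing library '$library': try '$tool install $package'";
}

print install_hint('id3lib', 'apt-get'), "\n";
```

The interesting part is not the code but the data: somebody has to maintain that table for every system.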
Enough revolutionary ideas for today :-)
You will want to be visiting the graves of those who have come before you to honor those who have fallen attempting to create anything standard out of Perl.
Any attempt to create a list of "standard" modules for Perl will fail. Why? What's "standard"? Presumably you'll want the standard library to cover the things standard Perl users try to do. Who is a "standard" Perl user? What do people typically use Perl for? The answer is, of course, everything. Everything from one-liners to hundred thousand line applications that control billions of dollars. So you have to package everything or risk alienating one group or another.
The other problem is backwards compatibility. Once a module is in the standard library, can it be removed? No, because it's a standard. Things change, rapidly. Look at the modules we ship with 5.6.0 and pick off how many are really used and how much is just historical cruft. Even CGI.pm probably wouldn't make it into the core if it was introduced today, the CGI fad having worn off. Any Technology Du Jour added to a standard library cannot be removed after it becomes passé. Standard libraries can only become larger. After a certain point, size becomes more important than tracking new, useful "standard" technologies and the standard lib starts falling behind. Look at the ANSI C standard library. No graphics. No web. Very little to do with the Internet at all. No regexes. No decent memory allocation.
Creating a big list of modules will never work. Here's what might.
"Best of breed" lists. A common use of CPAN is that a user comes along and says "I need a module to do X" and then either can't find one because it's not called what he/she expects or finds five of them and doesn't know which one to use. Developing answers to those common questions is a good way to solve a real problem. The trick is finding out what the questions are. Even relatively innocent attempts like, "What's the best XML parser on CPAN?" are fraught with peril as the perl-sdk folks found out. It's too simplistic a question since there are many different types of XML parsers and many different ways they're used. You have to avoid favoring one use style over another (DOM vs SAX) or one use case over another (CPU vs memory vs programmer efficiency) or weighing portability (a pure Perl module) vs speed (a libxml based module). And then there are those of us who don't care about XML.
Even best of breed lists fail because they're lists and inevitably favor one set of module attributes over another. Program speed vs portability. OO vs functional. Finding "the best" is meaningless in a zero-sum game. So an evolution of the best of breed list is module metadata. Extend the module description with all the information a person might need to make an intelligent decision about which module to use based on their requirements. And do it in a way that the user can easily translate their desires into a search query. I don't know how to do this.
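
For what it's worth, the lookup side could be as simple as the sketch below. This is only an illustration under heavy assumptions: the module names and attributes are invented, no such metadata exists on CPAN today, and the hard part (agreeing on the attributes and getting authors to fill them in) is not addressed at all.

```perl
use strict;
use warnings;

# Hypothetical metadata records; the module names and attributes are invented.
my @modules = (
    { name => 'XML::Example::Fast',  api => 'DOM', pure_perl => 0, style => 'OO' },
    { name => 'XML::Example::Plain', api => 'SAX', pure_perl => 1, style => 'OO' },
    { name => 'XML::Example::Tiny',  api => 'DOM', pure_perl => 1, style => 'functional' },
);

# Return the names of all modules whose metadata matches every requirement.
sub find_modules {
    my (%want) = @_;
    my @hits = grep {
        my $meta = $_;
        # Keep the record only if no requested attribute is missing or different.
        !grep { !defined $meta->{$_} || $meta->{$_} ne $want{$_} } keys %want;
    } @modules;
    return map { $_->{name} } @hits;
}

# A user who wants a DOM-style, pure Perl parser.
print join(', ', find_modules(api => 'DOM', pure_perl => 1)), "\n";
# prints "XML::Example::Tiny"
```

The query is a plain list of requirements instead of a ranking, so no attribute is favored over another; the user decides what matters.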
Finally, there is one list of standard modules that will work. Modules to get more modules. Modules that make working with CPAN easier. MakeMaker, Module::Build, CPANPLUS, Archive::Tar, Compress::Zlib, Test::More, etc... In the face of "Perl is used for everything" this is really the only list of core modules that makes sense. The only thing we know everyone is going to use Perl for is to get modules. This looks like it's becoming the new standard for core module inclusion.
PS There are about 5500 modules on CPAN, not 14000 (there might be 14000 including old versions, though).
Re:"They tried and failed?" "They tried and died!"
hfb on 2003-10-28T08:49:10
Actually scwhern, it's 5500 DISTRIBUTIONS which contain one or more modules.
There are the idiots who rip on Jarkko for being the worst pumpkin in the history of perl for pumping up the core with so many modules, and the others who think it's ok. The former are not so easily found since they tend to hide under rocks, but the drawback of having so many ways to do something, in core and on CPAN, is that you're screwed any way you try to do something because some asshole is going to think some other way, their way, is better. So, fuck them and the horse they rode in on...let them install their own damn modules.
An SDK would require incredible amounts of testing and packaging for binary distributions. At one point in time Kurt Starsinic was going to set up a test lab for such things but I think the whole idea just died due to lack of enthusiasm.
Also, you might find the CPAN FAQ, the one no one ever reads, helpful for figuring out how to find out what modules are on your system, etc.
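
One common way to list what is on your system is the core module ExtUtils::Installed; a minimal sketch follows. Note the assumption built in: it only sees modules that left a .packlist behind, so modules installed by an OS vendor's packages may not show up. `installed_modules` is just a helper name chosen for this example.

```perl
use strict;
use warnings;
use ExtUtils::Installed;

# Collect module name => version for everything ExtUtils::Installed can see.
sub installed_modules {
    my $installed = ExtUtils::Installed->new;
    my %version_of;
    for my $module ($installed->modules) {
        my $version = $installed->version($module);
        $version_of{$module} =
            (defined $version && length $version) ? $version : 'unknown';
    }
    return \%version_of;
}

my $version_of = installed_modules();
printf "%-30s %s\n", $_, $version_of->{$_} for sort keys %$version_of;
```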