The Perl Standard Library

marcus on 2003-10-27T22:13:04

When I look at what I would call the Perl library at the moment, the CPAN, I see a whole bunch of modules (more than 14,000, if I remember my last count correctly). Some of them are OO, some are not, and we have several different efforts running at the moment to make the situation, which is not bad but strange, better.



Thinking about CPAN-related projects, this is what comes to mind:
- The Phalanx Project
- CPANPLUS
- CPANTS
- The Core Modules

The problem with all this is that we lack a real structure, something I will call the Perl Standard Library (PSL) from now on.

See, we have all these beautiful modules, which all more or less work with some version of Perl, some OS and some architecture. That is nice, as most of them work for me on a Linux i386 box. For other platforms it is hard to tell, e.g. on Windows, where compiling something can be really hard.

The Phalanx project chose 100 modules (as I understand it, to serve more or less as the testing platform for Ponie) and tries to improve them into better modules(TM). CPANPLUS is an effort to make managing installed modules easier. CPANTS and testers.cpan.org make it easier for module developers to see their modules tested on hardware and versions they don't have. The Core Modules (and the bad dual-life modules) make Perl pumpkining harder (I guess) and improve the basic functionality that Perl has. The problem is that some useful modules are not in core (and maybe some useless modules are in core :-)).

The next important thing in connection with module management is my OS vendor, or my distribution as it is called in the Linux world. They ship a couple of modules, some straight from CPAN and just built for my version of the OS, some changed because a Dead Camel did not move and they needed the lib (as was the case with Curses.pm for a long time; at least SuSE, RedHat and Debian had their own patches because it didn't compile with 5.8.0).

This all ends up in a big BLOB. I can't tell by heart which modules are installed on my system, I'm not sure about every version number, and I can't promise that something I wrote works on every system. I have nothing to depend on when writing software; I have no library. In my opinion a language is only as strong as its library, but the problem with the library of my favorite language (guess what) is that it is maintained by over 1000 people and offers nothing you can rely on.

What I want to say is that we should take a bunch of modules (as the Phalanx Project does), put them together, make the bundle stable and mark it as the PSL for 5.8.1. I don't have a list of modules I want to see in there, but I think it would be an advantage for Perl if we took the material we have, took the material others contribute (like the modules the distro vendors choose) and made something we could depend on for eternity (or at least for as long as Perl exists). Once we have bundled it we could make another effort to restructure it, as I don't believe that the CPAN structure would fit forever. Then we can improve the documentation and make everything more homogeneous. One of the other problems with the CPAN is that there is no document I can think of that says how to do what with which module. There are many books out there that recommend many modules, but there is no single "Using CPAN" book that covers all of CPAN.

To make myself clear again: what I would like is a PSL project that provides a big, stable and maybe pure OO library for Perl, tied to a Perl version, with easy-to-access documentation and obvious responsibilities.

This is just an idea, maybe I have to restructure some parts and think over others.

Another thing that came up lately is that I'm missing easier-to-install modules. I installed ID3Lib.pm today, just to find out that it didn't compile because I had no ID3Lib installed. This is rather obvious, but often it's hard to tell which libraries a Perl module depends on. Maybe this is something for the META files. What I really didn't like was that the lib was just a line away: on my Debian it's just 'apt-get install libid3'. I'd like to have something like CPANPLUS::Distro::Backend that would manage such things for me. It's not very realistic to get something like that for every system, but if you can tell the developer "install lib foo", that is more helpful than just an error about a missing include file. It would be some kind of community effort, but maybe it is possible.
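To make the idea a bit more concrete, here is a minimal Python sketch of the kind of lookup such a backend could do. Everything in it is an assumption made for illustration: the table layout, the names `NATIVE_DEPS`, `INSTALL_COMMANDS` and `install_hint`, and the idea that this data would live in a community-maintained table. Only the ID3Lib / `apt-get install libid3` pairing comes from the experience described above; no such CPANPLUS backend exists.

```python
# Hypothetical table: Perl module -> native library it needs -> package name per distro.
# In practice this data might come from META files or a community-maintained list.
NATIVE_DEPS = {
    "ID3Lib": {
        "library": "ID3Lib",
        "packages": {"debian": "libid3"},
    },
}

# Hypothetical per-distro install command templates.
INSTALL_COMMANDS = {
    "debian": "apt-get install {package}",
}


def install_hint(module, distro):
    """Return a human-readable hint for a module's native dependency.

    Returns None when the module has no known native dependency.
    """
    dep = NATIVE_DEPS.get(module)
    if dep is None:
        return None

    package = dep["packages"].get(distro)
    template = INSTALL_COMMANDS.get(distro)
    if package is None or template is None:
        # We know which library is missing but not how this system packages it.
        return (
            f"{module} needs the native library {dep['library']}; "
            "install it with your system's package manager"
        )

    return (
        f"{module} needs the native library {dep['library']}: "
        + template.format(package=package)
    )


if __name__ == "__main__":
    print(install_hint("ID3Lib", "debian"))
```

Even when the distro is unknown, the fallback message still names the missing library, which is already more than a compiler error about a missing include file.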

Enough revolutionary ideas for today :-)




"They tried and failed?" "They tried and died!"

schwern on 2003-10-28T05:12:00

You will want to be visiting the graves of those who have come before you to honor those who have fallen attempting to create anything standard out of Perl.

Any attempt to create a list of "standard" modules for Perl will fail. Why? What's "standard"? Presumably you'll want the standard library to cover the things standard Perl users try to do. Who is a "standard" Perl user? What do people typically use Perl for? The answer is, of course, everything. Everything from one-liners to hundred thousand line applications that control billions of dollars. So you have to package everything or risk alienating one group or another.

The other problem is backwards compatibility. Once a module is in the standard library, can it be removed? No, because it's a standard. Things change, rapidly. Look at the modules we ship with 5.6.0 and pick off how many are really used and how much is just historical cruft. Even CGI.pm probably wouldn't make it into the core if it was introduced today, the CGI fad having worn off. Any Technology Du Jour added to a standard library cannot be removed after it becomes passe. Standard libraries can only become larger. After a certain point, size becomes more important than tracking new, useful "standard" technologies and the standard lib starts falling behind. Look at the ANSI C standard library. No graphics. No web. Very little to do with the Internet at all. No regexes. No decent memory allocation.

Creating a big list of modules will never work. Here's what might.

"Best of breed" lists. A common use of CPAN is a user comes along and says "I need a module to do X" and then either can't find one because it's not called what he/she expects or finds five of them and doesn't know which one to use. Developing answers to those common questions is a good way to solve a real problem. The trick is finding out what the questions are. Even relatively innocent attempts like, "What's the best XML parser on CPAN?" are fraught with peril as the perl-sdk folks found out. It's too simplistic a question since there are many different types of XML parsers and many different ways they're used. You have to avoid favoring one use style over another (DOM vs SAX) or one use case over another (CPU vs memory vs programmer efficiency) or weighing portability (a pure Perl module) vs speed (a libxml based module). And then there are those of us who don't care about XML. ;)

Even best of breed lists fail because they're lists and inevitably favor one set of module attributes over another. Program speed vs portability. OO vs functional. Finding "the best" is meaningless in a zero sum game. So an evolution of the best of breed list is module metadata. Extend the module description with all the information a person might need to make an intelligent decision of which module to use based on their requirements. And do it in a way that the user can easily translate their desires into a search query. I don't know how to do this.
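For what it's worth, one rough way to picture "metadata you can query" is the Python sketch below. It is only an illustration under loud assumptions: the attribute set (`task`, `style`, `pure_perl`, `interface`), the `ModuleMeta` record and the `search` function are all invented here, and the `Example::XML::*` names are placeholders, not real CPAN modules. The hard parts, agreeing on the attributes and getting authors to fill them in, are not addressed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleMeta:
    """Hypothetical extended module description."""
    name: str
    task: str        # what the module is for, e.g. "xml-parsing"
    style: str       # use style, e.g. "DOM" or "SAX"
    pure_perl: bool  # portability (pure Perl) vs speed (C library underneath)
    interface: str   # "OO" or "functional"


# Placeholder catalog; these are NOT real CPAN modules.
CATALOG = [
    ModuleMeta("Example::XML::Tree", "xml-parsing", "DOM", True, "OO"),
    ModuleMeta("Example::XML::Stream", "xml-parsing", "SAX", False, "OO"),
    ModuleMeta("Example::XML::Simple", "xml-parsing", "DOM", False, "functional"),
]


def search(catalog, **requirements):
    """Return names of modules whose metadata matches every requirement."""
    return [
        meta.name
        for meta in catalog
        if all(getattr(meta, key) == value for key, value in requirements.items())
    ]


if __name__ == "__main__":
    # "I need an XML parser that is pure Perl."
    print(search(CATALOG, task="xml-parsing", pure_perl=True))
    # "I want a DOM-style parser and don't care about the rest."
    print(search(CATALOG, style="DOM"))
```

The point of filtering instead of ranking is that nothing here declares a "best" module; the user states requirements and gets back whatever satisfies them.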

Finally, there is one list of standard modules that will work. Modules to get more modules. Modules that make working with CPAN easier. MakeMaker, Module::Build, CPANPLUS, Archive::Tar, Compress::Zlib, Test::More, etc... In the face of "Perl is used for everything" this is really the only list of core modules that makes sense. The only thing we know everyone is going to use Perl for is to get modules. This looks like it's becoming the new standard for core module inclusion.

PS There's about 5500 modules on CPAN, not 14000 (there might be 14000 including old versions, though).

Re:"They tried and failed?" "They tried and died!"

hfb on 2003-10-28T08:49:10

Actually schwern, it's 5500 DISTRIBUTIONS, each of which contains one or more modules.

there are two camps

hfb on 2003-10-28T09:00:44

The idiots who rip on Jarkko for being the worst pumpkin in the history of perl for pumping up the core with so many modules and the others who think it's ok. The former are not so easily found since they tend to hide under rocks, but the drawback of having so many ways to do something on CPAN is that you're screwed any way you try to do something because some asshole is going to think some other way, their way, is better. So, fuck them and the horse they rode in on...let them install their own damn modules.

An SDK would require incredible amounts of testing and packaging for binary distributions. At one point in time Kurt Starsinic was going to set up a test lab for such things but I think the whole idea just died due to lack of enthusiasm.

Also, you might find the CPAN FAQ, the one no one ever reads, helpful for figuring out how to find out what modules are on your system, etc.