The Real CPAN Limitation

Ovid on 2004-11-17T17:04:00

I'll shortly be leaving my current employer to go work for Kineticode. In the year and a half I've been working here, I've learned some valuable lessons. Perhaps the most important is my first exposure to what can truly be termed "enterprise-class Perl." This isn't my first exposure to coding on such a large scale, but COBOL is such a foreign world that it doesn't count (it was successful partially because of its limitations, not in spite of them.)

What did I learn about large-scale programming with Perl? I learned that I like Ruby. I am no longer convinced that Perl is the best dynamic language for such projects. You've heard the arguments before -- I mostly focus on the poor OO model -- but the strongest counter-argument that I hear is "the CPAN." That's a pretty strong counter-argument. Yet counterintuitively, it's far less persuasive for large-scale projects because those projects are to solve a crucial business problem and that problem is not a reimplementation of CGI.pm.

If I need a general purpose tool that solves a wide range of problems, Perl is an excellent choice, due in large part to the CPAN. However, the core of a large system, the value of a large system, is not in CPAN modules. If it were, everyone could just go out and download them and instantly make money. That is not how the real world works.

The real world is also poorly represented in OO classes as taught in college. (Bear with me as this is relevant to the above.) I remember taking my very first Java class. It was taught by an instructor who seemed to know the language, but was a bit hazy on OO theory. However, this was her very first term teaching and I'll cut her some slack, but she had the problem that I see in many classes. She used examples that don't translate to the real world.

This instructor taught OO design by modeling nouns. A Dog isa Mammal isa Animal. A Firetruck isa Truck isa Vehicle. A BadExample isa DangerousExample. Of course, nouns can still be instructive as teaching tools. Showing why Programmer and Manager roles should not be subclasses of Employee is a great example of OO theory. The reality, though, is that good objects reflect abstract business requirements and as your business is more specialized, your objects are more specialized. Those classes will not be found on the CPAN. They are what truly represents your business value.

This isn't to say that the CPAN is worthless. Modules on the CPAN tend to represent more abstract concepts than Vehicle::Truck::Firetruck and, regardless of the language you use, if there is a large body of prewritten code available and it suits your needs, it may be worth reusing it. However, many of the core features necessary are available in other languages. Today, if we were to rewrite our code from scratch, I think we would reach for Ruby.

Now here's my "ace in the hole" argument: Our Perl is not what you think. Sure, it's easy for me to sit here and say Perl's CPAN isn't the end all and be all when we use it so frequently. However, most (not all) things that we use from the CPAN are either readily available in other languages are are easily cobbled together. The heavy lifting -- those things that are the core value in our code -- has all been built in-house for our needs. Case in point:

# shown here with permission but still munged to hide details
my $dataset = Dataset->new(
    [ qw/rev state title_name xtns polled/ ],
    [ qw/state/ ],
    [ qw/rev/ ],
    $request
);

That little snippet builds a 30 line SQL statement that only shows data the user is allowed to see and creates a dataset that, with a one-line command in our templates, builds a large, paged, sortable report. We can effectively treat our entire database as if everything came from a single table. We just ask for the information we want and how to group and sort it. There is no need to specify the tables we are pulling from or how to join them together. There is also no reason why this needed to be written in Perl.

Of course, we do use an OORDBMS mapper. We wrote our own because the ones that existed when we created ours were in their infancy.

We also use a home-grown xUnit style test suite because Test::Class didn't exist when we wrote our version.

Despite our inventing or reinventing many wheels, we're destroying our only serious competitor in the department I work in. Why? They can't match our productivity. There are actually few companies in the world that can do the specialized data management that we do and that's in large part because of choosing a dynamic language, not because we used Perl.

This doesn't mean that Perl is a bad choice, either. Perl is fantastic, but if you want it to scale, you really need a stronger than average programmer and those can be tough to find. Just because Sally Admin or Joe Web Designer can cobble together a few scripts does not mean that they are qualified to build big systems. Most Perl programmers, in fact, are decidedly not qualified to work on those systems due to the dangers inherent in them. That's why Java has been so successful. Sally Admin may not be qualified to work on the Java system either, but, as she learns, she's less likely to wreak havoc on the system.

This, incidentally, is why I'm bullish on Perl 6. It appears to solve most of my Perl 5 problems. Tim Bunce has already mentioned that he plans to port DBI.pm to Perl 6, a Perl 6 CGI.pm won't be far behind (it might even be first) and I suspect that a few other core tools will quickly get ported. With those in place, and given that the real value in a business is the in-house code, if Perl 6 is stable and runs as fast as we suspect, maybe I won't be eyeing Ruby with such longing.


Why wait?

djberg96 on 2004-11-18T01:42:46

This, incidentally, is why I'm bullish on Perl 6. It appears to solve most of my Perl 5 problems. Tim Bunce has already mentioned that he plans to port DBI.pm to Perl 6, a Perl 6 CGI.pm won't be far behind (it might even be first) and I suspect that a few other core tools will quickly get ported. With those in place, and given that the real value in a business is the in-house code, if Perl 6 is stable and runs as fast as we suspect, maybe I won't be eyeing Ruby with such longing.

What are you expecting to get out of Perl 6 that you won't get from Ruby, other than a speed advantage? And even if Perl 6 comes out in a reasonable time frame (a big if), which syntax would you prefer? Besides, wouldn't it be nice to play with a decent thread, semaphore and mutex model *today*?

As for ORM's, I think the RAA is already ahead of CPAN in this department with ActiveRecord (part of the Ruby on Rails framework, which I recommend taking a look at), Kansas, Cerise, etc.

Last but not least, Rite (Ruby 2.0) is in the works. :)

Re:Why wait?

Ovid on 2004-11-18T04:09:28

I can certainly work with Ruby more, but only on a side basis. For day to day work, I'm still stuck with Perl. And still happy with Perl, to be fair. It's a great language.

The answer should be obvious

btilly on 2004-11-19T17:09:23

What are you expecting to get out of Perl 6 that you won't get from Ruby, other than a speed advantage?

An upgrade path. (Not that a speed advantage is a bad thing with the amount of traffic we have...)

Seriously where I work we have about 3 years of Perl code accumulated. Suppose that Ruby is 30% more productive than Perl. Then (by a naive calculation) if we switched, it would be 10 years before we had caught up and passed where we are now. That's not even counting the mess of going through a period with 2 versions of the same software. Making decisions with a 10 year time horizon to pay-off is risky, because a lot can change in that time.

Of course that is an unrealistic calculation. Because in 10 years both Rite and Perl 6 will come out. So let's assume that both Rite and Perl 6 are 50% more productive than Perl 5, that Ponie (Perl on Parrot) becomes usable in 2 years, Rite is released in 2 years, and Perl 6 comes out in 5 years. (It is supposed to be a year out, but I never believe those estimates.)

Now let's compare switching with remaining with Perl. Yes, I know how silly the numbers I'll produce are, but they capture something important. At year 0 we have 3 years of Perl development, vs 0 years of Ruby. In 2 years we'll have 5 years of Perl development vs 2 years of Ruby, which is like 2.6 of Perl. Then the Ruby track switches to Rite, and Perl to Ponie (no development improvement. 3 years later, 5 years from now, we have 8 years of Perl development vs another 4.5 equivalent on the Ruby track, add that to your 2.6 and we have 7.1 equivalent years on Ruby. Now the Perl folks switch to Perl 6 and Perl maintains about a year headstart. Plus with Perl there never was significant business disruption.

What these made-up numbers capture is the logic which causes existing projects stick with an alternative even if they think that that alternative is inferior. Even if the other technology is superior in the long run, the difference has to be phenomenal for you not to get killed in the short run.

And even if Perl 6 comes out in a reasonable time frame (a big if), which syntax would you prefer?

Perl 6 is now being estimated at a year away. I don't believe those estimates. But as the above indicates, even a promise that there will be a good alternative with a smooth upgrade path 5 years out is enough to clinch sticking with Perl if you have enough of an installed base.

Of course if you are starting from scratch, then choosing to go with Perl because of a promise of Perl 6 later would be silly.

Besides, wouldn't it be nice to play with a decent thread, semaphore and mutex model *today*?

This is your worst argument. Ruby does not have a decent threading model. The only thing that might make me consider going multi-threading is to continue processing when doing a slow blocking operation like a database call. Ruby implements cooperative multi-threading internally, and so doesn't offer that.

Cheers,
Ben