Moose, Padre, and why the burden of proof is on the cultists

Alias on 2009-09-03T16:24:17

It appears I accidentally kicked off the next round of the Moose performance flamewars. And this time it's dragged Padre in as well.

It's funny how in any significantly large battle of words, it's the zealots on the outside that make the most noise. The people who actually run the projects involved and have the most knowledge about the situation are generally the most calm about it.

Stevan did bring up one good point I hadn't noticed before though, and that is that Moose is rather "OH MY GOD MY RAM!!!" large in size.

Last time I looked at Moose, it was more like 3-4meg in size. That's fairly big, but falls into the same size range as CGI.pm, CSS.pm, DateTime, Template Toolkit, and a number of other packages that comprehensively solve a fairly complex topic (and mostly avoid getting too bloated). These are also the kinds of modules that make the best targets for ::Tiny modules. But I digress.

Moose has changed over time, mostly improving, and in startup terms at least keeping to a reasonable budget as it grows in size.

The last time I took more than a cursory look at it, in November 2007, it was bloaty, slowish, slow to start, but very correct and convenient for writing big persistent applications if you didn't mind the speed hit.

As mst asks me to remind people, my previous assessment is now wrong. Moose isn't slow, although the rest are still true. But the Moose team have got their priorities absolutely in the right place.

You absolutely have to prioritise important improvements for your core demographic first. And since the core demographic for Moose is big, complex, persistent applications, it absolutely makes sense to get Moose fast first.

If I rerun my old Object::Tiny vs Moose::Tiny benchmarks, I get the following for a simple 5-accessor class.



# Object::Tiny vs Object::Tiny::XS vs Moose::Tiny
# Note that I had to do __PACKAGE__->meta->make_immutable to
# the Moose::Tiny class, because it doesn't do so itself.

# Based on the following Object::Tiny class.
use Object::Tiny qw{
    one
    two
    three
    four
    five
};

D:\DOCUME~1\ADAM~1.KEN\Desktop>perl moosemark.pl
                Rate moose_all    xs_all tiny_all moose_none tiny_none   xs_none
moose_all    82795/s        --      -54%     -54%       -72%      -90%      -92%
xs_all      178763/s      116%        --      -1%       -39%      -79%      -83%
tiny_all    179759/s      117%        1%       --       -39%      -79%      -83%
moose_none  294985/s      256%       65%      64%         --      -65%      -72%
tiny_none   842460/s      918%      371%     369%       186%        --      -21%
xs_none    1066098/s     1188%      496%     493%       261%       27%        --
            (warning: too few iterations for a reliable count)
               Rate moose_get  tiny_get    xs_get
moose_get 1730104/s        --      -11%      -32%
tiny_get  1937984/s       12%        --      -24%
xs_get    2557545/s       48%       32%        --

The first set are constructors (the "none" set having no params to the new constructor, and the "all" case having all of them) and the second set are accessors.
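For anyone wanting to reproduce the shape of this comparison without installing anything, here is a minimal sketch using only the core Benchmark module. It is not the moosemark.pl script that produced the numbers above: My::Plain is a hypothetical hand-rolled stand-in for a 5-accessor class, and the iteration count is an arbitrary choice to keep the run short. To compare real implementations, add equivalent entries for Object::Tiny, Object::Tiny::XS or an immutable Moose class.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Hypothetical stand-in for a simple 5-accessor class: a blessed hash
# with closure-generated read-only accessors. Not Object::Tiny itself.
package My::Plain;

my @FIELDS = qw(one two three four five);

sub new {
    my $class = shift;
    return bless { @_ }, $class;    # no parameter checking at all
}

for my $field (@FIELDS) {
    no strict 'refs';
    *{"My::Plain::$field"} = sub { $_[0]->{$field} };
}

package main;

my %args = map { $_ => 1 } @FIELDS;
my $obj  = My::Plain->new(%args);

# "none" = constructor with no params, "all" = every param supplied,
# "get"  = a single accessor call. A fixed count keeps this quick;
# a negative count (minimum CPU seconds) gives steadier numbers.
my $results = timethese(200_000, {
    plain_none => sub { My::Plain->new },
    plain_all  => sub { My::Plain->new(%args) },
    plain_get  => sub { $obj->one },
}, 'none');

# Print the same kind of rate/percentage table shown above.
cmpthese($results);
```

Unlike the output above, this puts constructors and the accessor in a single table. Split them into two timethese calls if the percentages between unrelated operations get in the way.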

The speed is still slow, compared to a trivial hash constructor that doesn't do any param checking, but given that Moose::Tiny is doing work here that Object::Tiny isn't, clearly Moose is fast enough for regular use.

Of course, it's not as fast as the limit-pushing Class::XSAccessor-based Object::Tiny::XS, but it has constructors that are quite fast enough, and accessors (or at least, readonly ones) that are faster than raw native Perl accessors.

Of course, since Padre already uses Class::XSAccessor extensively, converting to Moose does mean that Padre gets slower (in absolute terms anyways). But it's really not that big a difference. It's enough to take any "Moose would be faster" argument off the table, but not much more than that.

But it's clear from these numbers that if you have a large persistent application, especially one in corporate-land where problems with people cheating and doing weird stuff are endemic and the extra strictness helps a lot, switching to Moose is almost a given at this point (unless you have some weird needs).

And indeed, I have that kind of thing at work. And indeed I have started the process of (sloooooowly) switching us over to Moose for our main 250k SLOC codebase. I even gave a quick tutorial a few weeks ago for the development team.

So I hope this clarifies that I'm not anti-Moose. I'm just pro-"the right tool for the right job".

Which brings us to Padre. Padre is not (yet) a big bloated application.

Maybe it's inevitable that it's going to eventually become one. But if we're going to build a great IDE with lots of features then it's quite likely that it is going to trend in that direction.

That doesn't mean you EMBRACE being big, bloated, and slow. Nobody WANTS to be a multi-minute startup monstrosity like Emacs. It's not some rite of passage for IDEs. It's something you fight.

If you know you are going to have bloating tendencies, you work HARDER to fight the bloat everywhere you can. Every time you add more of the weight that you know is ultimately your downfall, it's a judgement call. Is the functionality REALLY worth it? Is it something that every user will use every day, or something a tiny percentage of people will use once a year?

Is the weight something you can delay loading until runtime?
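As a rough illustration of what delaying the load means in Perl terms, the sketch below swaps a compile-time "use" for a runtime "require" inside the one function that needs the module. The function name is hypothetical and Text::Wrap is only a core-module stand-in for a genuinely heavy dependency. Nothing here is taken from Padre's code.

```perl
use strict;
use warnings;

# Hypothetical feature that needs an extra module only when it is used.
# Text::Wrap (core) stands in for a real heavyweight dependency.
sub wrap_paragraph {
    my ($text) = @_;
    require Text::Wrap;    # compiled on the first call, a no-op afterwards
    return Text::Wrap::wrap('', '', $text);
}

# %INC records every module file that has been loaded so far.
my $loaded_before = exists $INC{'Text/Wrap.pm'} ? 1 : 0;
my $wrapped       = wrap_paragraph(
    'Padre only pays for this module once someone asks for it. ' x 3
);
my $loaded_after  = exists $INC{'Text/Wrap.pm'} ? 1 : 0;

print "loaded before call: $loaded_before\n";
print "loaded after call:  $loaded_after\n";
```

The trade-off is that the cost moves rather than disappears: the first user of the feature pays it, and in a threaded program only threads spawned before that point stay light.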

In the case of Moose, yes we have a 0.3 second load penalty. And yes I like to shave time and RAM out of the default startup. But let's ignore that, because there's a bigger problem.

Moose would need to be loaded early, which is a problem. Especially with threading, which we use a lot.

Padre has a pre-threading base load cost (with no features and no open documents) of 35meg, and that jumps to 64meg once the task manager starts spawning off its normal couple of threads. When we weren't paying attention and just adding features, about 10 releases ago, those costs were more like 55/85meg. By aggressively hunting down wasteful dependencies, and disabling code for dep-heavy features we weren't using yet, we managed to remove 25 dependencies and reduce that 55/85meg to 35/60meg.

If I do NOTHING else other than add "use Moose;" to Padre.pm, that jumps from 35/60meg to 40/78meg. And what do I get for that? Absolutely nothing, except a 0.3 second slower startup, a 0.3 second slower single-instance server spawn, and a 0.2ish second slower thread-spawning speed.
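A crude way to check the startup-time half of a claim like this on your own machine is to time a runtime require. The sketch below is an assumption about how one might do it, not how the figures above were measured. load_cost is a hypothetical helper, it reports wall-clock time only (RAM needs platform-specific tools), and the result is only meaningful in a fresh process where the module has not been loaded yet.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Time a runtime require of one module, in seconds.
# Returns undef if the module cannot be loaded at all.
sub load_cost {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    my $start = time;
    my $ok    = eval { require $file; 1 };
    my $secs  = time - $start;
    return $ok ? $secs : undef;
}

# File::Temp is core; Moose may or may not be installed here.
for my $module (qw(File::Temp Moose)) {
    my $secs = load_cost($module);
    if (defined $secs) {
        printf "%-12s %.3f s\n", $module, $secs;
    }
    else {
        print "$module is not installed here\n";
    }
}
```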

If I actually start to use Moose for real, replacing the 180 classes (and growing) with Moose code, that cost is almost certainly going to grow both in startup time and memory cost. And as a bonus, because we would be ditching Class::XSAccessor in the bargain, Padre would get non-trivially slower.

Now, this doesn't mean it's not worth doing necessarily.

But it means that up front, switching to Moose adds absolutely nothing of value to Padre, and comes with a high cost. Performance is what economists (and MIT algorithms professors) call a "fungible" good (like money and computation) because it's universally exchangeable.

Every 5meg you save in dependency reduction and load-delaying is 5meg you have to spend on features and usability and eye candy later. Do you think those pretty splash screens come for free? :) The fact that YOUR desktop monster at work has 3 gig of RAM and only cost $1000 is only relevant if we want Padre to be something that can ONLY be run on something that large.

A number of people have also commented that perhaps if Padre moved to Moose, all these Moose folks would magically parachute into Padre and adopt it and start hacking on it, and we'd leech off their hype and their energy, and hey the train is moving and we'd better be on it.

All of this is entirely speculative. I've yet to hear anyone say that they'd just LOVE to hack on Padre, but because it doesn't use Moose they aren't going to yet (although after this post I'm sure I will in the comments). What I do hear is that people hate using the Wx APIs, and that they are horribly documented. And that is a whole different post :)

Most of the main reasons to move to Moose seem to be that it's easier to maintain, and it's just SO damned trendy right now. Everybody else is doing it, it MUST be good.

I hear that kind of talk, and I smell a cargo cult forming.

With clear evidence of the Moose downsides (for Padre), and no clear evidence of the upsides (again, for Padre), what this means is that the burden of proof is on the people that want the switch.

It's not going to be enough to just lobby for the change, they are going to need to actually take a branch, convert some non-trivial percentage of Padre's codebase (ideally the ones that are used in that default base load) and then demonstrate proof that the costs aren't as bad as we thought, and that the maintenance is SO greatly improved that it's worthwhile spending that huge amount of RAM.

If new ideas and new modules are truly going to be great, then just like in science you need to take the OPPOSITE approach to the New Shiny. You need to look at the New Shiny suspiciously and critically, and you find fault and pick holes in the idea until it does one of two things.

1. The idea collapses because of hidden flaws that would have hurt a lot more to discover later (as we are discovering with META.yml).

2. The attention to the flaws in the otherwise good idea attracts the mental and actual effort to those problems early, so they can be repaired or refactored away.

If you are lucky, and get the latter solution, then at the end of the day what you have isn't New Shiny at all. It's something that is just so obviously The Thing You Use To Do That, that the whole thing becomes a non-issue.

Moose may well eventually get to that point, it does have that smell of inevitability to it (much like git does). But clearly it's not there yet, it still has some big issues to address in order to expand its range of uses out from big persistent things to things that aren't big persistent things.

As an aside, one of the funniest arguments I seem to hit when I complain that Moose isn't useful for simple command line scripts is to use App::Persistent.

Which is to say, I solve the problem of Moose mainly being ideal for big persistent applications, by turning my tiny command line script into a big persistent application :)

A second aside, is that while we don't use Moose in the main Padre distro (and we recommend against it for plugins) we DO use it for the Perl 6 plugin. Because in THAT case, it's worth it, because it means we get to have STD.pm.

The important thing here

hobbs on 2009-09-03T20:12:38

is that it doesn't really matter what you think. A considerable portion of the rest of the world will end up using Moose irrespective of your opinion of it. Will you accept it in your dependencies and spend your time working on something useful, or will you dedicate the remainder of your life to shaving Yak::Tiny?

Just STFU

brian_d_foy on 2009-09-03T21:34:25

A considerable portion? Really? Most people I run into using Perl don't even know there's a CPAN.
And, isn't this the same tired argument about why we should be using whatever the hot new technology is? I've lost count how many times I've been told this about some Perl module or framework. Indeed, if that's really what you believe, why do you even use Perl? You should be using Java, since even more of the world uses that. And Windows too. And ..., and ..., and ...
Why do you care what Adam decides to use or not, or how he spends his time? At least he has a coherent argument. You merely whine like a fanboi without responding to any of the actual issues Adam raises and that Moose people acknowledge. Moose might be a great project, but the fucking fanbois like you are enough to keep me away (despite how cool the core Moose people are).
The important thing here is that Adam is making an actual economic analysis of what he gives up for what he gets, and making a reasoned decision based on it. His economic inputs might be different from yours, but that doesn't mean he's wrong (or that you are right).

Zealotry is bad Mmm'kay?

perigrin on 2009-09-04T02:05:24

You merely whine like a fanboi without responding to any of the actual issues Adam raises and that Moose people acknowledge. Moose might be a great project, but the fucking fanbois like you are enough to keep me away (despite how cool the core Moose people are).
This is just sadly ironic. You're saying you are letting the community decide for you what you should and shouldn't explore technically without taking the project itself into account. We have all had our fanboi moments. But, as a Moose developer, I will freely admit that telling someone "it doesn't matter what you think" isn't helpful, even when people like stevan or I do it in jest.
It seems however that this whole thread has been a series of people telling other people that what they can and cannot do in a public forum. You shouldn't let overzealous comments keep you away from what I consider to be a wonderful project, just as Dave shouldn't let overzealous downstream developers decide what he can and cannot do for his own modules, and Adam and the Padre team should not let overzealous developers tell them how to run their project.

Re:Zealotry is bad Mmm'kay?

brian_d_foy on 2009-09-04T05:56:33

I think you don't understand irony. I'm not letting anyone choose for me. I'm big enough to make my own decisions.

Re:Zealotry is bad Mmm'kay?

perigrin on 2009-09-06T06:30:10

Irony (from the Ancient Greek εἰρωνεία eirōneía, meaning hypocrisy, deception, or feigned ignorance) is a literary or rhetorical device, in which there is an incongruity or discordance between what one says or does and what one means or what is generally understood.
The incongruity between what you said:
the fucking fanbois like you are enough to keep me away
and what you mean
I'm not letting anyone choose for me. I'm big enough to make my own decisions.
Seemed ironic. Especially in the context of a thread kicked off by someone (Adam) telling someone else (Dave) what to do, and in response to someone (Hobbs) telling someone else that their opinion won't matter in the long run (Adam).
You and I have talked about Moose in the past, and discussed its faults, I think you had valuable input. I'd hate to have people intentionally or unintentionally drive you away from Moose for reasons other than its technical merits. Which was the entire point of my comments before, there was no offense intended.

Re:Zealotry is bad Mmm'kay?

zby on 2009-09-04T08:04:53

"It seems however that this whole thread has been a series of people telling other people that what they can and cannot do in a public forum" - that is a good characterisation of the circumstances - but I think we need to go a bit deeper here and explain why it is now that people started to do that (http://en.wikipedia.org/wiki/5_Whys - anyone?). Yes - CPAN contributors should be free to choose any way they want to code their own modules - but on the other hand those that add these modules as dependencies to their code should be secure that a new version of these dependencies will not render their code unusable. This situation is pretty much the same as with API deprecations and calls for a similar measure - explicit declarations from the module authors.

Re:The important thing here

sigzero on 2009-09-04T00:05:53

I wonder if something like the Badger framework (Andy Wardley) would give most of what you want with Moose but with better numbers. I have never seen them compared. The benefit I see is that Badger doesn't have outside dependencies so would be easier to "ship" with Padre.
Just a thought...

Re:The important thing here

perigrin on 2009-09-04T01:54:00

Last I looked at it Badger had a totally different set of goals than Moose. Badger was the distillation of what Andy Wardley had been using for building Perl applications while Moose is specifically about building a good MOP based Object Framework for Perl.
And playing devil's advocate for a second, what is the difference between adding one upstream module, and adding 12 that install relatively cleanly (both according to deps.cpantesters.com have a > 90% chance of installing clean on 5.10.0)? In both cases you've handed control over to an external developer, so the horses are already out of the barn.

Re:The important thing here

chromatic on 2009-09-04T03:39:02

Modules you install for yourself are necessary and sufficient and good examples of code reuse.
Modules your dependencies install for you are bloat.
I'm not sure what happens if modules your dependencies need are modules 1) you already have installed for yourself or 2) modules you've written and distributed, but it's a good first approximation of a definition.

Re:The important thing here

Hercynium on 2009-09-04T22:48:02

Personally, I take a couple of things into account when I decide whether or not extra modules constitute "bloat".
First, how likely are these modules to fail to install on the target platform(s)? If they are going to make my code harder for the sysadmin (often myself) to move, port, and maintain in the future they may not be worth pulling in to my dependency chain.
Second, how does this module perform given how I will be using it in my code? How does it perform compared to its alternatives?
Third, do I need any of these dependencies for anything else in the app? Are they good enough that I would use them instead of *their* alternatives? That's usually not a problem for me, but I *do* think about it.
I'm the type of programmer who will spend an extra hour or two writing code several ways, then selecting the one that I believe best reflects my overall needs. Yes, I throw out *lots* of code. I don't care.
The CPAN already provides me with excellent tools to gauge how using dist/module will affect my program's portability and to some extent maintainability. CPANTS also lets me get at a glance some useful info on kwalitee and that factors in as well.
The only thing not already handed on a silver platter to Perl developers are benchmarks - and I don't expect that to change any time soon! (would be nice for CP6AN though *hint hint*)

Why use Moose...

jjn1056 on 2009-09-04T14:39:03

Well, I use Moose for pretty much everything so my opinion is suspect. Most of the Pro 'Why Moose' bits have already been said so I will just add one. My gut feeling is that Moose based projects will have a much easier time growing and bringing new developers onboard. That's due to the fact Moose gives you most of what you need to develop well and after it's been in the wild for a few years we have developed a set of best practices. So Moose projects are easier to jump into. Non Moose projects have to incorporate a pile of bits just to be productive, such as something to speed up making accessors, validating parameters, dealing with inheritance ugliness, etc. Usually people end up rolling their own half baked, poorly written solutions, which a newcomer has to figure out before she can even start contributing. With Moose I look at the code and very quickly I know what is what. There are far fewer surprises for me.

That's my opinion and I only have anecdotal proof of it. For example, Catalyst pre Moose was being used a lot but development and new releases seemed basically dead for like 18 months. Then, we are on Moose and it's been regular releases and updates. Sure, part of that has been fixing up some of the backcompat issues, but there has been a lot of new things added and there seems to be a lot more excitement about the platform. I expect similar things to happen when eventually DBIx::Class moves onto Moose.

So, my feeling is that choosing Moose as a core tech makes it a lot easier to grow and manage a development community. That's why I would choose it.