Corehackers Project: Thoughts on Process

kid51 on 2009-07-25T18:28:27

So I'm sitting in a conference hall at YAPC on either the second or third day of the conference. A veteran member of the Perl community makes a presentation in one of the short time slots about a new project in which people can help improve Perl and its ecosystem. The presenter defines a body of existing code which would benefit from study by fresh eyeballs. Such study would lead to refactoring and improvements in testing and documentation, making the code more useful and maintainable over the long run. Participants in this project would be seen as assisting the current authors/maintainers of the targeted codebases. Those authors/maintainers would retain final say as to what contributions from the new project would actually get applied. The presenter is one of the relative handful of people in the community who can actually inspire others to devote their free time to new open source development projects. I sign up.

I could be talking about the Perl 5 Corehackers project introduced by Chip Salzenberg at YAPC::10 in Pittsburgh last month -- but I'm not. More precisely, I'm not talking about the Corehackers project *yet*. I'm talking about the Phalanx Project.

The Phalanx Project was an idea first introduced by Andy Lester in 2003. In the Phalanx Project, groups of Perl developers would select frequently used -- but non-core -- CPAN distributions and, with a green light from the authors or current maintainers of those distributions, refactor them, write more tests for them and improve their documentation. At first, few people knew of the Phalanx Project, but Andy retooled it somewhat and presented it at YAPC::NA::2004 in Buffalo, which is where I first heard about it.

By the summer of 2004 I had been the lead organizer of Perl Seminar New York for four years, and our group had begun contributing to the Text-Template and HTML-Template distributions. We submitted our work to the authors of each distribution. Marc and I presented on the Phalanx Project at YAPC::NA::2005 in Toronto.

And then, silence. Despite repeated nudging, both by email and in person, we got no feedback on our work from the distribution authors -- for years. Finally, in 2007, quite a bit of our code was incorporated into one of the distributions and we got a commendation from the distribution's author. On one of the few occasions when I ever met that distribution's author in person, he commented, "Don't break it. It's my baby." (It was, I believe, his first CPAN distribution.) As for the other distribution: We never got feedback from its author and it appears none of our work was incorporated into the one CPAN release which that distribution has had in the past four years.

The members of the New York City part of the Phalanx Project learned a tremendous amount about careful preparation and testing of libraries in the process, but only part of our work actually made it out to the larger Perl community. My impression is that that was the outcome in the one other locality in which a group of people came together to phalanx a CPAN distribution. So the Phalanx Project was only modestly successful.

I tell this story now because there are many similarities between the Phalanx Project and the Perl 5 Corehackers Project. Fast forward from YAPC::NA::2004 at the University at Buffalo to YAPC::10 (NA::2009) at Carnegie Mellon University in Pittsburgh. For Andy Lester as inspiring presenter, see Chip Salzenberg. For target codebase, see Perl 5 core instead of prominent CPAN distributions. For CPAN distribution authors, substitute the Perl 5 Porters.

But the thrust of the two projects is remarkably similar: Gather both veterans and, especially, newcomers -- people who may previously have never thought themselves qualified to work on the targeted code base -- to hack on important parts of the Perl 5 ecosystem and generate patches that the 'owners' of the code -- those who have commit bits -- would, it is hoped, apply to the code base.

And there's the rub. You could work your butt off refactoring some part of the Perl 5 core distribution, have all the tests you need to demonstrate that you haven't broken anything and have written clear, helpful documentation -- and still not get any feedback from the people who have the final say on the work you've done. You don't have a commit bit.

So while I was very glad to hear Chip's presentation about the Corehackers Project -- and am even meeting with David Golden this week to discuss how I might participate -- I must confess to some trepidation about how the relationship between the Perl 5 Porters and the Corehackers Project will evolve. My guess is that those who choose to participate in the Corehackers will develop considerable psychological investment in their work. And it's evident from the recent controversies over the Perl 5.10 release process that the Perl 5 Porters have considerable psychological investment in their work and the ways they have evolved to conduct that work. I could even see Porters saying, "Why do we even need a Corehackers project? Nothing's keeping you from submitting patches to p5p today."

So the Corehackers Project, to be successful, must face some important questions, both on its own and in conjunction with the Perl 5 Porters. Among these questions, at least two occur to me right off the bat:

  • How will feedback be provided?
  • How will differences between corehackers and porters be resolved?


key difference -- more committers

dagolden on 2009-07-25T23:06:46

One key difference between the Phalanx project and Corehackers is that a module usually has a single author or at most a small coterie of co-maintainers. So getting contributions accepted requires getting their attention and support, and a busy or uninterested author can, intentionally or unintentionally, drop a contribution on the floor.

On the other hand, the Perl 5 core has over 20 people with a commit bit, so the odds of getting someone's attention and support are higher. I still think it's advisable to discuss a potential contribution in advance, but the risk that volunteer work will be wasted through neglect is much lower.

-- dagolden

"refactor them, write more tests for them..."

educated_foo on 2009-07-26T00:16:41

"... and improve their documentation."

In other words, some group decided to take someone's module, rearrange its code, and write some documentation for it, all without adding any features. It's hardly surprising that the author was reluctant to incorporate the changes. If "core hackers" accomplishes anything, it will be because it writes code that solves actual problems.

git, linux

ChrisDolan on 2009-07-26T02:52:27

One difference is that Perl is now in Git. To a small degree, you don't need a commit bit. Of course, nobody wants Perl to fork, but the work is less likely to languish if it's valuable. Plus, one gatekeeper (for a CPAN module) vs. a collaboration of gatekeepers (P5P) can make a difference.

Someday I hope to see Perl development (whether Perl 5 or Perl 6) become more like the Linux kernel development. There, it's impossible for Linus to be expert on everything (but it is amazing how close he gets!) so he delegates patch review to lieutenants who pre-approve work that Linus pulls, thereby letting him scale better.

If the corehackers project eventually organizes itself as a tree of developers with P5P at the root, then I can imagine it revolutionizing Perl development. And if you think Perl has compatibility issues, look at the Linux kernel by contrast!

Re:git, linux

Aristotle on 2009-09-08T00:50:36

I think the shared ownership is actually the key difference here. With modules, generally the original author retains an attachment to his code (unless they have so many modules that they can’t afford to care about any one of them in isolation, cf. Adam Kennedy). There is a resistance to accepting sweeping changes by an outsider.

With a project like the perl core, this is much less of an issue. It’s not anyone’s personal baby.

But that was the point of the Phalanx project

kid51 on 2009-07-26T03:19:44

"...without adding any features. It's hardly surprising that the author was reluctant to incorporate the changes."

But providing new features was not the point of the Phalanx Project. The point was to make the existing code better, thereby laying the basis for the authors to better add new features in the future. And that was what the module authors signed off on before we began our work.

The lesson I draw from that is that there has to be real buy-in on the part of the owners of code. That means that we should make sure the Perl 5 Porters are really on board with the Corehackers Project.

kid51

Is the phalanx project dead?

Mithaldu on 2009-07-26T18:56:19

I heard of it for the first time today, went to check out their irc channel and it's not even in use. If they're still alive, what's their main communication venue?

Re:Is the phalanx project dead?

kid51 on 2009-07-26T22:21:00

...went to check out their irc channel and it's not even in use.

I never knew we had an IRC channel! Our New York group -- which was probably the most active grouplet -- never used it.

Of course, there is absolutely nothing preventing you from embarking on the type of software improvement we were aiming for in the Phalanx project.

In the past few years, what I have done myself and have recommended others do as well is to identify CPAN distributions that look like they're not being actively maintained, contact the distro's current maintainer, and see if you can be designated as a co-maintainer. For some hints on how that works, see this talk I gave at YAPC::NA::2006 in Chicago. If you do that, I can give you pointers on how to proceed.

Re:Is the phalanx project dead?

Mithaldu on 2009-07-26T23:52:08

Thanks for the quick answer. :)

Well, there's two problems for me here. For one I live out in the boondocks, with the nearest PM group being Berlin, a full 300km away from me. For the other, I'm not exactly a people person, meaning that I'd rather spend time coding than trying to convince people to let me code for them.

Thus I was hoping that there was actually some kind of online organization associated with that, especially since refactoring code is actually one of my favourite activities.

Although, thinking about it, having some kind of contact person with CPAN expertise whom I can actually talk with in real time would probably help, as I actually do have a module I have my eye on. (Text::Iconv needs some changes to work on Windows.)

Re:Is the phalanx project dead?

Mithaldu on 2009-07-26T23:55:01

Ugh, apologies for the wall-of-text. Didn't know the software swallows linefeeds. (Was using basic mode.)

Re:Is the phalanx project dead?

kid51 on 2009-07-27T02:06:09

I live out in the boondocks, with the nearest PM group being Berlin, a full 300km away from me.

Then make it a point to include something like the German Perl Workshop in your schedule. (I'm assuming that's Berlin, Germany, rather than, say, Berlin, New Hampshire.)

For the other, I'm not exactly a people person, ...

... which describes most geeks. And which also explains why F2F opportunities ranging from local Perlmongers meetings to hackathons to workshops to YAPCs play such an important role in our community.

Thus I was hoping that there was actually some kind of online organization associated with that, especially since refactoring code is actually one of my favourite activities.

Well, if the Corehackers Project actually gets off the ground, that will fit the bill.

But note that I learned about both the Phalanx Project and the Corehackers Project not online but in face-to-face encounters with other Perl programmers. Even if 90% of the actual work of each such project was/will be conducted by hackers working alone in their homes, the energy of such projects is crucially dependent on periodic F2F contact with your peers.

Competition

zby on 2009-09-12T16:53:21

That Phalanx story is very interesting on its own. The outcome is compatible with human nature, and especially the nature of authors, but it is still interesting to see how strong these sentiments can be. But I think we have to find a way to break it if we plan to move forward with CPAN and its growing complexity. One idea is to use the growing competition between CPAN modules doing similar things. Surely programmers would be much more confident using modules that have been so scrutinized, so their user base should grow and feed back into the module.