Interesting insights from Software Estimation

btilly on 2007-06-11T07:54:17

Note: This is a letter that I wrote to Steve McConnell after I read his most recent book. I'm publicly sharing it because I learned something that may be of general interest from thinking about the topic, and I'd like to see more people read the book.

I finally got around to reading your most recent book. As always with your books, I found it fascinating. So fascinating that I'm writing you an open letter which I am going to publicly post elsewhere.

First of all, as I expected, you hit it out of the park. You've done a very good job on a difficult topic that our industry normally does a horrible job on. Very few of your readers have any clue how poorly they judge what is 90% likely. It is incredibly helpful to be conscious of how people confuse estimates, targets and commitments. You were absolutely right to back up your advice on how to create accurate estimates with advice on how to defend those estimates from organizational pressures to replace them with wishful thinking. And, as always, all of the advice is backed up by invaluable compiled (and meticulously referenced) data on everything from how uncertain the best possible estimates are at various stages in the software lifecycle to how wide the productivity variation is from company to company.

As with any book, it is not perfect. However, the overall quality is extremely high and the remaining imperfections are small. Furthermore, you took into account all of my criticisms for the one chapter that I reviewed. Since I had the opportunity to review the rest of the book and didn't, I feel that any oversights that I notice are more my fault than yours.

Needless to say, I highly recommend this book to everyone involved in the software development process. My main difficulty now is identifying who in my immediate environment I should lend it to first (i.e., who would create the biggest positive impact on the company I'm in).

As is often the case with your work, of even higher value is how close reading leads to or reinforces insights on other parts of software development. Sometimes this is presented in an understated way, such as the paragraph on page 64 that says, "...individual performance varies by a factor of 10 or more. Within any particular organization, however, your estimates probably won't need to account for that much variation because both top-tier and bottom-tier developers tend to migrate toward organizations that employ other people with similar skill levels." (A fact for which you then provide two references.) I laughed aloud at that one.

Sometimes the tangential gems are presented very directly. For a random example, on page 69 you point out that multi-site development increases needed effort by an average of 56%. As you say, this effect should be carefully considered by organizations considering outsourcing. And while most software professionals understand that this is a significant factor, very few of us can quantify it, which makes it hard for us to get businesses to take it seriously.

And sometimes the insights are not directly presented. They are just implicit in the copious data that you've presented, waiting to reward the careful reader who can spot them. I'd like to talk about one of those.

It has long been a mantra among people who like dynamic languages that developers are more productive in small groups, and so there is great value in delivering languages that make small groups as productive as possible. I cannot count how many times I have seen variations on this theme, nor can I count how many times I have personally repeated it. Supporting anecdotal evidence is easy to find. However, until I read your book, I'd never seen concrete quantitative evidence that I could quote to support what is common knowledge in some circles.

Well, I'd long known evidence for part of that assertion. Variations on the chart that you reproduce as table 5-3 on pages 64-65 have been circulating for ages. And while I agree with your conclusion that it is more productive to use a language such as Java instead of a language like C, I'd also point out that it is more productive to use a language such as Perl instead of a language such as Java: interpolating from that chart with too much precision, about 2.4 times as productive. I hadn't before seen the more detailed table 18-3 that you offer on page 202. Judging from that, average Java programmers need 2.75 times as much code as average Perl programmers to do the same task. Those estimates agree, since neither is very precise: Java takes somewhere between 2 and 3 times as much work for the same task as Perl.

Of course, coding is but one of the tasks that needs to happen in software development. If only half of your development time would have gone to coding (a reasonable estimate based on table 21-4 on page 236), then reducing coding time to 40% of what it was (roughly what a 2.5-fold reduction in coding work gives you) only saves you 30% of overall effort, since 60% of one half is 30%. Still, that is a significant reduction. Why don't people pay more attention to it?
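As a quick check of that arithmetic, here is a minimal Python sketch. The function name and its parameters are made up for illustration; the 50% coding share and the 40% remaining coding time are the rough figures from the paragraph above, not exact values from the book.

```python
def overall_savings(coding_share: float, coding_time_ratio: float) -> float:
    """Fraction of total effort saved when only the coding portion shrinks.

    coding_share:      fraction of total effort that is coding (assumed 0.5)
    coding_time_ratio: new coding time as a fraction of the old (assumed 0.4)
    """
    return coding_share * (1.0 - coding_time_ratio)


savings = overall_savings(coding_share=0.5, coding_time_ratio=0.4)
print(f"Overall effort saved: {savings:.0%}")
```

The non-coding half of the work is untouched in this model, which is why a 60% cut in coding time shows up as only a 30% cut overall.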

The catch is, of course, that Java has many features that make it much better than Perl for handling the challenges of development in large teams. Therefore it is easy to dismiss the productivity benefit because "Perl is not scalable." And it is easy to likewise dismiss the anecdotal accounts of exactly how productive small teams are because common sense keeps us from accepting that 6 people do more than a dozen.

Which is part of the reason why I am grateful to you for reproducing figure 20-3 on page 229. I've heard estimates before that it takes a team of about 20 people to match the output of a team of 6-7, but I'd never before seen concrete data backing that up.

Anecdotally, the primary cause is well understood: people are most effective in a flat team, but that only works for teams up to about 6-8 people. With that structure you have little to no overhead from having to manage process, or from people not being able to find out what they need to know when they need to know it. But that falls apart when there are too many lines of communication. The solution to that problem is to introduce process to cut down on who needs to talk to whom, and when. However, adding process drops productivity per person significantly, meaning you have to add more people. And this cascades until you get to the same productivity with a far larger team. From there you can keep scaling for a lot longer, but at far higher cost.
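To get a feel for how quickly a flat team runs out of room, here is a minimal sketch. It assumes the usual simplification that every pair of people on a flat team is one potential line of communication, so a team of n people has n(n-1)/2 of them. That formula and the function name are illustrative assumptions, not something taken from the book.

```python
def communication_lines(team_size: int) -> int:
    """Number of distinct pairs in a team where everyone may talk to everyone."""
    return team_size * (team_size - 1) // 2


# Team sizes mentioned in this letter: the flat-team range and the large team.
for size in (6, 8, 20):
    print(f"{size:>2} people: {communication_lines(size):>3} potential lines")
```

Under that assumption a flat team of 20 would have 190 potential lines against 15 for a team of 6, more than 12 times as many for about 3 times the people.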

There are secondary issues that are also well understood. For instance, you're likely to find a higher proportion of good developers in the small team environment. Why? Well, there are a lot of reasons. First of all, it is clear that it is easier for an individual to be productive in the small team than in the large one. People who are drawn to productive environments are likely to be people who value their personal productivity, who are therefore likely to be productive people. Conversely, it is much harder for an incompetent developer to hide in a small group than a large one, so the worst developers don't stay. Additionally, given comparable turnover rates, one can maintain staffing levels in a small group while being more selective about candidates than one can in a large group. And finally, a company that understands the cost benefits of having a small group of good people can justify higher individual salaries for those people.

So the 3-to-1 individual productivity difference in lines of code between small teams and large teams has a number of causes. It really isn't as simple as saying, "Move 2/3 of your 20 person team away and you'll get the same productivity." That said, the lines-of-code measure may be hiding some more dramatic productivity differences.

Some are very hard to quantify. For example, common sense tells us that a team of 6 people who all talk to each other is going to have more consistency across 57,000 lines of code than a team of 20 people who are deliberately being kept from talking all the time. That lack of consistency is going to show up in all sorts of bad ways, from re-invented wheels to misunderstood internal APIs.

But one is easy to quantify: the small team is much more likely to be using a productive interpreted language than the large one. So the 57,000 line project delivered by the 7 person team might well have 2-3 times the functionality of the 57,000 line project delivered by a 20 person team in about the same time. (As I've noted, the productivity difference comes from a combination of factors, including having better people.) Even if you're paying those programmers 50% more per person, your productivity per dollar is about 5 times better with the small team than the large one. That's a pretty dramatic difference. While I'll be the first to admit that there are limits to what small teams of good people can do, I'll also stand in line to point out that those limits are farther out than most people realize, and there is a very good business case for relying on small teams whenever you can.
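A rough sketch of where "about 5 times" comes from, using only the numbers in the paragraph above: a 7 person team and a 20 person team working for about the same time, 2-3 times the functionality from the small team, and 50% higher pay per person on the small team. The function and parameter names are hypothetical, and treating cost as simply headcount times relative salary is a simplifying assumption.

```python
def productivity_per_dollar_ratio(
    functionality_multiplier: float,
    small_team: int = 7,
    large_team: int = 20,
    salary_premium: float = 1.5,
) -> float:
    """How much more functionality per dollar the small team delivers.

    Assumes both teams work for the same length of time and that the large
    team's per-person cost is the baseline (1.0).
    """
    small_cost = small_team * salary_premium
    large_cost = large_team * 1.0
    return functionality_multiplier * large_cost / small_cost


# Sweep the 2-3x functionality range quoted in the letter.
for multiplier in (2.0, 2.5, 3.0):
    ratio = productivity_per_dollar_ratio(multiplier)
    print(f"{multiplier:.1f}x functionality -> {ratio:.1f}x per dollar")
```

With these inputs the ratio runs from roughly 3.8 to 5.7, and the midpoint of 2.5 times the functionality gives about 4.8, which is where the "about 5 times" figure lands.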

Anyways, congratulations on yet another excellent book, and I'm sure that I'll be digesting its consequences for a long time to come.

Cheers,
Ben