Tuesday, January 6, 2009

Outliers - II

Later on in the book, Gladwell examines airplane crashes - and how they happen. Listen to this:

In a typical crash, for example, the weather is poor - not terrible, necessarily, but bad enough that the pilot feels a little more stressed than usual. In an overwhelming number of crashes, the plane is behind schedule, so the pilots are hurrying. In 52 percent of crashes, the pilot at the time of the accident has been awake for twelve hours or more, meaning that he is tired and not thinking sharply. And 44 percent of the time, the two pilots have never flown together before, so they're not comfortable with each other. Then the errors start - and it's not just one error. The typical accident involves seven consecutive human errors. One of the pilots does something wrong that is not by itself a problem. Then one of them makes another error on top of that, which combined with the first error still does not amount to a catastrophe. But then we have a third error on top of that, and then another and another and another and another, and it is the combination of all these errors that leads to disaster.

Does any of that sound familiar? The project starts out in a negative climate - perhaps the stakeholders each have a different agenda. The project is late, so the technical team is hurrying. They aren't checking each other. They are working overtime, so they are tired. They begin to make mistakes - each, individually, won't kill the project, but when you add them up, they mean that when the code is delivered to test (or worse, to the customer), nothing works.

Now, in the past decade or so, a lot of shops have improved quality to the point that the story above is the exception - or even the stuff of legends. Hope springs eternal. Still, it gives a very strong argument for pair programming: After all, you wouldn't fly in a jet airplane without a co-pilot in the cockpit - why would you allow your business-critical applications to be developed without one?

More Gladwell:

These seven errors, furthermore, are rarely problems of knowledge or flying skill. It's not that the pilot has to negotiate some critical technical maneuver and fails. The kinds of errors that cause plane crashes are invariably errors of teamwork and communication. One pilot knows something important and somehow doesn't tell the other pilot. One pilot does something wrong, and the other pilot doesn't catch the error. A tricky situation needs to be resolved through a complex series of steps - and somehow the pilots fail to coordinate or miss one of them.

We have the exact same problems in software development.

In my experience, on software projects, the problems are rarely technical. Instead, they are communications problems. The right people might know the right things, but fail to communicate them to the implementors, the architects, the testers, or the deployers. Somewhere in the mix, key elements get lost or forgotten, and the result is software that doesn't meet customer needs, is buggy, is late ... or possibly, all three.

Gladwell comes up with a few reasons that this happens; we'll talk about that tomorrow.
