The Prime Mistake by Bruce Nielson

In a previous post I talked about the blame game. I suggested that “the blame game” is a necessary part of software failures, so it shouldn't be treated with as much fear and loathing as it usually receives. By understanding the human need to pin down blame (and the general inability of humans to do so), we avoid ignoring this very human factor in success or failure.

Related to “the Blame Game” is what I call “the Prime Mistake.” Unfortunately, the Prime Mistake is very bad for software developers, so it’s best we understand the concept.

The perfect example of “the Prime Mistake” happened to me years ago in my high school Geometry class. My teacher had a bad habit of making every problem depend on the answer to a previous problem. Worse yet, he didn’t give partial credit for working a follow-on problem the right way but with the wrong starting value.

On one test I took, I missed the very first problem due to a stupid error. The rest of the problems used that value as their starting point, so I basically missed everything.

That first problem that I missed is an example of “the Prime Mistake” because it’s the original mistake that causes all the follow-on mistakes.
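The cascade on that test can be sketched in a few lines of Python. This is only an illustration: the function name, the three follow-on "problems," and the numbers are all made up for the example, not taken from the actual test.

```python
def solve_chain(first_answer, steps):
    """Feed each answer into the next problem, as on the test described above."""
    answers = [first_answer]
    for step in steps:
        answers.append(step(answers[-1]))
    return answers

# Hypothetical follow-on problems; each uses the previous answer as its input.
steps = [lambda x: x * 2, lambda x: x + 3, lambda x: x * x]

correct = solve_chain(5, steps)   # first problem answered correctly
mistaken = solve_chain(6, steps)  # one slip on the first problem

# Count the answers that match the key. Every later step was worked
# "the right way," yet none of the answers match.
score = sum(c == m for c, m in zip(correct, mistaken))
print(correct, mistaken, score)
```

Every step after the first is carried out correctly in both runs, but with no partial credit the single early slip costs the whole test.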

Part of the Blame Game is the attempt to track down the Prime Mistake so that we can pin everyone else’s mistakes on one person.

Developers and The Prime Mistake

When it comes to developers, the Prime Mistake seems to be particularly dangerous. As it turns out, absolutely all problems in software can be traced (in some way) back to the programmer, usually with a pretty obvious cause and effect – a mistake on the programmer's part. Woe to the software programmer!

Due to the Prime Mistake, the programmer may end up with basically everything laid at their feet. After all, every single defect in the code is – in a very real sense – their fault.

The problem with this line of thought is that removing all defects is beyond human capacity.

Worse yet, we tend to be far more forgiving of “mistakes” in principle than in practice. Ask any user or manager upfront if they accept that programmers, no matter how hard they try, will make mistakes. “Sure,” they say. “It’s inevitable.”

But then watch those same managers and users after an actual bug leads to a problem that causes a real dollar loss! Heaven help the programmer then.

The problem is that when a real defect is uncovered, it doesn’t come in a “theoretical package.” It comes with real damage, such as a loss of dollars, confidence, or prestige. This is why we react differently to real defects than to theoretical ones.

To make things worse, a real defect is soon followed by “a fix,” which is really just a fancy way of saying “all the information about how it could have been avoided if the programmer had just been ‘a bit more careful or smarter.’”

So the temptation to assign the Prime Mistake to the developer may become overwhelming for “real defects.”

Spreading the Blame Means Spreading the Responsibility

In our more theoretical moments, we know that programmers can’t possibly write code with no defects. It is this realization that has led to the industry standard of having “testers” who are separate from programmers. Indeed, the invention of “testers” is a very good step in the right direction. We effectively “spread the blame” out a bit. If a defect gets past both the developer and the tester, we might feel a bit more forgiving toward the developer for having “screwed it up.”

(Another point is that a tester hopefully helps find the defect before there is a loss attached to it. But this is a tale for another time.)

But we shouldn’t stop here. Software is so complex that even a good developer and a good tester together can’t realistically “not make errors.” Everyone involved in the project must take responsibility for avoiding defects. The users must test regularly and give clear requirements, the programmer must use automated unit tests, and the testers must have test plans so that they aren’t relying on memory. Removing defects in software is never one person’s (or one group’s) responsibility.

And when all of that together fails, at least everyone knows we all did the best we could.

Published Tuesday, January 05, 2010 2:00 AM by BruceNielson