Code Review and the Theory of Probability
Not all developers are familiar with probability theory. One might think that’s not a problem: every man to his trade, and nobody is a master of everything. A solid grasp of probability theory is needed, perhaps, in game development, cryptography and possibly in various kinds of financial and statistical software. In fact, though, failing to understand certain things can lead to bad results even in projects that seemingly have nothing to do with them. There is no magic here: the human brain simply misjudges some probabilities and, as a result, makes wrong decisions.
So, let’s imagine a developer called John. He writes code, well, because he is a developer. Suppose John is a good developer and 75% of his code is free of errors. (In truth, I’m flattering him, and John is hardly a guru, but let’s assume it anyway.) To keep it simple, let’s say that out of every 100 lines of his code, 75 need no corrections and 25 contain errors that have to be found and fixed. Now we have to decide how to do that. As always, there are plenty of ideas: some preach comprehensive unit tests, some argue for a bigger QA team, others vouch for a Code Review of every commit; perfectionists who ignore budgets and deadlines vote for all of the above, while self-assured “aces” and penny-pinching managers vote against everything. As always. But some decision is needed. And here comes an idea: estimate the cost and the benefit of each activity. So let’s estimate the efficiency of, say, Code Review.
A naive calculation may look as follows: John has written 100 lines of code (as usual, about 75% of it correct and 25% buggy). Make him spend the same amount of time again on Code Review, and we will be left with only half the bugs (12.5%). After all, doubling the resources spent should halve the number of bugs.
But nothing of the sort! John is John. He wrote that code yesterday, and if he reviews it himself today, he won’t find anything.
OK, let’s go further. Suppose Pete (a developer of roughly the same skill) reviews John’s code; by the same naive logic he will halve the number of bugs (down to 12.5%). That works because they have almost the same qualifications but are still different people. So the job consumes twice the resources, and hence we are left with half the bugs. Now, let’s look at what we get in reality.
Probability theory says that if a system consists of two redundant devices with reliabilities X and Y that fail independently, both devices must fail for the system to malfunction. So the total reliability is 1 - (1-X)*(1-Y), i.e. for the John and Pete “devices”, each with 75% reliability, the total reliability is 93.75%! Look, the share of bugs has dropped not to 12.5% but to 6.25%! So spending the resources of a different person (even one of the same skill) on Code Review cuts the number of bugs not by half but by a factor of four! Cool. Besides, considering that Code Review is usually performed by the best developers on the team (strictly speaking, those whose reliability is above the “average” 75%), who spend significantly less time on it than the author spent writing the code, we get even more impressive results.
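For those who prefer to check the arithmetic in code, here is a minimal Python sketch of the formula above. The function name combined_reliability and the variables john and pete are made up for this illustration, and the whole thing rests on the same simplifying assumption as the formula: each person misses errors independently of the other.

```python
from math import prod


def combined_reliability(*reliabilities: float) -> float:
    """Chance that at least one "device" handles a given line correctly.

    Assumption (hypothetical model, as in the text): every device fails
    independently, so the system fails only when all of them fail.
    """
    for r in reliabilities:
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"reliability must be between 0 and 1, got {r}")
    # Probability that everyone fails is the product of individual failure rates.
    return 1.0 - prod(1.0 - r for r in reliabilities)


john = 0.75  # author: 75% of the code is error-free
pete = 0.75  # reviewer of the same skill

total = combined_reliability(john, pete)
print(f"total reliability: {total:.2%}")   # total reliability: 93.75%
print(f"remaining bugs: {1 - total:.2%}")  # remaining bugs: 6.25%
```

Plugging in a stronger reviewer, say combined_reliability(0.75, 0.9), shows how quickly the remaining share of bugs shrinks. In real life reviewers tend to share some blind spots with the author, so treat these figures as an optimistic estimate rather than a measurement.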
What was it all about?
Now, with a better understanding of what investing some resources in Code Review can achieve, it is easier to decide on one mechanism or another for improving code quality. We can now work with numbers rather than the usual “everyone does it, so it’s right” versus “to hell with it, there is no time anyway!” For reasonable management, an argument backed by figures will always carry more weight than fine words.
A summary for those too lazy to read the whole article
Using Code Review (if you are not doing it yet) will bring more benefits than it seems at first.