Gunnar Kudrjavets

Paranoia is a virtue

How to Shoot Yourself in the Foot with Code Coverage? (Part I)

Prologue

After every milestone or product version, most groups hold something called a postmortem. Encarta Dictionary defines postmortem as "an analysis carried out shortly after the conclusion of an event, especially an unsuccessful one." Of course, this doesn’t mean that every time you have a postmortem, something went wrong. Quite the opposite: analyzing the positive experiences and writing down the keys to success has the same benefits as learning lessons from negative events. Most people unconsciously hold their own personal postmortems daily, thinking about recent events and stuff ;-) This is my code coverage postmortem.

First of all, here are the things the current post isn’t about:

  • It’s not about explaining what code coverage is. I’ll assume that you know the basic principles. There are plenty of articles written about code coverage, most software engineering textbooks touch on the subject in one way or another, and there are a number of tools on the market as well.
  • It’s not about how to use code coverage properly. Let’s be honest, I’m still in the middle of working out an efficient process myself. And of course it’s always easier to criticize (even myself) than to tell how to do something right.

What’s this about? It’s about the experiences my team and I had with code coverage during the last year or so. The best paper about misusing code coverage I’ve found so far was written by Brian Marick, and BTW he has lots of other interesting things on his blog which are IMHO definitely worth reading.

First I’ll cover the major incorrect acts or decisions which were made, and in the second part of this post I’ll mention a number of things I plan to improve during the next version of our product. So, hopefully one year from now I can write about the smart decisions made and excellent results achieved ;-) The curious reader may also wonder what code coverage tools we use to get the job done. We used a toolset called "Magellan". The Microsoft Research web site has somewhat more information about it.

Mistake 1: underestimating the complexity of code coverage itself

When I started to use code coverage tools, the first thing I did was spend a significant amount of time reading code coverage related articles in the ACM Digital Library, and I have to admit that I still don’t feel very confident talking about code coverage. There are a number of things to understand: function coverage, block coverage, statement coverage, path coverage, condition coverage etc. To use any of these efficiently, time needs to be invested in training people, choosing proper metrics, explaining the metrics to all team members, and picking the key things to measure. If I had to start from scratch, I would probably plan at least 1-1.5 days for an in-depth workshop for the entire team to go through all the terms, their usage, and their applicability to our particular situation.
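To give a feeling for why these terms are not interchangeable, here is a minimal sketch contrasting statement coverage with path coverage. Classify() and both test helpers are hypothetical names made up for this illustration; they are not taken from any real product or tool.

```cpp
#include <cassert>

// Hypothetical function used only for illustration:
// two independent branches, hence four possible paths.
int Classify(bool fFirst, bool fSecond)
{
    int nResult = 0;

    if (fFirst)
    {
        nResult += 1;
    }

    if (fSecond)
    {
        nResult += 2;
    }

    return nResult;
}

// One call executes every statement in Classify(), so statement
// coverage is complete, yet only one of the four paths was taken.
void TestEveryStatementOnce()
{
    assert(3 == Classify(true, true));
}

// Taking every path through the two branches needs four calls.
void TestEveryPath()
{
    assert(0 == Classify(false, false));
    assert(1 == Classify(true, false));
    assert(2 == Classify(false, true));
    assert(3 == Classify(true, true));
}
```

With N independent branches in a row the number of paths grows as 2^N, which is one reason a team has to agree on which metric it actually reports.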

Mistake 2: falling into the general "bigger numbers = good testing quality" trap

Whenever I read articles where somebody recommends that one should strive for 100% code coverage I want to start screaming ;-) This statement is IMHO in the same equivalence class as stating that absolutely all the bugs in the product need to be fixed. What tends to happen IRL is that there’ll be a number of teams (teams A, B, and C) and we’ll start measuring how much code coverage each team has for the components it owns. What happens next is very natural to human nature: people tend to optimize their work based on how it gets measured, and the competition starts kicking in.

I usually give two examples of how the amount of code coverage you have is absolutely not a measure of how well a component is tested. First example: assume that I’m one of those people who like to write their own versions of strlen(). My personal rationale is that I want an assertion to fire when a NULL pointer is passed to the function as a parameter. My code may look like this:

UINT YetAnotherStrLen(LPCWSTR szSomething)
{
    _ASSERTE(0 != szSomething);

    UINT nCount = 0;

    while (L'\0' != *szSomething++)
    {
        nCount++;
    }

    return nCount % 42;
}

Let’s assume that somebody writes a quick unit test calling this function with the parameter "Foo". The result checks out (3 modulo 42 is still 3), and nobody may even notice that the programmer who wrote this routine made the mistake of returning all the results modulo 42. If somebody measures how much code is covered by this single unit test, there’s a pretty good probability that your favorite code coverage tool will report 100% of lines covered. Can we make any decision based on that? Of course not. You should use the most powerful weapon you have: common sense. The main point I’m trying to make is that, like any other thing, code coverage can be very easily misused.
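The trap can be sketched in a few lines. This is a portable stand-in for the routine above, assuming wchar_t and assert() in place of LPCWSTR, UINT, and _ASSERTE so that it builds without Windows headers; the two helper functions are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

// Portable stand-in for the routine above. The "% 42" bug is kept on purpose.
unsigned int YetAnotherStrLen(const wchar_t* szSomething)
{
    assert(nullptr != szSomething);

    unsigned int nCount = 0;

    while (L'\0' != *szSomething++)
    {
        nCount++;
    }

    return nCount % 42;
}

// The quick unit test: it executes every line of the routine
// and still passes, because 3 % 42 == 3.
bool QuickUnitTestPasses()
{
    return 3 == YetAnotherStrLen(L"Foo");
}

// Hypothetical extra check: the length reported for a 50-character
// string. A correct strlen() would report 50; this one reports 50 % 42.
unsigned int ReportedLengthOfFiftyCharacters()
{
    const std::wstring strLong(50, L'x');
    return YetAnotherStrLen(strLong.c_str());
}
```

The second helper adds no new lines to the coverage report, yet it is the only one of the two that exposes the bug. Picking inputs that matter is a job for the tester, not for the coverage tool.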

One of the most senior people in our development organization once told me a story about his previous team: they were writing a number of tests to be run before any addition to the code base (we call them check-in tests, your mileage may vary), and one of the developers was given the task of writing a test which would run in less than 15 minutes and cover as much of the product’s code as possible. It took this person about a day and he achieved ~80% block coverage. They did this to have something which would make sure that most of the code is executed and that nothing bad happens during the execution, but nobody ever claimed that this test would prove that most of the features work as they’re supposed to. This is an example of doing something while understanding what’s being done and why it’s good or bad.

Mistake 3: trying to achieve more than 80% of code coverage

Based on my experience, getting more than 80% code coverage gets extremely tricky and there’s very little return on your investment. There’s an urban legend in circulation about a software company where developers were pushed to achieve some unrealistic code coverage number and ended up removing error handling code…

Another example I like to give people is analyzing the cost vs. benefit of trying to write enough unit tests to get all the error handling cases covered. Let’s look at a hypothetical situation: you have a service, something bad happens (let’s assume that the service code contains some amount of AI and it detects that security has been compromised ;-)) and you need to stop the service. If stopping the service fails, you would like to log the reason for the failure to the NT Event log. The pseudo-code may look like this:

...

if (fSomethingBadHappened) {
    // Get the handle to the service.
    SC_HANDLE hService = ::OpenService(...);

    if (NULL != hService) {
        // Go ahead and stop the service.
        ...
    } else {
        // Try to open an event log.
        HANDLE hLog = ::OpenEventLog(...);

        if (NULL != hLog) {
            // Go ahead and log the reason for failure.
            ...
        } else {
            // We can't even open the NT Event log.
            // Do something else.
            ...
        }
    }
}

...

Let’s say that we need to write an automated test case to make sure that the code in the last error case (we can’t open the event log) is executed. What do we need to do? First of all we need to simulate the condition that makes "something bad" happen, then we need to somehow make OpenService() fail, and then we need to make OpenEventLog() fail. Usually this can be done using a technique called fault injection. How much time will it take? Well, at least one day if not more: writing the code, testing it, asking somebody else to review the code, checking it in, including the test case in the suite of automated test cases, monitoring the behavior of the test case the next day etc.
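One possible shape of such a test is sketched below. It is only an illustration under loud assumptions: instead of intercepting the real Win32 calls, it injects faults through replaceable function objects, and ServiceApi, Outcome, HandleProblem(), and MakeFaultyApi() are all hypothetical names which do not mirror the real OpenService() or OpenEventLog() signatures.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins for the two Win32 calls. Each returns true
// on success; the real APIs return handles and take parameters.
struct ServiceApi
{
    std::function<bool()> openService;   // stands in for ::OpenService()
    std::function<bool()> openEventLog;  // stands in for ::OpenEventLog()
};

// Which branch of the error handling was taken.
enum class Outcome
{
    NothingHappened,
    ServiceStopped,
    FailureLogged,
    FallbackTaken
};

// Same branch structure as the pseudo-code above, with the elided
// bodies replaced by a value the test can observe.
Outcome HandleProblem(bool fSomethingBadHappened, const ServiceApi& api)
{
    if (!fSomethingBadHappened)
    {
        return Outcome::NothingHappened;
    }

    if (api.openService())
    {
        return Outcome::ServiceStopped;
    }

    if (api.openEventLog())
    {
        return Outcome::FailureLogged;
    }

    // We can't even open the event log.
    return Outcome::FallbackTaken;
}

// Fault injection helper: builds an API whose calls are forced to fail.
ServiceApi MakeFaultyApi(bool fOpenServiceFails, bool fOpenEventLogFails)
{
    ServiceApi api;
    api.openService = [fOpenServiceFails]() { return !fOpenServiceFails; };
    api.openEventLog = [fOpenEventLogFails]() { return !fOpenEventLogFails; };
    return api;
}
```

A test reaches the last error case by calling HandleProblem(true, MakeFaultyApi(true, true)) and checking for Outcome::FallbackTaken. Even in this toy form the production code had to be restructured to accept an injected API, which is part of the cost being weighed here.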

The decision whether to write an automated test case isn’t any different from deciding whether some bug needs to get fixed. IMHO you should not waste resources on achieving another percentage point of code coverage if those resources can be used somewhere where it makes more sense. Remember, common sense rules ;-)

To be continued...

Posted: Mar 21 2004, 10:05 PM by gunnarku | with 9 comment(s)

Comments

TrackBack said:

Gunnar Kudrjavets has written an interesting piece here on code coverage testing. My personal view is that 80% is kinda low, but his examples of why 100% is difficult because of
# March 22, 2004 7:32 AM

Darrell said:

Good reference to Brian Marick's writing. It's amazing that most people think you need 100% code coverage. Even some very intelligent people have made this silly statement.
# March 22, 2004 8:44 AM

Shmuel Ur said:

When you, as a developer, do code coverage to check your unit testing, then on each line you need either coverage or inspection. If some lines are very hard to get to, as in the examples in the article, then inspect them. It is true that it is hard to get coverage on error handling, but you need to realize that it is not tested and do something else. 100% coverage means that you either covered or did something else on each coverage task.
# May 16, 2004 1:16 AM

Vishvanath said:

I used Magellan in my previous project for Microsoft. Now I don't know where to get it... any idea?
# June 3, 2009 7:18 AM