About code coverage

When we start writing unit tests in a project and want a metric for them, code coverage is usually the first one we reach for. It is widely used, but it is also widely criticized.

Code coverage measures which lines of source code are executed while the tests run. In other words, it counts the lines of production code that the tests pass through. The problem is that a line is counted as covered even when no assertion checks what it did.

For example, taking this code:

function calculate(string $op, int $x, int $y): int
{
    return match ($op) {
        '+' => $x + $y,
        '-' => $x - $y,
        '*' => $x * $y,
        '/' => intdiv($x, $y), // integer division, so the int return type always holds
        default => throw new \InvalidArgumentException(),
    };
}

And its associated test:

use PHPUnit\Framework\TestCase;

class CalculatorTest extends TestCase
{
    public function testCalculateMethod(): void
    {
        $resultat = calculate('+', 1, 5);

        $this->assertIsInt($resultat);
    }
}

We obtain the following code coverage result:

Summary:
  Classes: 100.00% (1/1)
  Methods: 100.00% (1/1)
  Lines:   100.00% (7/7)

The test makes no meaningful assertion (it only checks that the result is an integer), yet we get 100% code coverage. That’s why code coverage is criticized.

This is also why we should treat code coverage as a negative indicator for a project rather than as a quality indicator. Low code coverage tells us that tests are missing. High code coverage, on the other hand, says nothing about the quality of the tests, so we should complement it with other metrics.
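To make this concrete, here is a minimal sketch in plain PHP (no PHPUnit required). The function buggyCalculate() is hypothetical and not part of the original example: it is a deliberately broken variant in which '+' subtracts. A type-only check, equivalent in spirit to assertIsInt(), still passes, while a value check, equivalent to assertSame(6, ...), catches the bug.

```php
<?php

declare(strict_types=1);

// Hypothetical, deliberately broken variant of calculate(), for illustration only.
function buggyCalculate(string $op, int $x, int $y): int
{
    return match ($op) {
        '+' => $x - $y, // deliberate bug: subtracts instead of adding
        '-' => $x - $y,
        default => throw new \InvalidArgumentException(),
    };
}

$result = buggyCalculate('+', 1, 5);

// Weak check (what assertIsInt() verifies): only the type is inspected.
$weakCheckPasses = is_int($result);

// Strong check (what assertSame(6, $result) verifies): type and value.
$strongCheckPasses = ($result === 6);

var_dump($weakCheckPasses);   // bool(true): the bug goes unnoticed
var_dump($strongCheckPasses); // bool(false): the bug is detected
```

Both checks execute exactly the same production line, so a line-based metric cannot tell them apart.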

Note that there is another metric related to code coverage: branch coverage. It checks whether each branch of every control structure (such as if or case statements) has been executed.

If we take our previous example, the branch coverage result would be:

Summary:
  Classes:  0.00% (0/1)
  Methods:  0.00% (0/1)
  Paths:    20.00% (1/5)
  Branches: 42.86% (3/7)
  Lines:    100.00% (7/7)

We can easily see the difference. Every line of code has been executed, but the tests explored only one of the five possible execution paths. This clearly shows that the tests are insufficient, because they cover only a small part of the behavior of the function.
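As a rough sketch of what exercising the remaining paths could look like, the script below calls calculate() once per arm of the match expression, which should correspond to the five paths reported above. It is written in plain PHP so it runs without PHPUnit; in a real test suite the $cases array would more naturally become a data provider. The function is a copy of the example, with intdiv() used for division as an assumption made here so that the int return type holds for every input. The inputs and the '%' operator are illustrative choices.

```php
<?php

declare(strict_types=1);

// Copy of the example function. Assumption: intdiv() is used for '/' so the
// declared int return type holds for every input.
function calculate(string $op, int $x, int $y): int
{
    return match ($op) {
        '+' => $x + $y,
        '-' => $x - $y,
        '*' => $x * $y,
        '/' => intdiv($x, $y),
        default => throw new \InvalidArgumentException(),
    };
}

// One case per arithmetic arm: [operator, x, y, expected result].
$cases = [
    ['+', 1, 5, 6],
    ['-', 1, 5, -4],
    ['*', 2, 5, 10],
    ['/', 10, 5, 2],
];

// Collect the operators whose result does not match the expected value.
$failures = [];
foreach ($cases as [$op, $x, $y, $expected]) {
    if (calculate($op, $x, $y) !== $expected) {
        $failures[] = $op;
    }
}

// Fifth arm: an unknown operator must throw.
$defaultArmThrows = false;
try {
    calculate('%', 1, 5);
} catch (\InvalidArgumentException $e) {
    $defaultArmThrows = true;
}

var_dump($failures === [] && $defaultArmThrows); // bool(true)
```

Each case asserts an exact value, so the extra paths are checked as well as executed.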

As we saw, branch coverage is a better indicator than line coverage. Nevertheless, we should be careful: it is expensive to compute (time, CPU, memory, …) and it can slow down the test suite.