Coverage Reports as a Code Reading Tool

2019-08-20

Coverage reports are widely used to visualize the lines of code which are covered by test cases. Often this is used in CI to block merge requests which lower test coverage by some metric. But coverage reports don't have to be test coverage reports. In general, the idea of "run some code and see which lines are executed" can be applied to anything, not just the test cases.

How can coverage reports help me read code?

I like to approach code reading with the goal of understanding a particular behavior of the program. Coverage reports can help illustrate which lines of code are responsible for a particular behavior. Example behaviors this could be applied to are listed below:

Command line applications
- passing an additional command line flag
- providing a particular type of input data
Graphical applications
- clicking a button with the mouse
- pressing a key on the keyboard
- changing the window size

The Tools

There are many tools for collecting code coverage data, but I'm going to use kcov. kcov works on any binary compiled with DWARF debug symbols, which suits my use case of applying this technique to Rust binaries. Installation instructions are in the kcov project documentation.

If kcov doesn't work in your situation, you can likely find an alternative. Just be sure to look for a code coverage tool which allows collecting coverage data while running the program itself, not just the test cases.

kcov can generate nice coverage reports in a variety of formats, but it is not able to diff two coverage reports. I use pycobertura for this. Installation instructions are in the pycobertura project documentation.

Let's do it

There are many ways to apply this technique, but for purposes of explanation I'll choose a single use case to walk through:

I am trying to read and understand the Alacritty codebase. Specifically, I want to understand the code which handles keyboard input.

Alacritty is a GUI application written in Rust. As an introduction to the code, I'd like to know which lines of code are run to handle a single keypress.

The first step, of course, is getting Alacritty to build from source locally. Once that is done (cargo build completes successfully), you can start Alacritty with kcov collecting coverage data as follows:

kcov --include-path=alacritty_terminal/src \
	target/cov \
	target/debug/alacritty

This says: run the binary target/debug/alacritty with kcov recording coverage data into the directory target/cov, and only report coverage data for source code in the alacritty_terminal/src directory (don't include dependencies). Alacritty should open a little more slowly than usual, but otherwise run normally. Now press a key on the keyboard and then close Alacritty.

Check the target/cov directory to see the coverage reports generated by kcov. I recommend starting by opening index.html in your browser. Looking at this report, you'll find that roughly 50% of the total Alacritty lines of code were executed in this minimal interaction. While cutting the amount of code you need to read in half might be helpful, it's not the best we can do.

The bulk of the lines executed during our test above were run during Alacritty's normal startup. If we could filter out those lines, we could narrow things down to just the code responsible for handling our keypress. To pull this off we'll make two coverage reports: in the first we'll start Alacritty, only allowing it to initialize before closing it. In the second we'll start Alacritty, then press a single key, then close it. These two runs can use the same kcov command written above; you'll just need to change the output directory. I chose target/cov_min and target/cov_keypress.
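As a rough sketch, the two runs might be scripted like this. The record_run helper is a hypothetical name of my own, not part of kcov; it only wraps the kcov command shown earlier and swaps the output directory.

```shell
# record_run is a hypothetical helper: it wraps the kcov command from above
# and records into target/cov_<name> instead of target/cov.
record_run() {
    kcov --include-path=alacritty_terminal/src \
        "target/cov_$1" \
        target/debug/alacritty
}
```

Calling record_run min (closing Alacritty as soon as it has started) and then record_run keypress (pressing one key before closing) should leave the two output directories side by side under target/.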

Looking at the HTML coverage report (the report produced by kcov) for each of these runs, you can see that the second run executed only a couple hundred more lines than the first. The important question now is exactly which lines are covered in the second run but not the first, because that is where we will find out how Alacritty handles our keypress. For this we'll use pycobertura.

pycobertura has a diff mode, which compares two cobertura coverage reports and creates a new report. This report can be generated as follows:

pycobertura diff \
	--format html \
	--output coverage.html \
	target/cov_min/alacritty.4d1e73ad/cobertura.xml \
	target/cov_keypress/alacritty.4d1e73ad/cobertura.xml

This command generates a coverage.html file in the directory where you ran it. Open that file in your browser and you'll see in green the lines of source code which were executed in the second run but not the first. As expected, this is fewer than 200 lines of code.
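To make concrete what this diff is showing, here is a rough sketch of the same idea as a plain set difference. It is not how pycobertura works internally, and both function names are hypothetical. It also assumes the Cobertura XML puts each <class> and <line> element on its own line, with double-quoted filename, number, and hits attributes; if your reports are laid out differently, this won't work as written.

```shell
# Print "file:line" for every line with at least one hit in a Cobertura XML
# report. Assumption: one <class ...> or <line ...> element per text line.
covered_lines() {
    awk '
        /<class / {
            # Remember which source file the following <line> elements belong to.
            if (match($0, /filename="[^"]*"/))
                file = substr($0, RSTART + 10, RLENGTH - 11)
        }
        /<line / {
            num = ""; hits = 0
            if (match($0, /number="[0-9]+"/))
                num = substr($0, RSTART + 8, RLENGTH - 9)
            if (match($0, /hits="[0-9]+"/))
                hits = substr($0, RSTART + 6, RLENGTH - 7)
            if (num != "" && hits + 0 > 0) print file ":" num
        }
    ' "$1" | sort -u
}

# Print the lines covered in the second report but not in the first.
new_lines() {
    nl_before=$(mktemp)
    nl_after=$(mktemp)
    covered_lines "$1" > "$nl_before"
    covered_lines "$2" > "$nl_after"
    # comm -13 keeps only the entries unique to the second (sorted) file.
    comm -13 "$nl_before" "$nl_after"
    rm -f "$nl_before" "$nl_after"
}
```

Running new_lines on the two cobertura.xml files from above would print one file:line entry for each line that was hit only in the keypress run.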

A word of caution

These 200 lines aren't necessarily the only code responsible for handling a keypress and rendering a character on the screen (in fact, they are almost certainly not). They represent the code which is unique to handling the keypress - that is, code which is executed when handling a keypress but which is not executed as part of the normal startup. For that reason, I recommend using this technique to find interesting starting points of exploration into the complete codebase.

A faster method

Once you are comfortable with this technique, there is a slightly faster method you can use. kcov actually generates its reports in real time, while your program is running. Given this, it is possible to capture both of our cobertura reports in a single kcov run by copying the cobertura.xml file out of the kcov output directory after startup, and then again (with a different name) after pressing a key. I find this method actually works even better at minimizing the pycobertura diff.
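A minimal sketch of the copying step, to be run from a second terminal while Alacritty is still open under kcov. The snapshot_report name is hypothetical, and the glob assumes the report lives one directory below the kcov output directory, as with alacritty.4d1e73ad above.

```shell
# Copy the current cobertura.xml out of a kcov output directory.
# Usage: snapshot_report <kcov-output-dir> <destination-file>
snapshot_report() {
    # The report sits in a per-binary subdirectory whose name varies,
    # so match it with a glob and take the first file that exists.
    for sr_report in "$1"/*/cobertura.xml; do
        if [ -f "$sr_report" ]; then
            cp "$sr_report" "$2"
            return 0
        fi
    done
    echo "no cobertura.xml found under $1" >&2
    return 1
}
```

For example, snapshot_report target/cov startup.xml once Alacritty has finished starting, then snapshot_report target/cov keypress.xml after pressing a key. The two copies can then be passed to pycobertura diff in place of the paths used earlier.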

Inspiration

This technique is heavily inspired by the talk Idealized Commit Logs: Code Simplification via Program Slicing by Alan Shreve.

Conclusion

Code coverage tools aren't limited to reporting test coverage; you can also record coverage data while running the actual application. By diffing the results of two of these recordings, it is possible to determine which lines of code are uniquely responsible for a particular behavior within the application. This can serve as a great starting point when reading a new-to-you codebase.