100% test coverage may be impossible, but this week I’ve found that striving for it has many benefits.
I’ve been working on a new project that’s based on the Pyramid framework. The framework creators claim 99% test coverage. The other developer and I thought that was pretty great, and decided that setting a similar goal for our project would be worth the trouble.
I don’t think I need to go into intimate details about why tests of the various forms (unit, functional, integration, systems, load, stress) are important (I will if somebody asks :)). I think it can be summed up in a simple truth:
In the long run, writing tests saves you time, and makes your code stronger.
I’ve spent most of this week refactoring tests in an effort to get to that mythical 100% goal, and I’ve found the experience to be extremely fruitful.
Note: we’re using Nose to run the tests and generate the coverage report.
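For the curious, a coverage run with Nose can be wired up through its built-in coverage plugin. This is an illustrative config fragment, not our project’s actual file — `myapp` is a placeholder package name:

```ini
# setup.cfg (illustrative) -- "myapp" is a placeholder package name
[nosetests]
with-coverage = 1
# only measure our own package, not the whole dependency tree
cover-package = myapp
# clear stale coverage data before each run
cover-erase = 1
# also emit an HTML report for line-by-line review
cover-html = 1
```

With that in place, a plain `nosetests` run prints the per-module coverage table that we’ve been chasing toward 100%.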
I’ve learned a lot.
I know much more about how Pyramid works (and it’s been refreshingly straightforward). I’ve become a fan of WebTest, and intimately familiar with WebOb. All of this was a side effect of needing to break the tests down into units that could be adequately run at the various levels, so that all of the code was touched. I had to dig into the APIs for each package to figure out how to test certain things, and again, at each level. Luckily, these particular packages are very well documented, and their code is relatively clean and easy to understand, so the process was much less arduous than it might have been with a different toolset.
I had to dig into protocol-level aspects of the application. Some of the more opaque parts of Pyramid, WebOb, WSGI, and HTTP required closer inspection. After this week, I know so much more about multipart form encoding (MIME encoding technically, RFC 2046), file uploads (RFC 1867), and perhaps more importantly, how Pyramid and WebOb implement these concepts.
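To make that concrete, here’s a small stdlib-only sketch of what a multipart/form-data body (the RFC 2046 framing behind RFC 1867 file uploads) actually looks like on the wire, and how it parses. The boundary, field names, and file name are invented for illustration — in the real application Pyramid and WebOb do this parsing for you:

```python
# A hand-built multipart/form-data body (RFC 2046 framing, RFC 1867 file
# upload), parsed with the stdlib email package. All names are made up.
from email.parser import BytesParser
from email.policy import default

boundary = "simpleboundary"
body = (
    "--{b}\r\n"
    'Content-Disposition: form-data; name="title"\r\n'
    "\r\n"
    "hello\r\n"
    "--{b}\r\n"
    'Content-Disposition: form-data; name="upload"; filename="notes.txt"\r\n'
    "Content-Type: text/plain\r\n"
    "\r\n"
    "file contents here\r\n"
    "--{b}--\r\n"
).format(b=boundary).encode("ascii")

# The parser needs the enclosing Content-Type header to learn the boundary;
# in a real request this arrives as an HTTP header, not in the body.
headers = (
    'Content-Type: multipart/form-data; boundary="%s"\r\n\r\n' % boundary
).encode("ascii")
msg = BytesParser(policy=default).parsebytes(headers + body)

# Each part carries its form field name (and filename, for uploads)
# in its Content-Disposition header.
parts = list(msg.iter_parts())
print(parts[1].get_filename())  # the uploaded file's name
```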
The separation between unit and functional testing became more obvious, and more than ever, I appreciate the need for them to be separate.
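To illustrate that separation with a deliberately tiny, hypothetical example (the real project uses Pyramid views and WebTest, but a bare WSGI callable shows the same split without those dependencies): a unit test exercises the application logic directly, while a functional test drives the whole WSGI callable, roughly the way WebTest’s TestApp does under the hood:

```python
# Hypothetical app illustrating the unit/functional split.
import unittest

def greet(name):
    """Pure application logic: the thing a unit test targets."""
    if not name:
        raise ValueError("name is required")
    return "Hello, %s!" % name

def app(environ, start_response):
    """Thin WSGI layer over greet(): the thing a functional test targets."""
    name = environ.get("QUERY_STRING", "").replace("name=", "", 1)
    try:
        body = greet(name).encode("utf-8")
        status = "200 OK"
    except ValueError:
        body = b"missing name"
        status = "400 Bad Request"
    start_response(status, [("Content-Type", "text/plain")])
    return [body]

class UnitTests(unittest.TestCase):
    # Unit test: call the logic directly, no HTTP machinery involved.
    def test_greet(self):
        self.assertEqual(greet("world"), "Hello, world!")

class FunctionalTests(unittest.TestCase):
    # Functional test: drive the full WSGI callable with a fake environ.
    def _get(self, query_string):
        captured = {}
        def start_response(status, headers):
            captured["status"] = status
        environ = {"REQUEST_METHOD": "GET", "QUERY_STRING": query_string}
        body = b"".join(app(environ, start_response))
        return captured["status"], body

    def test_ok(self):
        status, body = self._get("name=world")
        self.assertEqual(status, "200 OK")
        self.assertEqual(body, b"Hello, world!")

    def test_missing_name(self):
        status, body = self._get("")
        self.assertEqual(status, "400 Bad Request")

if __name__ == "__main__":
    unittest.main()
```

The unit test stays fast and pinpoints logic failures; the functional test catches the wiring mistakes (routing, status codes, encoding) that the unit test can never see — which is exactly why you need both.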
The code’s gotten better.
When faced with a line of code that wasn’t covered, I had to figure out why. The answer to ‘why’ forced me to re-evaluate the way the tests were written and think critically about how the application is structured.
This is, of course, in addition to the expected benefit of catching untested code. That untested code manifested in several ways:
- I found a handful of use cases that just weren’t being tested.
- I found some edge cases that were not being tested (accounted for in the code, but not being executed by the tests).
- I found API interfaces that were implemented but never executed.
- I found tests that were passing, but not testing what they were supposed to test.
- I found one specific edge case that was accounted for in the code, but that turned out to be effectively impossible to hit.
What this amounts to is the exposure of false assumptions, bad tests, and some potential design flaws. In fixing and addressing them, the code has attained a new level of solidity. And even the tests themselves are a lot better than they were last week.
I’d have to say working toward 100% coverage, even if you can’t get there, is more than worth the effort. Just exposing one or two bad tests or use cases that were neglected is worth it, and I gained a lot more than that from this endeavor.
I started writing this yesterday morning, when I had one or two stubborn lines of code that kept eluding coverage. As of about 6 o’clock last night, I’ve actually achieved my goal of 100% test coverage. I need to do a bit more digging to make sure this isn’t a misleading number, but given what’s gone into getting here, I feel pretty confident that even if it’s not as impressive an achievement as it sounds, it’s a reflection of a very good test suite, and a well-put-together application.