8 December 2009

Rigorous testing

By Andrew Clifford

I recently implemented a software release with a significant bug in it. Does this mean my testing approach is flawed?

This weekend I implemented a minor enhancement to the Metrici Advisor service. The enhancement added some new features to assessment forms. I had followed our testing procedure thoroughly, the new version passed all the tests, and I put it live.

I was emailed on Saturday morning by one of our more active users. The assessment form, when accessed from one part of the system, did not work at all.

This particular way of accessing the form is not heavily used, and it fails so gracefully that it is not obviously an error. I do not suppose anybody else had really noticed. The bug itself was trivial to fix - I had missed a single line in a configuration file. But that is not the point. I still managed to put live a version with a significant bug, and my testing did not catch it.

I was concerned by this. Does this bug suggest that we should change our testing approach?

We have always taken the development and testing of Metrici Advisor very seriously. Our software provides advice on IT management. We have to practice what we preach. As a small company, we have to be able to compete with much larger companies who spend a lot more on development. We have to do things right to survive.

We develop the software in small increments. We rely on test-driven development. We have automated tests for every level of the system: for the underlying code, for high-level services, for stylesheets, and for the user interface. We write or amend these tests before or during coding. We rely on the testing 100%: once a new version has passed the tests, we consider it good to go live.
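
As an illustration only, the sketch below shows what one of these tests might look like in a Java and JUnit setting. The class and method names (AssessmentFormTest, AssessmentForm, answerCount and so on) are invented for this article, not taken from the Metrici Advisor code base.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    // Hypothetical sketch of test-first development. The AssessmentForm
    // class below is invented for illustration; it is not the real Metrici
    // Advisor code. In practice the test is written first, fails, and the
    // class is then written (or amended) until the test passes.
    public class AssessmentFormTest {

        @Test
        public void newFormStartsWithNoAnswers() {
            AssessmentForm form = new AssessmentForm();
            assertEquals(0, form.answerCount());
        }

        @Test
        public void recordingAnAnswerIncreasesTheCount() {
            AssessmentForm form = new AssessmentForm();
            form.recordAnswer("governance", "Level 3");
            assertEquals(1, form.answerCount());
        }
    }

    // Minimal implementation written to satisfy the tests above.
    class AssessmentForm {
        private final java.util.Map<String, String> answers =
                new java.util.HashMap<String, String>();

        public void recordAnswer(String question, String answer) {
            answers.put(question, answer);
        }

        public int answerCount() {
            return answers.size();
        }
    }

The point is not the detail but the discipline: the test states the expected behaviour first, and the code is then written, or amended, until the test passes.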

In some ways, our approach might look unprofessional. We spend only 45 minutes testing the entire system; I have known teams spend months testing systems of similar size. For minor enhancements, we do not carry out much acceptance testing. And obviously, as this weekend showed, our approach occasionally misses a bug.

Is our approach flawed? Although the tests are not perfect, the process is rigorous. Once a test condition is identified, either as part of development or because of a bug, it becomes a permanent part of the test packs. The test packs are maintained in parallel with the code. We have invested heavily in testing: our test packs include more than 2500 tests, which works out at about one test per 24 lines of code.
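
To show how a bug turns into a permanent test, here is a hypothetical regression test in the same JUnit style. The file name assessment-form.properties and the key form.entry.point are invented stand-ins for whatever the real configuration looks like; the principle is simply that once an omission has caused a bug, a check for it stays in the test pack for good.

    import static org.junit.Assert.assertTrue;
    import java.io.InputStream;
    import java.util.Properties;
    import org.junit.Test;

    // Hypothetical regression test: the file name and property key are
    // invented for illustration. Once a missing configuration line has
    // caused a bug, a test like this joins the test pack permanently so
    // the same omission cannot slip through again.
    public class FormConfigurationTest {

        @Test
        public void formEntryPointIsConfigured() throws Exception {
            InputStream in =
                    getClass().getResourceAsStream("/assessment-form.properties");
            assertTrue("Configuration file is missing from the classpath",
                       in != null);
            Properties config = new Properties();
            config.load(in);
            in.close();
            assertTrue("Missing entry point mapping for the assessment form",
                       config.containsKey("form.entry.point"));
        }
    }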

We could put more effort into manual testing of releases. But the chances of finding anything that 2500 automated tests have not found are very slim. Every minute spent testing manually would be better invested in improving the software and the automated testing. We have had only about 20 software bugs in production in almost four years. I have been reading about test-driven design, and by using a similar approach we are achieving defect rates that are much lower than industry averages - roughly 20 defects across some 60,000 lines of code (2500 tests at one test per 24 lines) works out at about 0.3 defects per thousand lines of code.

To answer my own question, I am not going to change our testing approach. Test-driven design and development really works. One bug does not undermine that.