Apr 18, 2015

TDD, Code Review and the Economics of Software Quality

To understand the value of JUnit tests (developer tests), try maintaining, or worse, refactoring a code base that has none. The cost of maintaining such code is so high that in most cases it gets replaced instead of being improved or enhanced. Developer tests ease maintenance and thus enable change. They are now a critical part of software development; most enterprises have adopted them and have moved from "no tests" to "some tests", but the road beyond that is unclear. Industry-prescribed techniques (Uncle Bob's TDD rules and 100% code coverage) are difficult to adopt for large enterprises with massive code bases and globally distributed teams. Enterprises need a way to standardize testing practices that can be easily implemented and enforced across internal development teams and external outsourced development partners.

The code coverage metric makes it possible to define a specific coverage target and measure it in an automated way, but it has its own limitations.
Developer tests are not cheap: being "developer" tests, they take a developer's time and effort that would otherwise be spent on adding features and functions. A large test suite also increases development cost through longer test execution times and its own maintenance overhead.

Whenever tests are written merely to attain high coverage, they lead to excessive tests for trivial, obvious functionality and insufficient tests for critical or change-prone code. Also, not all code needs the same coverage: framework and boilerplate code may not require extensive coverage, whereas some code needs more testing than 100% line coverage can express, such as exercising the same logic against a wide range of data sets. Other project-specific attributes can influence the appropriate coverage too. A flat coverage target therefore may not work in all situations.
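As a minimal sketch of the problem (the Account class and test below are hypothetical, invented for illustration), a coverage-driven suite tends to produce a test for a getter that can hardly break, while the change-prone logic stays untested:

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

// Hypothetical class under test: one trivial getter, one change-prone method.
class Account {
    private final String owner;
    private double balance;

    Account(String owner, double balance) {
        this.owner = owner;
        this.balance = balance;
    }

    String getOwner() { return owner; }

    // The critical, change-prone code: overdraft rules, invalid amounts.
    void withdraw(double amount) {
        if (amount <= 0 || amount > balance) {
            throw new IllegalArgumentException("invalid withdrawal: " + amount);
        }
        balance -= amount;
    }

    double getBalance() { return balance; }
}

public class AccountTest {

    // Written purely to raise the coverage number: it exercises code
    // that cannot realistically break.
    @Test
    public void getterReturnsOwner() {
        assertEquals("alice", new Account("alice", 100.0).getOwner());
    }

    // Conspicuously absent: tests for withdraw() edge cases (zero,
    // negative and overdrawn amounts), which is where defects hide.
}
```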

The other issue with writing tests for coverage is that the tests are retrofitted, as opposed to the test-first approach of TDD. Not only is it challenging to write tests for code that was not designed for testability, but the benefits of TDD (test first) are also not realized.

TDD is a code design process that produces testable, high-quality code. In TDD, the developer is not just implementing the feature; by writing tests first, the developer is also designing modular, decoupled and testable code. A developer will find it hard to test a unit that is doing too much or is tightly coupled with other units, and will be forced to refine the code. The repeated iterations of writing tests, writing code and refactoring also lead to better self-review. The developer invests a lot more thought in the code design and finds issues early that would otherwise go undetected.
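A minimal sketch of how the test-first mindset shapes design (the PriceCalculator and TaxRateProvider names are made up for illustration): because the test needs to substitute the tax-rate dependency, the calculator is forced to receive it from outside rather than construct it internally:

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

// Test written first: it forces PriceCalculator to accept its dependency
// through the constructor, which is what makes the unit testable in isolation.
public class PriceCalculatorTest {

    // Hand-rolled stub for the hypothetical tax-rate dependency; a coupled
    // design that did "new TaxRateService()" inside the calculator could
    // not be substituted like this.
    private final TaxRateProvider flatTenPercent = region -> 0.10;

    @Test
    public void appliesRegionalTaxToNetPrice() {
        PriceCalculator calculator = new PriceCalculator(flatTenPercent);
        assertEquals(110.0, calculator.grossPrice(100.0, "EU"), 0.001);
    }
}

// Production code shaped by the test above.
interface TaxRateProvider {
    double rateFor(String region);
}

class PriceCalculator {
    private final TaxRateProvider taxRates;

    PriceCalculator(TaxRateProvider taxRates) {
        this.taxRates = taxRates;
    }

    double grossPrice(double netPrice, String region) {
        return netPrice * (1 + taxRates.rateFor(region));
    }
}
```

The constructor injection here is not a separate design step; it falls out of writing the test before the code.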

When tests are retrofitted, these benefits are not realized: retrofitting tests documents what the code does rather than using tests as a code design tool. But doing TDD for all the code, all the time, can slow down development. Not all code is critical enough to need TDD, and some tests, such as integration tests, can be retrofitted. To expedite development, the application can be released for integration (UI) and QA, and integration tests can be added later to document the system behavior.
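As an example of such a retrofitted, behavior-documenting test (the Slugs class below is invented for illustration), a characterization-style test pins down what shipped code currently does so that later changes are safe, even though it never influenced the design:

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

// Retrofitted after release: this test records the current behavior of
// slugify() as a safety net for future refactoring; it did not drive
// the design the way a test-first test would.
public class SlugsCharacterizationTest {

    @Test
    public void recordsCurrentSlugBehavior() {
        assertEquals("hello-world", Slugs.slugify("Hello World"));
        assertEquals("a-b-c", Slugs.slugify("  a  b  c "));
    }
}

class Slugs {
    static String slugify(String title) {
        return title.trim()
                    .toLowerCase()
                    .replaceAll("\\s+", "-");
    }
}
```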

So how does one verify that TDD is practiced when and where required, and that there is sufficient coverage when and where required? I think that instead of relying on automated tools, the code review process can be expanded to review tests for quality, coverage and TDD practice.

Neither coverage tools nor TDD itself can check the quality of tests, i.e. whether tests properly assert and verify the code's behavior. Only a manual review can catch such test quality issues.
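For instance, a test can execute every line of a method, satisfying any coverage tool, while verifying nothing; the OrderParser below is a hypothetical example:

```java
import org.junit.Test;

// Hypothetical parser whose lines are all "covered" by the test below.
class OrderParser {
    int[] parse(String raw) {
        String[] parts = raw.split(";");
        int id = Integer.parseInt(parts[0].split("=")[1]);
        int qty = Integer.parseInt(parts[1].split("=")[1]);
        return new int[] { id, qty };
    }
}

public class OrderParserTest {

    // Passes, and reports 100% line coverage of parse(), yet asserts
    // nothing: a regression returning wrong values would still pass.
    // Only a human reviewer notices the missing verification step.
    @Test
    public void parsesWithoutVerifying() {
        new OrderParser().parse("id=42;qty=3"); // result silently ignored
        // Missing: assertArrayEquals(new int[] {42, 3}, ...);
    }
}
```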

The review process would also promote TDD. If the code submitted for review has no tests, it suggests that the developer did not consider testing during development, that the code might not be testable (or maintainable), and that no self-review occurred. The reviewer can reject such code: if the code is important enough to be peer reviewed, it is important enough to be self-reviewed. The reviewer can also judge whether the coverage for the code is sufficient or excessive.

Reviewing tests would also increase the efficiency of the code review itself. The reviewer sees the code in the context of its tests, gains a better understanding of it, and thus provides better feedback.

The cost-effective way to achieve software testability is to promote TDD and, instead of relying on automated tools, to piggyback on the existing code review process to promote and enforce it.