Software developer blog

Test driven development and unit testing frameworks

In the last few years I've been using test driven development, and have grown really accustomed to it. TDD not only helps avoid strange errors, it also helps in software design and implementation, and enforces a pro-active approach to debugging. At first it may seem that writing tests for each and every small part of the code is time wasted, but in fact it is definitely worth the effort, especially if an automated unit test framework is used. As not everyone may be convinced at first, here is a quick summary of how TDD works, how it improves productivity, and what its pros and cons are.

The work-flow starts by writing the test using an automated unit test framework, and seeing it fail. There are several reasons for writing the test first and implementing the feature afterwards, but I'll get to that later. After the test fails as it should, implement the new feature, and test again. If the test still fails, fix the code, and test again. If your new feature works, but tests of earlier features fail, then see how the broken code relies on the functionality you have changed, and make sure that you fix all bugs. Once all tests pass, commit your code to the version control system, and continue by writing the next test, for your next feature. It is important that you always write only as much code as necessary to pass the test.
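As a minimal sketch of this cycle, here is what the first "red" and "green" steps might look like with Python's unittest module (the `slugify` function is a hypothetical example, not anything from this post):

```python
import unittest

# In real TDD the test class below is written FIRST and run against a
# missing or empty slugify, so you see it fail before implementing.

def slugify(title):
    # Only as much code as necessary to pass the current tests.
    return title.strip().lower().replace(" ", "-")

class TestSlugify(unittest.TestCase):
    def test_lowercases_and_joins_words(self):
        self.assertEqual(slugify("Test Driven Development"),
                         "test-driven-development")

    def test_strips_surrounding_whitespace(self):
        self.assertEqual(slugify("  TDD  "), "tdd")

if __name__ == "__main__":
    unittest.main(argv=["tdd-demo"], exit=False)
```

Once both tests pass, the code is committed and the next test is written; any later refactoring of `slugify` is guarded by exactly these tests.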

Okay... now rewind a bit, and let's talk about why it is good practice to write the test first.

  • By writing the test, and seeing it fail, you can make sure that the test actually does what you expect: it fails when the functionality is not present, or not working correctly.
  • Because you are encouraged to write only as much code as necessary to pass the test, unit tests written as part of TDD tend to cover each and every line of code.
  • While writing the test you put yourself in the position of your code's future client. Remember the #1 rule? You should always make sure that your code is easy to use correctly, and hard to use incorrectly. By acting as your own future client, you make sure that you design the interface with the client in mind.
  • When the test is ready you have an example usage of your new class/function, which can also help in understanding the task.
  • If your test is well written it will also contain extremities: situations in which your code should return with a meaningful error. That also helps you remember which properties of the input values should be checked.
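For instance, a well-written test covers not just the happy path but also those extremities. A sketch with a hypothetical `parse_age` function (not from this post):

```python
import unittest

def parse_age(text):
    """Hypothetical function under test: parse a human age from a string."""
    value = int(text)  # raises ValueError for non-numeric input
    if not 0 <= value <= 150:
        raise ValueError("age out of range: %d" % value)
    return value

class TestParseAge(unittest.TestCase):
    def test_happy_path(self):
        self.assertEqual(parse_age("42"), 42)

    # The extremities: inputs for which a meaningful error is expected.
    def test_rejects_negative_age(self):
        with self.assertRaises(ValueError):
            parse_age("-1")

    def test_rejects_non_numeric_input(self):
        with self.assertRaises(ValueError):
            parse_age("not a number")
```

Writing the error-case tests first is exactly what forces you to decide, up front, which properties of the input must be checked.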

These are obviously merits of TDD, but it doesn't stop here:

  • You can have greater confidence in the code: every single line is tested right before it goes into production.
  • Development can be done in baby steps, and baby steps are less prone to error. Without unit tests you may need to implement every case branch at once, because the part of the program calling the code under maintenance uses all branches. That also means testing them all at once, and when there is an error it's your task to find which branch failed. With a unit test you can test each path of execution separately, and change them separately.
  • It makes debugging simple: once the input causing defective behavior is added to the test set, the unit test framework should pinpoint the source of the problem.
  • It enforces a pro-active approach to fixing broken code, thereby significantly reducing the chance of bugs. Without unit tests, broken code may not be discovered right away. It's usually not possible to test every control flow globally, especially not with black-box testing, so the broken function may even make it into a final release, and blow up in the face of your most important customer. That usually leads to hectic bug fixing by caffeine-intoxicated and wired developers wasting tons of time finding the bit of broken code hidden as deep as it possibly can be. On the other hand, unit tests make sure that the first time a part of the existing code base is affected by a change, it will get noticed and fixed. I once led a team of 3 on a 6-month project, and the client couldn't find more than 3 or 4 bugs in the delivered code; fixing them didn't take more than a day.
  • When developing a new feature, the non-TDD approach is to add log messages to the code, and see if we get what we expected. The log messages are either kept or removed. In the case of scripting languages, keeping the log messages has an obvious performance penalty. In languages like C++, where one can remove the log messages from the production release with preprocessor directives, it is not such a big problem, but it does litter the code base with unnecessary lines that can't even be indented properly. Even if all log messages are kept, one usually ends up with a long, winding list of log messages that has to be analyzed manually. On the other hand, by using an automated unit testing framework, one can define what is normal and expected behavior, and the test will only report abnormalities. What's more, tests are not part of the production code, so they neither litter it nor affect performance, so it is possible to keep all of them to aid any future changes and bug fixes.
  • It can be annoying to test new features of complex applications. After each change you may need to wait for execution to reach the function you are implementing, and that can take a lot of time. It's even worse when one is testing an application through its GUI, since one has to click through the same steps over and over. With an automated unit test framework you can test just the single function you are working on. What's more, your application in general may need a large amount of data to work, which needs to be loaded for each run, even if the specific function you are working on has a single input parameter of a built-in type.
  • In some cases you may need to do a little more work than just implement the new functionality, and TDD can help you with that as well. Your first implementation may have shortcomings in many ways: it may be hard to read or understand, it may be suboptimal in performance, or you may realize that certain parts can be generalized, and you can refactor the code to remove repetition of concepts. With TDD you can make all necessary changes without fear of unintentionally breaking things, since you can always run your automated tests to make sure that even the tiniest function is unaffected.
  • It enforces a "keep it simple, stupid" approach. Since you want to make sure that every part of your code is correct, you tend to write smaller functions, usually not more than a dozen lines. That has several advantages: larger functions are more prone to contain code duplication and errors, and they are harder to maintain and understand. For some of you the "function calls are expensive" alert may have gone off at this point, but here is something that should comfort you: smaller functions that are called regularly will stay in the CPU's cache, while larger functions will mostly be accessed from regular RAM. That usually compensates for the cost of the function call, and in some cases it even improves performance despite the extra function calls. Note however that your optimizer can inline functions when it is a good idea (given that the function definition is available at compile time), but it cannot refactor large functions into smaller ones.
  • Since unit tests work on units, it becomes important to have well-separated, independent objects. That in turn enforces strong object-oriented concepts. With unit tests in mind one goes further in terms of encapsulation, isolation, and design patterns. Without unit tests one more easily settles for "just works" instead of "versatile, and easy to use".
  • Since it is a daunting task to set up many global variables and other dependencies, software designers with TDD in mind will try harder to avoid them. That also means better code reusability, and more carefully designed public interfaces. By enforcing unit tests, one can also avoid hidden dependencies that arise from unintentional overuse of global and static variables. Since the unit test fixture has to take care of all dependencies, developers will be encouraged to avoid them, whereas without unit tests the quick and dirty solution may involve adding yet another global variable.
  • As mentioned later among the cons of TDD, it is hard to test database, file operation, network communication, and GUI related code. On the other hand, that encourages the designers and developers of the application to isolate these parts of the code as much as possible. This is known as the Model-View-Controller pattern, where the model encapsulates database and file operations along with possible network operations, the view delivers the result to the user, while the controller takes care of the actual work. Controller tests can be fed data and have their return values checked directly, while model tests will act on mocks. GUI tests are a bit harder to automate, but it is still possible. This separation works best for the controller part, and that in turn means that designs optimized for TDD will inevitably reduce the complexity of the model and the GUI. And here comes the gain: since the model and the view are so well isolated and small for testing purposes, to port the application to another platform one only needs to change a very small part of the code base. The same goes for introducing compatibility with new file formats or database engines.
  • As mentioned earlier, the tests serve as example usage, and they are always up to date, since otherwise they would fail. That also means newcomers to your project will have a perfect place to check how the public interface of each class works. Add comments into the picture, and you have full and up-to-date API documentation. If you are also using Doxygen, writing documentation is not a separate task anymore: it's part of the development process.
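The "test each path of execution separately" point above can be sketched with a hypothetical `shipping_cost` function (an illustration, not from the post), one tiny test per branch:

```python
def shipping_cost(weight_kg, express=False):
    """Hypothetical function: each branch below gets its own unit test."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    base = 5.0 if weight_kg < 2.0 else 9.0   # light vs. heavy branch
    return base * 2.0 if express else base   # express vs. standard branch

# One small test per execution path, runnable and fixable independently:
def test_light_standard():
    assert shipping_cost(1.0) == 5.0

def test_heavy_standard():
    assert shipping_cost(3.0) == 9.0

def test_light_express():
    assert shipping_cost(1.0, express=True) == 10.0

def test_invalid_weight():
    try:
        shipping_cost(0.0)
        assert False, "expected ValueError"
    except ValueError:
        pass

for test in (test_light_standard, test_heavy_standard,
             test_light_express, test_invalid_weight):
    test()
```

When one branch breaks, exactly one test fails, so there is no hunting for which path went wrong.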

Cons... well... more like potential vulnerabilities:

  • As mentioned before, it's hard to test the model and the view. This usually also means that they are designed to be as simple as possible, but that still doesn't make testing them less important. It is possible to use mock objects to test the model. Simulating file operations is simple; database operations and network access are harder. In some cases a test database containing a small amount of data is used as a mock for the database, but the best solution in many cases is to create a simple class that has a map describing appropriate behavior for a pre-defined set of inputs.
  • Since the writer of the test and the feature is the same person, the same issues may get overlooked. For that very reason it is important to have a strong code reviewing culture within the project, one that also includes a review of the tests. Reviewers should feel encouraged to add extra tests if they spot a vulnerability, or even just randomly, to make sure a new client cannot easily invoke situations the developer didn't think of.
  • Since unit tests rely on the public interface of their respective units, correcting a bad design decision that introduced a large number of dependencies may break way too many tests. So when using TDD always think carefully about your design, and take extra care with dependencies.
  • There is a dispute among TDD fans over whether private and protected members should be tested separately. I'm torn on that myself... for private and protected members to get tested, one has to friend the class with the unit testing framework, which is an obvious littering of production code with test code. Also, private functions are expected to be subject to regular changes, hence a simple code optimization or refactoring may break unit tests, when the intention was to use them as a reference for correct behavior of the public interface. On the other hand, even the tiniest unit is worth testing. Bottom line: do as you wish, but if you do test private functions, it might be a good idea to separate those tests from the tests of the public interface.
  • It is easily overlooked, but due to the mocks used, integration testing remains imperative. Unit testing is of course still useful, since it can and will narrow down the location of an error in some situations, thereby reducing debugging time, but it will not replace integration testing.
  • Management may at first come to the conclusion that writing tests at the unit level is an extra task that they are not willing to pay for. In those cases it can be hard to convince them that this extra work is not without its benefits. (Honestly, it wasn't apparent to me either before I actually tried TDD.) Below I'll tell you about a way to convince them. It may not always work, but it's worth a try.
  • While with "greenfield" projects it is easy to start using a TDD approach, and its merits become apparent almost immediately, projects handling a large base of legacy code may need more time to adopt it. In those cases, the best way is to shift to TDD slowly, adding unit tests for a function when it is first modified. That usually also involves some code refactoring so that unit tests become possible, but on the other hand it usually improves code maintainability in the long term anyway. It's even better to try TDD on a new project first, and introduce the technique on legacy code once the team and management are convinced of the potential gain.
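The map-based mock mentioned among the cons might look like this (all names here are hypothetical, for illustration only):

```python
class FakeUserStore:
    """Stands in for the real database layer: just a map from
    pre-defined inputs to canned responses."""
    def __init__(self, canned_responses):
        self.canned = canned_responses

    def find_email(self, user_id):
        return self.canned.get(user_id)  # None for unknown users

def greeting_for(store, user_id):
    """Code under test: it depends only on the store's interface,
    so it works with the real database layer or the fake alike."""
    email = store.find_email(user_id)
    if email is None:
        return "unknown user"
    return "Hello, %s!" % email

# The test wires in the fake instead of a live database:
store = FakeUserStore({1: "ada@example.com"})
assert greeting_for(store, 1) == "Hello, ada@example.com!"
assert greeting_for(store, 2) == "unknown user"
```

The fake is deterministic and needs no setup beyond a dictionary, which is precisely why it often beats a test database for unit-level work.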

The code base of a TDD project contains the tests, and as mentioned among the cons, that also means that management may be reluctant to enforce TDD. Since it's hard to quantify how much work it is to write the tests, and how much work is saved by their merits, it is also pretty hard to come up with a convincing PowerPoint presentation that proves beyond doubt that TDD is a productivity enhancer. What's more, in the first few weeks after TDD is introduced as a development process, it may still look like an overhead. So how to convince the infidel?

Martin Fowler once said: "Whenever you are tempted to type something into a print statement or a debugger expression, write it as a test instead." Well... that is still not quite the TDD approach, but it does result in a number of automated unit tests. It might even be somewhat organized. Since this approach only suggests that an already existing task be done differently, and in a more organized manner, no sane manager would say no to it. Although the tests will not cover every line (whereas with TDD that is almost guaranteed), the existing tests will suffice to speed up the development cycle, and significantly decrease the number of bugs reported during integration tests and after delivery. Those, on the other hand, are measurable values that may easily convince people of TDD's real merits. Since the framework is already available, and the project has already shifted towards a test-friendly design, it is relatively easy to start enforcing TDD at this point.
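Fowler's suggestion can be sketched in a few lines; `apply_discount` below is a hypothetical function under development, not anything from the post:

```python
# Instead of eyeballing output while developing (the print-statement habit):
#   print(apply_discount(100.0, 0.2))
# capture the expectation as a test that keeps running on every build.

def apply_discount(price, rate):
    """Hypothetical function under development."""
    return round(price * (1.0 - rate), 2)

def test_apply_discount():
    assert apply_discount(100.0, 0.2) == 80.0   # what the print would show
    assert apply_discount(19.99, 0.0) == 19.99  # boundary: no discount

test_apply_discount()
```

The effort is the same as typing the print statement, but the check is permanent, automatic, and self-documenting.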

As I mentioned at the beginning, this is mostly a summary, with my own thoughts added. So partly as a reference, as well as links for further reading, here are the materials I have read and found useful on the topic:

And of course after talking so much about the wonderful automated unit test frameworks, here are the ones I've been using, or heard good things about:

For a more comprehensive list of frameworks, consult Wikipedia.


Edit: I just found a pretty good slideshow on the topic. If you prefer small bits of information to longer reads, then this one is for you.
