Unit tests are good for you and good for your projects. They prevent errors and document the intent and expectations behind the code you've already written. Having them keeps your projects running smoothly. But it seems that hardly anyone teaches how to create them. I'll briefly discuss some of the theory behind unit testing, then move on to how tests should be designed, written, organized, and executed. Finally, I'll wrap up with some discussion of how unit tests can inform and improve the design of your application.

This article is a write-up of a talk I gave for Women Who Code DFW in February 2019.
Update: If you'd like to hear it yourself, the video is on YouTube.

Why do we unit test?

It hopefully seems obvious that you should test your software, and in particular that you should unit test your software. But why? It's not just because I said so, and it's not just because you're supposed to. So often, though, that's how it's treated: as something you're supposed to do. Sort of like sending thank-you cards. Everyone knows they should—that it's the proper thing to do—but they're just too busy, and being proper isn't something they value. Unit tests are not like thank-you cards.

Testing your software in general is about gaining confidence that it works as intended. Confidence allows you to make assurances about the quality and consistency of the application, and similar assurances that your development process is predictable and repeatable. Those qualities are what keep you from being woken up in the middle of the night by surprise outages. And they're what allow you to realistically estimate and plan your work. When you confidently know how long things will take, you can confidently tell the stakeholders how much capacity you have. And when they ask for more, you can show them that there simply isn't more and work from there.

Consistent, reliable, and thorough testing is how you gain that confidence. And unit tests are the fastest, most reliable, and most informative tests you can write. They are also usually the only tests that you can guarantee will be run. Basically, unit tests keep your software healthy. They're like vegetables. You have to eat your vegetables, and not just because I say so. Although I do say so.

How do we unit test?

Unit tests are just functions that call your application code. Those functions also need to test and report on the results of that call. So at the most basic level, unit testing just consists of writing those functions. If you managed to write the app code, then writing test code is not a big deal. It does come with some of its own concerns though, which for some reason no one ever teaches in schools.

Unit tests need to have a test runner. This is also sometimes called a test harness or framework. All of those terms mean basically the same thing. Unit tests are functions, and the test runner is the program that will execute your test functions. That's all. It will probably require you to follow certain syntax or coding conventions in order to register your test functions and/or make them discoverable to the test runner. So you generally need to know which test runner you're using, but the basic concepts are all the same. The other thing that unit tests need is a way to indicate that the test has passed or failed. For that, you need an assertion library. Most of the time, there will be one provided by your test runner, so you don't need to worry about it.

Those are the ingredients; let's put them together. You may be familiar with the DRY principle (Don't Repeat Yourself). This is good advice generally, but tests are a slightly special case. In the case of tests, you should also keep in mind the DAMP principle (Descriptive and Meaningful Phrases). The idea is that the purpose and behavior of a test function should be obvious without having to reference any other part of the source code. In order to achieve that, some amount of code duplication in test functions is acceptable. In real application code, Don't Repeat Yourself continues to be the best advice. Beyond being easily understandable, tests need to give clear and useful results. That's mostly a matter of organization and naming. Tests should be named to describe the scenario they're focused on. The scenario is probably a specific module, behavior, and input. For example: "Authentication denied when password is empty." When that test fails, it is immediately clear where the problem is and what was expected. And finally, tests must be capable of failing. This is hopefully obvious, but you would be amazed how many tests I've seen that could only fail if the app throws an exception, if even then.

Test functions should generally be quite short. They should have a very narrow scope. The common pattern for designing unit tests is Arrange-Act-Assert. You begin by setting up the system under test and getting it into a testable state. This will likely be the most lengthy part of a test. Most test frameworks will give you a way to centralize the majority of your setup and do it outside of individual test methods. What exactly this means will depend on your application and the module you're testing, but it's probably mostly about setting up test data. The act step is simple: you execute the method that you're testing, with the appropriate input. And then assert is also straightforward, at least in theory: you observe the outcome of that execution and record whether it was correct or not.

In practice, making good assertions can actually be complicated. In order for a test to give clear results, it needs to make one single logical assertion. That doesn't mean that it needs to be only a single call of some assertion function. For instance, to check that invalid credentials don't result in a login you would probably want to check both the response object and the session object, like so: assert(response.statusCode).equals(403) and assert(session.userId).equals(null). The other reason that making good assertions can be challenging is in getting access to your module's internal state in order to inspect it. Or, in providing the various resources that your module requires in order to function. In the worst cases, that can mean things like providing working databases. Those kinds of challenges can be mitigated somewhat by generating and using Test Double objects in place of the real dependencies.

Test Doubles are a collection of various kinds of objects that simulate other objects for the purposes of testing. You may have seen them referred to as mocks or stubs or fakes before. Those are all test doubles with slightly different behavior. It's also splitting hairs a little bit, and those terms are all commonly misused, so just know you're in good company if you can't tell or can't remember the difference between a mock and a stub. The goal of using these tools is to isolate the unit of code that you're testing from all of its dependencies and control any side effects it produces. You want to unit test your code, not your dependencies. For the purpose of unit testing, you can assume that your databases and your network services and so on all work as expected. What you want to concern yourself with is the code that you're responsible for, and only in the smallest chunks you can manage. Finally, your tests should be concerned with the behavior of the system under test, not its implementation. If you find yourself wishing you could access the private internals of the object under test, that's a sign your test is going to be brittle. For example, suppose I'm testing a logging utility. I don't really care how my log messages are stored and manipulated internally to that utility. What I care about is that they are properly filtered and formatted when they get written to the logging destination. I would test this by using a mock output stream, and observing the writes that are performed on that stream.

Improving testability

The use of test doubles is primarily about eliminating side effects and hard-to-manage dependencies. The classic example of where this is both very important to do and very hard to do is interactions with a database. Actually writing to a database is a side effect that can make your tests unreliable. Unit tests should be able to run fully independently of each other, repeatedly, and in any order. Doing reads and writes on a real persistent data store is practically guaranteed to break that independence. And even if it doesn't, that real-life data store will make your tests slower and needs to be managed somehow. But at the same time, those parts of your code need to be tested. In fact, they are likely to be the most critical parts, which could cause the most damage if they are not behaving as expected.

The solution to that conundrum is to design for testability. Tests which have difficulty controlling the system's dependencies are probably an indicator that the system (or at least that part of it) is tightly coupled to many things. Or that the system has not been designed following the single responsibility principle. Or both. Either way, the best solution is to refactor your application to make it more testable. The same qualities that make code testable will also make it easier to reason about, and easier to maintain. Code that's testable is simply better than code that isn't. For one thing, it can be covered by reliable unit tests which will give you confidence that any future changes aren't breaking desired behavior. For another, it will have a small and clearly defined purpose, which makes reading and understanding any given unit of code much easier. And it will have fewer dependencies and few if any side effects. Of course, convincing your team and your stakeholders of that is another matter entirely.

Still, let's say that your team is on board and you've got the resources to do that refactor. There are a number of design patterns that can help in that regard. First, you should be adhering to SOLID principles. Learn them. Use them. Love them. Especially the Single responsibility principle. That's good general advice, but we can also talk about some specific patterns.

The Repository Pattern is useful when dealing with app data. A repository is a module that is responsible for managing and providing app data to the rest of the application. So the user authentication component doesn't need to interact with the database. Instead it calls the user repository to get a user matching a given auth token. The user repository is responsible for getting that data from wherever it's stored, and the auth component doesn't need to know or care how that's done.

Dependency Injection helps deal with dependencies (surprising, I know). DI is a design pattern where a module/component/whatever will expect that its dependencies be provided to it, rather than knowing at design time what they are and retrieving them on its own. DI encourages—and to some extent enforces—a looser coupling between components. At the most basic, DI is just a design pattern, and it looks like var Auth = new Authentication(UserRepository). There are also DI frameworks which can centralize the process and require fewer stylistic changes. In that case the Auth constructor doesn't require a special argument and instead internally will do something like var repo = Injector.get(IUserRepository).

The Adapter Pattern helps abstract hard-to-manage dependencies. You might also hear this called a wrapper pattern, or just wrappers. So in this case, instead of relying directly on the database, the UserRepository will depend on a DatabaseAdapter. You would then have much more control over the adapter during testing than you would over a raw database client object. It also gives you flexibility to support different kinds of databases, or even other kinds of data stores.

Aside from data, the other hard part of segregating your application's responsibilities is often the UI. The Model-View-Controller pattern solves that problem. It provides a framework for separating the concerns of managing application state and managing the UI. For testing, it allows you to test virtually all of your application without having to manipulate a UI, or to even create one.

Winning at unit tests

Retrofitting unit tests into an existing code base is hard. The better option is to have them from the beginning. And by beginning, I mean the actual beginning: write the test first, and write the app code after. This kind of test-first process is called Test Driven Development (TDD). Or sometimes Behavior Driven Development. Or Acceptance Test Driven Development. These are all basically the same thing; the difference is more hair-splitting about what style the tests are written in. The important thing is that the tests come first, and application code follows. How that works in practice is often described as Red-Green-Refactor. First you write a test. This test will fail because there's no implementation. Among other things, this actually ensures that the test can fail. It will also show up red in your test reports. Then you add some implementation that passes the test. Now your report is green. And then you refactor to make a better test, or add more of them. Rinse and repeat. The tests specify the required behavior, and they improve throughout development to specify it better. The application code is thus, from the very beginning, required to work in a way that a test can verify. This is a bit of a change to get used to, but it works.


Cover photo by bruce mars on Unsplash