Calm assertions with Spock
A post about some problems I've run into with Hamcrest and some solutions provided by the Spock Framework.
Expressing assertions is usually the toughest part of writing a test case. The rest of the test exercises code that was made to be used in production, but assertions are usually made of custom logic that lives only in tests to ensure the correct behaviour.
So, while I think it's natural to write the given and the when in most tests (not counting mocks), the then part requires some attention to get right:
- It has to be easy to read. Tests are documentation, and hopefully will be read more times than (re)written, so it's worth the effort.
- It has to be easy to write. Don't let some technical stuff get in the way of writing tests and halt the flow of development.
- It has to produce meaningful error messages. This is important to reduce the need to debug tests, especially when learning a new API or investigating problems caused by changes in the system under test.
The Problem
To express assertions for different kinds of tests in Java I mostly use Hamcrest, an excellent library that provides an easily extensible DSL for defining complex assertion rules. The basic idea is really simple: one can use, or even define, small building blocks called Matchers. Matchers are essentially functions that operate on the subject of the test, implemented by immutable objects that do not manage any state. They can be combined, and the behaviour of the assertion is essentially determined by the combination of these objects.
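To make the composition idea concrete, here is a minimal sketch in plain Java. The SimpleMatcher interface and the MatcherSketch class are hypothetical stand-ins invented for this illustration, not Hamcrest's actual types; only the names equalTo and not mirror the Hamcrest matchers used later in this post.

```java
import java.util.Objects;

// Hypothetical, simplified stand-in for the matcher idea; NOT Hamcrest's real API.
interface SimpleMatcher<T> {
    boolean matches(T actual);
    String describe();
}

final class MatcherSketch {
    private MatcherSketch() {}

    // Leaf matcher: captures only the expected value and holds no mutable state.
    static <T> SimpleMatcher<T> equalTo(T expected) {
        return new SimpleMatcher<T>() {
            public boolean matches(T actual) { return Objects.equals(expected, actual); }
            public String describe() { return "<" + expected + ">"; }
        };
    }

    // Combinator: wraps another matcher and inverts its verdict.
    static <T> SimpleMatcher<T> not(SimpleMatcher<T> inner) {
        return new SimpleMatcher<T>() {
            public boolean matches(T actual) { return !inner.matches(actual); }
            public String describe() { return "not " + inner.describe(); }
        };
    }

    // Returns null on success, otherwise a message assembled from the matcher's description.
    static <T> String check(T actual, SimpleMatcher<T> matcher) {
        if (matcher.matches(actual)) {
            return null;
        }
        return "Expected: " + matcher.describe() + " but: was <" + actual + ">";
    }

    public static void main(String[] args) {
        // The combined matcher rejects 4, so a failure message is printed.
        System.out.println(check(4, not(equalTo(4))));
    }
}
```

The behaviour of the final check comes entirely from how the small pieces are nested, which is the flexibility, and also the mental overhead, discussed in this section.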
It's really flexible and extensible, but for me this kind of functional composition always felt a bit unnatural to Java. While the core concepts are simple, it has a steep learning curve, and it's not always trivial to please the Java type checker while specifying the test assertions with Matchers. For me at least, usually there is a non-trivial mental transition between thinking about what to do and how to express it.
The Treatment
There are a lot of solutions for these problems. I thought I'd give the Spock Framework a try for testing Java applications, mainly seeking some relief from writing complicated assertions.
First dosage
Setting up Spock is pretty easy in almost any build environment, as the official example project provides guidance for many tools.
To make it work with Maven, one has to declare org.spockframework:spock-core as a test dependency of the project and set up the appropriate plugin to compile the Groovy code. In theory, the other plugins seen in the example pom.xml are not mandatory, but it is better to set the useFile configuration parameter to false for maven-surefire-plugin, as the default setting prevents Spock's detailed error messages from showing.
So, a minimalist configuration looks as follows:
<build>
  <plugins>
    <plugin>
      <groupId>org.codehaus.gmavenplus</groupId>
      <artifactId>gmavenplus-plugin</artifactId>
      <version>1.4</version>
      <executions>
        <execution>
          <goals>
            <goal>compile</goal>
            <goal>testCompile</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
    <plugin>
      <artifactId>maven-surefire-plugin</artifactId>
      <version>2.6</version>
      <configuration>
        <useFile>false</useFile>
      </configuration>
    </plugin>
  </plugins>
</build>
<dependencies>
  <dependency>
    <groupId>org.spockframework</groupId>
    <artifactId>spock-core</artifactId>
    <version>1.0-groovy-2.4</version>
    <scope>test</scope>
  </dependency>
</dependencies>
To get started, one has to extend the development environment to support Groovy. Most IDEs have plugins for the language that integrate nicely with the existing ecosystem. I tried Eclipse Mars with the Groovy feature installed, and it's almost as natural as editing Java with JDT. There are a few glitches: for example, navigating back and forth between the last edited locations can go haywire when jumping between Java and Groovy sources. Usually it works fine, though, and most of the essential functions are in place, such as code completion, jump to method definition and refactoring, even if the other side of the call is implemented in Java.
The therapy
Although I have just started experimenting with Spock, I'd like to share my findings with you in this post.
The most obvious benefit, which one notices immediately, is that most assertions can be expressed without any kind of DSL. With no extra testing API involved, tests are more straightforward to write and easier to read.
For example, let's assume that our custom implementation is expected to always return a value different from the previous one, and we'd like to write a test for it. (I know, testing a random generator is usually a bad idea, but bear with me for the example.)
The required code in Java with Hamcrest:
@Test
public void should_produce_different_values() throws Exception {
    // Given
    MyRandom myRandom = new MyRandom();
    // When
    int random1 = myRandom.getRandomNumber();
    int random2 = myRandom.getRandomNumber();
    // Then
    assertThat(random1, not(equalTo(random2)));
}
While the corresponding test case in Groovy with Spock:
def "should produce different values"() {
    given:
    MyRandom myRandom = new MyRandom();
    when:
    int random1 = myRandom.getRandomNumber();
    int random2 = myRandom.getRandomNumber();
    then:
    random1 != random2;
}
As you can see in the example above, Spock encourages BDD style tests written in given-when-then fashion, providing an opportunity to naturally divide these parts. Integrating with Java code, it's also useful that Groovy can be written in a very similar syntax. This way the tests can be used as living examples of using the API under test as if they were written in Java.
In the first example, one has to look up the documentation for the available matchers, as its functional-composition-based nature makes it hard for IDEs to offer useful suggestions. This is not the case with the Spock version, which is easy to both write and read.
While in many cases it's not necessary, in complicated situations the DSL provided by Hamcrest matchers can make the test easier to read. With Spock, you can still use them if you'd like.
then:
    that myList, containsInAnyOrder("hello", "world")
Although it's hard to achieve for all cases, I think it's better to strive for asserts made from calls just to regular APIs:
then:
    myList.containsAll(["world", "hello"])
Keeping it simple can save one from picking the wrong word in the DSL (for example, Matchers.hasItem vs. Matchers.contains) or from trying to force an expression down the throat of the compiler.
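One caveat when swapping a matcher for a regular API call: the two assertions above are not strictly equivalent. Hamcrest's containsInAnyOrder only matches when the list holds exactly the given items, while List.containsAll is a subset check that ignores extra elements. The plain Java sketch below illustrates the difference; the class and method names are made up for this example, and sameItemsInAnyOrder is only an approximation of the stricter semantics, not Hamcrest's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, used only to contrast the two kinds of checks.
final class ListAssertionSketch {
    private ListAssertionSketch() {}

    // Subset check: every expected item is present; extra items are ignored.
    static boolean hasAll(List<String> actual, List<String> expected) {
        return actual.containsAll(expected);
    }

    // Stricter check: the same items with the same multiplicity, in any order.
    static boolean sameItemsInAnyOrder(List<String> actual, List<String> expected) {
        if (actual.size() != expected.size()) {
            return false;
        }
        List<String> remaining = new ArrayList<>(actual);
        for (String item : expected) {
            // remove(Object) deletes a single occurrence and reports whether it was found
            if (!remaining.remove(item)) {
                return false;
            }
        }
        return remaining.isEmpty();
    }

    public static void main(String[] args) {
        List<String> myList = Arrays.asList("hello", "world", "again");
        List<String> expected = Arrays.asList("world", "hello");
        System.out.println(hasAll(myList, expected));              // prints "true"
        System.out.println(sameItemsInAnyOrder(myList, expected)); // prints "false"
    }
}
```

If the extra element matters for the test, the simple containsAll call needs a companion size check, or the matcher is the better fit after all.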
Another clear benefit is that Spock produces detailed error messages when a test breaks. This is also true for Hamcrest matchers, but I think they are a bit harder to decode.
Suppose that the MyRandom in the first example was written by someone who values a classic joke more than a faithful random implementation.
The assertion error generated by Hamcrest and Spock tests, respectively:
java.lang.AssertionError:
Expected: not <4>
     but: was <4>
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
    at hu.awm.MyRandomTest.should_produce_different_values(MyRandomTest.java:18)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    ... and many more ...

Condition not satisfied:
random1 != random2
|       |  |
4       |  4
        false
    at MyRandomTest.should produce different values(MyRandomTest.groovy:13)
Both frameworks tell the important stuff, but the output produced by Spock is much clearer, containing almost no noise.
Spock's report stays just as readable for a compound condition, for example:
john.getAge() > jane.getAge() || john.getName() == jane.getName();
Condition not satisfied:
john.getAge() > jane.getAge() || john.getName() == jane.getName()
|    |        | |    |        |  |    |         |  |    |
|    20       | |    21       |  |    John Doe  |  |    Jane Doe
|             | |             |  |              |  Person [name=Jane Doe, age=21]
|             | |             |  |              false
|             | |             |  |              3 differences (62% similarity)
|             | |             |  |              J(oh)n(-) Doe
|             | |             |  |              J(a-)n(e) Doe
|             | |             |  Person [name=John Doe, age=20]
|             | |             false
|             | Person [name=Jane Doe, age=21]
|             false
Person [name=John Doe, age=20]
    at hu.awm.MyRandomTest.should produce different values(PersonTest.groovy:11)
The output contains the string representation of every object involved in the assertion, even if the condition consists of multiple logical parts. With Java/Hamcrest, one would have to make a separate assertion for every interesting piece, or create a custom matcher, to achieve a similar effect.
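As a rough illustration of that extra effort, this is what a hand-built report for the same condition could look like in plain Java. It is only a sketch: the Person class is a hypothetical reconstruction based on the toString output shown above, and checkOlderOrSameName is a made-up helper, not part of Hamcrest or Spock.

```java
// Hypothetical sketch: building a failure report by hand for a compound condition.
final class ConditionReportSketch {
    private ConditionReportSketch() {}

    // Assumed shape of Person, inferred from the "Person [name=..., age=...]" output.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }

        @Override
        public String toString() {
            return "Person [name=" + name + ", age=" + age + "]";
        }
    }

    // Returns null when the condition holds, otherwise a manually assembled report.
    static String checkOlderOrSameName(Person john, Person jane) {
        boolean older = john.getAge() > jane.getAge();
        // Groovy's == on strings compares by value, so equals() mirrors it here.
        boolean sameName = john.getName().equals(jane.getName());
        if (older || sameName) {
            return null;
        }
        return "Condition not satisfied: " + john + " vs. " + jane
                + " (older: " + older + ", same name: " + sameName + ")";
    }

    public static void main(String[] args) {
        Person john = new Person("John Doe", 20);
        Person jane = new Person("Jane Doe", 21);
        System.out.println(checkOlderOrSameName(john, jane));
    }
}
```

Every intermediate value that should appear in the message has to be captured and formatted explicitly, which Spock does automatically for each subexpression.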
Using Hamcrest matchers with Spock produces both error outputs, which is really useful. One can use complex matchers without having to give up the benefits of Spock's detailed error reporting.
Cautions and side effects
I think Hamcrest is a really good library. The issues I mentioned can be mitigated in many ways, and there is a chance that these things bother me more than they bother others. The goal of this post was not to discourage its use.
I am only at the beginning of piloting Spock, so everything is unicorns and rainbows for now; I expect there will be a follow-up post once things have settled. So far, it seems to have a lot of clear advantages and is really easy to get started with. I recommend trying it if you haven't already.
There are a few things to watch out for. Groovy might be designed to work well with Java, but it's still a different language. It requires its own tools and expertise, which might make it an unsuitable choice in some cases. Also keep the key differences between Groovy and Java in mind.
Spock and Groovy make writing assertions fun and simple, which is a really important factor in motivating people to write tests.