Animated failure reports with Selenium and Cucumber

If a picture is worth a thousand words, then motion pictures can surely enhance error reports.

Dávid Csákvári

Automated acceptance tests usually consist of multiple steps that describe complex scenarios of user interaction. When such a test fails, it can be hard to determine the exact cause of the failure after the execution. If a picture is worth a thousand words, then motion pictures can surely enhance error reports.

A simple approach ...

With end-to-end tests that document features with Cucumber and simulate user interactions in the browser through Selenium's WebDriver API, the simplest solution is to take a screenshot after a scenario has failed.

@After
public void embedScreenshot(Scenario scenario) {
	if (scenario.isFailed()) {
		try {
			byte[] screenshot = ((TakesScreenshot) FeatureTest.driver)
					.getScreenshotAs(OutputType.BYTES);
			scenario.embed(screenshot, "image/png");
		} catch (Exception e) {
			e.printStackTrace();
		}
	}
}

Although in many cases this approach provides sufficient information to reproduce the bug caught by the test, in complex situations the problem might surface in an earlier step and cause the scenario to fail much later. This can happen if the setup goes wrong undetected, or if the actions described in the When steps do something surprising that the author of the test did not anticipate. One way to collect the missing pieces of information is to rerun the tests in the development environment, which is really time consuming, especially if the test is not fully deterministic and the incorrect behavior cannot be reproduced in a single run. Finally, a single screenshot does not help the communication between the development and testing teams as much as an animation would. And it's much less cool.

A possible solution to these problems is to take a screenshot after the execution of every step, rather than just after the failing one. If a scenario fails, the captured images can be concatenated into an animated GIF and attached to the test report.

For a test suite built on Selenium and Cucumber, this can be achieved with a few extra components. At the time of writing, Cucumber-JVM does not support step hooks, so to do something after every step, the first task is to implement a JUnit RunListener. This works because the Cucumber JUnit runner reports each step to JUnit as a separate test, so testFinished is called after every step.

import org.junit.runner.Description;
import org.junit.runner.notification.RunListener;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;

public class ScreenshotRunListener extends RunListener {
	
	protected static GifAssembler gifAssembler = new GifAssembler();
		
	@Override
	public void testFinished(Description description) throws Exception {
		String details = description.getMethodName();
		byte[] screenshot = ((TakesScreenshot) FeatureTest.driver)
				.getScreenshotAs(OutputType.BYTES);
		gifAssembler.addFrame(details, screenshot);
	}
	
}

With the Maven Surefire or Failsafe plugin, the listener can be registered as follows:

<plugin>
   <groupId>org.apache.maven.plugins</groupId>
   <artifactId>maven-surefire-plugin</artifactId>
   <version>2.18.1</version>
   <configuration>
      <properties>
         <property>
            <name>listener</name>
            <value>hu.advancedweb.gifassembler.ScreenshotRunListener</value>
         </property>
      </properties>
   </configuration>
</plugin>

Then, in case of a failure, a Cucumber hook can generate the animation from the screenshots and attach it to the report.

@After
public void embedScreenshot(Scenario scenario) {
	if (scenario.isFailed()) {
		try {
			byte[] animation = ScreenshotRunListener.gifAssembler.generate();
			scenario.embed(animation, "image/gif");
		} catch (Exception e) {
			e.printStackTrace();
		}
	}
	ScreenshotRunListener.gifAssembler.clearFrames();
}

I think there are a few things that can enhance the animations:

If the last frame has a slight red border, then it can help indicate the location of the detected error in the animation.
Each frame should show the text of the current step along with its position in the scenario, as well as the name or a short description of the executed scenario, to make the events easier to follow. This way the animated GIF can be used as an almost complete error report.
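Both ideas can be sketched with plain java.awt drawing calls. The snippet below is only an illustration under my own assumptions: the class name FrameDecorator, the caption height, the border thickness and the text layout are all made up for this example and are not taken from the related repository.

```java
import java.awt.Color;
import java.awt.Font;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

/** Hypothetical helper that decorates a screenshot before it becomes a GIF frame. */
final class FrameDecorator {

    private static final int CAPTION_HEIGHT = 40; // assumed height of the text bar, in pixels
    private static final int BORDER = 6;          // assumed border thickness, in pixels

    /** Returns a taller copy of the screenshot with a two-line caption bar on top. */
    static BufferedImage withCaption(BufferedImage shot, String scenario,
            String step, int index, int total) {
        BufferedImage frame = new BufferedImage(
                shot.getWidth(), shot.getHeight() + CAPTION_HEIGHT, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = frame.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, frame.getWidth(), CAPTION_HEIGHT);
            g.drawImage(shot, 0, CAPTION_HEIGHT, null);
            g.setColor(Color.BLACK);
            g.setFont(new Font(Font.SANS_SERIF, Font.PLAIN, 12));
            g.drawString(scenario, 8, 15);
            g.drawString("Step " + index + "/" + total + ": " + step, 8, 32);
        } finally {
            g.dispose();
        }
        return frame;
    }

    /** Returns a same-sized copy of the frame with a red border painted over its edges. */
    static BufferedImage withFailureBorder(BufferedImage frame) {
        int w = frame.getWidth();
        int h = frame.getHeight();
        BufferedImage marked = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = marked.createGraphics();
        try {
            g.drawImage(frame, 0, 0, null);
            g.setColor(Color.RED);
            g.fillRect(0, 0, w, BORDER);           // top edge
            g.fillRect(0, h - BORDER, w, BORDER);  // bottom edge
            g.fillRect(0, 0, BORDER, h);           // left edge
            g.fillRect(w - BORDER, 0, BORDER, h);  // right edge
        } finally {
            g.dispose();
        }
        return marked;
    }
}
```

Painting the border over the edges keeps the frame size unchanged, at the cost of hiding a few pixels of the screenshot. Since only the last frame gets the border, that seemed an acceptable trade-off for a sketch.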

The screenshot transformation and GIF generation can be done with the tools in the javax.imageio and java.awt packages. This code can be quite lengthy, at least my version is, as you can see here in the related repository. To increase the fun factor, QA people might even consider including a portrait of the committer who broke the build as the first or last frame of the animation.
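To give an idea of what the GIF generation involves, here is a minimal sketch built only on javax.imageio. It is not the code from the repository: the class name SimpleGifAssembler, the one-second frame delay and the endless looping are assumptions made for this example. Only the method names addFrame, generate and clearFrames mirror the calls used in the hooks above.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageTypeSpecifier;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.metadata.IIOMetadataNode;
import javax.imageio.stream.ImageOutputStream;

/** Hypothetical, simplified assembler: collects PNG screenshots and writes them as one GIF. */
final class SimpleGifAssembler {

    private static final int FRAME_DELAY_MS = 1000; // assumed display time per frame

    private final List<BufferedImage> frames = new ArrayList<>();

    /** Decodes and queues a screenshot; 'details' is only used in the error message here. */
    synchronized void addFrame(String details, byte[] png) throws IOException {
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(png));
        if (image == null) {
            throw new IOException("Unreadable screenshot for: " + details);
        }
        frames.add(image);
    }

    synchronized void clearFrames() {
        frames.clear();
    }

    /** Writes all queued frames into a single animated GIF and returns its bytes. */
    synchronized byte[] generate() throws IOException {
        if (frames.isEmpty()) {
            throw new IllegalStateException("No frames collected");
        }
        ImageWriter writer = ImageIO.getImageWritersByFormatName("gif").next();
        ImageWriteParam params = writer.getDefaultWriteParam();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ImageOutputStream out = ImageIO.createImageOutputStream(bytes)) {
            writer.setOutput(out);
            writer.prepareWriteSequence(null);
            for (int i = 0; i < frames.size(); i++) {
                BufferedImage frame = frames.get(i);
                IIOMetadata metadata = frameMetadata(writer, params, frame, i == 0);
                writer.writeToSequence(new IIOImage(frame, null, metadata), params);
            }
            writer.endWriteSequence();
        } finally {
            writer.dispose();
        }
        return bytes.toByteArray();
    }

    /** Builds per-frame metadata: the delay, plus a loop-forever marker on the first frame. */
    private static IIOMetadata frameMetadata(ImageWriter writer, ImageWriteParam params,
            BufferedImage frame, boolean first) throws IOException {
        IIOMetadata metadata = writer.getDefaultImageMetadata(
                ImageTypeSpecifier.createFromRenderedImage(frame), params);
        String format = metadata.getNativeMetadataFormatName();
        IIOMetadataNode root = (IIOMetadataNode) metadata.getAsTree(format);
        IIOMetadataNode control = child(root, "GraphicControlExtension");
        control.setAttribute("disposalMethod", "none");
        control.setAttribute("userInputFlag", "FALSE");
        control.setAttribute("transparentColorFlag", "FALSE");
        control.setAttribute("transparentColorIndex", "0");
        control.setAttribute("delayTime", Integer.toString(FRAME_DELAY_MS / 10)); // 1/100 s units
        if (first) {
            IIOMetadataNode loop = new IIOMetadataNode("ApplicationExtension");
            loop.setAttribute("applicationID", "NETSCAPE");
            loop.setAttribute("authenticationCode", "2.0");
            loop.setUserObject(new byte[] {1, 0, 0}); // loop count 0 means repeat forever
            child(root, "ApplicationExtensions").appendChild(loop);
        }
        metadata.setFromTree(format, root);
        return metadata;
    }

    /** Finds the named child of the metadata tree, creating it if it is missing. */
    private static IIOMetadataNode child(IIOMetadataNode root, String name) {
        for (int i = 0; i < root.getLength(); i++) {
            if (root.item(i).getNodeName().equalsIgnoreCase(name)) {
                return (IIOMetadataNode) root.item(i);
            }
        }
        IIOMetadataNode node = new IIOMetadataNode(name);
        root.appendChild(node);
        return node;
    }
}
```

Most of the length comes from the GIF metadata tree, which is the only way to set the frame delay and looping with the built-in writer. Captions and borders would be applied to each image before it is queued.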

The result is something like this: Who cares about the logs?

... with some caveats

Although the solution is easy to implement, creating informative animations with low performance impact on the test execution is not simple. The described technique has some drawbacks that may or may not be an issue, depending on what kind of suite you have.

One potential problem is that the animation frames are created immediately after the step execution, so in certain cases the effect of the step may not be visible in the animation. For example, if a step triggers an AJAX call that populates a dynamic list, and the next step selects an element in the list, then the items of the list may not appear in the animation. The screenshot is taken before the call finishes, because it does not benefit from the implicit and explicit waits that take effect before the actions in the subsequent step.
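One way to soften this, at the price of extra waiting, would be to poll a "page has settled" condition before taking the frame. The helper below is a hypothetical sketch in plain Java: the name SettledCapture and its parameters are my own, and what counts as settled is entirely application specific (for example, a check that no AJAX requests are pending). It takes the capture even if the timeout expires, so a frame is never lost.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

/** Hypothetical helper: delays a capture until a condition holds or a timeout expires. */
final class SettledCapture {

    /**
     * Polls 'settled' until it returns true or 'timeout' has elapsed,
     * then runs 'capture' in either case and returns its result.
     */
    static <T> T captureWhenSettled(BooleanSupplier settled, Supplier<T> capture,
            Duration timeout, Duration pollInterval) throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!settled.getAsBoolean() && deadline - System.nanoTime() > 0) {
            Thread.sleep(pollInterval.toMillis());
        }
        return capture.get();
    }
}
```

In the run listener, the capture argument would wrap the getScreenshotAs call. Keeping the timeout short matters, because the wait is paid after every step of every scenario.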

Besides, at least two performance trade-offs should be considered when using this technique.

Regardless of whether the scenario fails, a screenshot is created after every step of every scenario in the suite. Although creating a screenshot is not really expensive, this can make the execution of large test suites a bit slower, so in some cases one might have to look for alternative solutions, like rerunning the failed tests and only then taking the screenshots.

Another possible problem is that the image manipulation can take precious seconds, so one should decide carefully what transformations to apply to the collected screenshots. These transformations are only necessary when a scenario fails, so as long as only a few scenarios fail at a time, it's not a big problem. But when a change to a core component affects multiple tests, the report generation can really slow things down if many tests fail. This can be mitigated if the animation is only created for the first few failures, and only a single screenshot is attached to the report for the rest of the failing tests. If a single modification causes a lot of failures, investigating a few of them probably provides enough information anyway. If this threshold can be parameterized easily, then developers doing ATDD or testing something locally after fixing a bug won't be annoyed by unnecessary GIF generation, as they can switch it off.
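Such a threshold could be as small as a shared counter. The following is a sketch under my own assumptions: the class name AnimationBudget and the system property name report.maxAnimations are invented for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical budget that limits how many failures get a full animation. */
final class AnimationBudget {

    /** Assumed property name; setting it to 0 switches animations off. */
    static final String PROPERTY = "report.maxAnimations";

    private final int limit;
    private final AtomicInteger used = new AtomicInteger();

    AnimationBudget(int limit) {
        this.limit = Math.max(0, limit);
    }

    /** Reads the limit from the system property, falling back to the given default. */
    static AnimationBudget fromSystemProperty(int defaultLimit) {
        return new AnimationBudget(Integer.getInteger(PROPERTY, defaultLimit));
    }

    /** Returns true while animations remain; each true result consumes one. */
    boolean tryAcquire() {
        while (true) {
            int current = used.get();
            if (current >= limit) {
                return false;
            }
            if (used.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }
}
```

The failure hook would call tryAcquire() and embed the animation only when it returns true, falling back to a single screenshot otherwise. Running the suite locally with -Dreport.maxAnimations=0 would then skip the GIF generation entirely.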

Monitoring the performance of the test suite in the Continuous Integration environment is crucial, as it can help detect potential performance problems that arise when tweaking the test suite or just upgrading the dependencies of the application.

So as it stands, I think that for most real-life test suites the disadvantages of this approach outweigh its advantages, so I'll look for alternatives.