End the war with flaky tests with Gunjan Arora

Meet Gunjan Arora, our Principal Quality Engineer at Nine. She shares her tips on beating flaky tests.

Picture this: you’re just finishing your morning coffee when your automated test suite pings to say its run has completed. You notice that a couple of tests have failed and check, with some curiosity, to see which ones … and then your heart sinks as you realise that one of them has failed for no good reason in the past.

Is it really a bug? 

Or a false positive? 

How frustrating!

If you have experienced this, you will know the pain of a flaky test.

A flaky test is a test that can both pass and fail with the same code and configuration. Ideally, a test should consistently pass or consistently fail as long as no code changes are applied. Flaky tests are especially annoying to engineers because their failures do not always indicate bugs in the code.

They can also be quite costly, since engineers often have to retrigger entire jobs on a CI pipeline and then waste a lot of time waiting for them to complete successfully.

The real cost of test flakiness is a lack of confidence in your tests. 

Tracking flakiness

There are a few things you can do to help identify flakiness in your test suite. 

  1. Look for timeouts. If you have a complex test relying on asynchronous services, try to check that those services are available and stable before starting the test.
  2. Visualise test runs. A simple analytics report lets teams see their tests at a glance: which ones are fast, which are slow, and which are flaky.
  3. Use tools to detect flakiness. There are tools on the market, or you can write a custom bot to help you identify flaky tests.
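
As a rough illustration of what such a custom bot might check, here is a minimal sketch in Python. It assumes you can export your test history as (test name, commit, passed) records; that record shape and the function name `find_flaky_tests` are made up for this example and do not come from any particular tool. A test is reported as flaky when it has both passed and failed on the same commit.

```python
from collections import defaultdict


def find_flaky_tests(runs):
    """Return names of tests that both passed and failed on the same commit.

    `runs` is an iterable of (test_name, commit, passed) tuples. This record
    shape is an assumption made for the sketch, not a real tool's format.
    """
    outcomes = defaultdict(set)
    for test_name, commit, passed in runs:
        outcomes[(test_name, commit)].add(passed)
    # Mixed results on an unchanged commit mean the outcome is not deterministic.
    return sorted({name for (name, _), seen in outcomes.items() if len(seen) == 2})


# Hypothetical history: test_login flips on one commit, test_checkout only
# changes its result when the code changes.
history = [
    ("test_login", "abc123", True),
    ("test_login", "abc123", False),
    ("test_checkout", "abc123", True),
    ("test_checkout", "def456", False),
]
print(find_flaky_tests(history))  # prints ['test_login']
```

Note that test_checkout is not reported: its result changed along with the commit, which could point to a real bug rather than flakiness.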

How to deal with flaky tests

  1. Don’t just blindly rerun random failures. Instead, quarantine them and fix them systematically.
  2. Understand the nature of the failure. Sometimes a simple fix, such as adding a retry, adding a custom wait method or updating the element selector, solves the problem.
  3. A test that fails is not always flaky. Rather than assuming a test code issue, also check for bugs in the application or environment that could be causing the failure.
  4. Run the test in isolation multiple times to look for a common theme, such as whether the failure is tied to a particular browser, device or piece of functionality.
  5. Do not get overwhelmed by failing tests. It’s always good to pair with another engineer to get a fresh perspective.
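
Two of the tips above, the custom wait method and running a test in isolation multiple times, can be sketched in a few lines of Python. This is only one possible shape: the helper names `wait_until` and `count_failures`, the default timeout and the default number of runs are all assumptions for illustration, and real UI frameworks usually ship their own wait utilities.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.1):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def count_failures(test_fn, runs=20):
    """Run `test_fn` repeatedly and count how many runs raise AssertionError."""
    failures = 0
    for _ in range(runs):
        try:
            test_fn()
        except AssertionError:
            failures += 1
    return failures


# Hypothetical readiness check: reports "ready" on the third poll.
attempts = {"n": 0}


def service_ready():
    attempts["n"] += 1
    return attempts["n"] >= 3


print(wait_until(service_ready, timeout=2.0, interval=0.01))  # prints True
```

A polling wait like this replaces fixed sleeps, which are a common source of timing-related failures. With `count_failures`, a result that is neither zero nor equal to `runs` suggests the test really is nondeterministic, while a test that fails on every run is more likely pointing at a genuine bug.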

The main thing to keep in mind is that when a test fails, there is definitely something wrong. Flakiness is not a fluke. Trust your tests and start debugging, so you can avoid the distress on your next test run and, more importantly, enjoy your morning coffee!
