The Anatomy of Test Flakiness: A Systematic Approach

Test flakiness is the slow death of a test suite. It starts innocuously — one or two tests that "sometimes fail" — and ends with a team that ignores CI failures because "it's probably just that flaky test again."

At Worknoobs, I walked into exactly that situation.

Diagnosing the Problem

My first step was not to fix anything. It was to *measure* everything:

- Failure frequency per test (over 30 CI runs)

- Failure mode categorisation (timeout, assertion, element not found, network)

- Correlation between failures (did tests fail together in the same run, or only on particular workers?)
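
The measurements above can be sketched as a small aggregation over raw CI results. This is a minimal illustration, not the tooling actually used: the TestResult shape, the failure-mode labels and both function names are assumptions made for the example, and it presumes one result record per test per CI run.

```typescript
// Hypothetical failure categories, mirroring the list above.
type FailureMode = "timeout" | "assertion" | "element-not-found" | "network";

// Hypothetical record shape: one entry per test per CI run.
interface TestResult {
  runId: number;
  workerId: number;
  testName: string;
  passed: boolean;
  failureMode?: FailureMode; // set only when passed is false
}

interface FlakinessStats {
  testName: string;
  runs: number;
  failures: number;
  failureRate: number;
  modes: Partial<Record<FailureMode, number>>;
}

// Failure frequency and failure-mode breakdown per test, most flaky first.
function summariseFlakiness(results: TestResult[]): FlakinessStats[] {
  const byTest = new Map<string, FlakinessStats>();
  for (const r of results) {
    let stats = byTest.get(r.testName);
    if (!stats) {
      stats = { testName: r.testName, runs: 0, failures: 0, failureRate: 0, modes: {} };
      byTest.set(r.testName, stats);
    }
    stats.runs += 1;
    if (!r.passed) {
      stats.failures += 1;
      if (r.failureMode) {
        stats.modes[r.failureMode] = (stats.modes[r.failureMode] || 0) + 1;
      }
    }
  }
  const all = Array.from(byTest.values());
  for (const s of all) {
    s.failureRate = s.failures / s.runs;
  }
  return all.sort((a, b) => b.failureRate - a.failureRate);
}

// Run IDs in which both tests failed: a rough signal of a shared root cause.
function coFailingRuns(results: TestResult[], testA: string, testB: string): number[] {
  const failedRuns = (name: string): Set<number> =>
    new Set(results.filter((r) => r.testName === name && !r.passed).map((r) => r.runId));
  const a = failedRuns(testA);
  const b = failedRuns(testB);
  return Array.from(a).filter((id) => b.has(id)).sort((x, y) => x - y);
}
```

Fed with the results of the last 30 runs, summariseFlakiness gives a ranked list of the worst offenders, and coFailingRuns shows whether two tests tend to fail in the same runs.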

This immediately revealed three categories of root cause:

1. Async Payment State Transitions

The most common failure pattern was tests asserting on payment status before the backend had finished processing. The tests relied on fixed waitForTimeout(2000) calls, a classic anti-pattern: a fixed sleep is too short when the backend is slow and wasted time when it is fast.

The fix was to replace every fixed wait with a condition-based wait. Playwright's expect assertions retry until the condition holds or the timeout expires:

// Before — fragile
await page.waitForTimeout(2000);
await expect(statusBadge).toHaveText('Confirmed');

// After — robust
await expect(statusBadge).toHaveText('Confirmed', { timeout: 15000 });

2. Shared Auth State Between Workers

The test suite was using a single auth file shared across parallel workers. Worker 2 would invalidate Worker 1's session mid-test by triggering a conflicting login.

The solution was a worker-slot namespacing system — each parallel Cucumber worker gets its own pre-authenticated session file, keyed by CUCUMBER_WORKER_ID. No more session conflicts.
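
A minimal sketch of the namespacing idea, assuming the session files live in a directory such as .auth and are named worker-<id>.json. The function name, the directory and the file-name pattern are all hypothetical, as is the fallback to slot "0" when the variable is not set.

```typescript
// Build the path of the pre-authenticated session file for one parallel worker.
// `authDir` and the file-name pattern are illustrative choices, not the original setup.
function authStatePathForWorker(
  workerId: string | undefined,
  authDir: string = ".auth",
): string {
  // Assumption: an unset or empty id means a serial run, so use slot "0".
  const slot = workerId !== undefined && workerId !== "" ? workerId : "0";
  // Reject anything that could escape the auth directory.
  if (!/^[A-Za-z0-9_-]+$/.test(slot)) {
    throw new Error(`Unexpected worker id: ${slot}`);
  }
  return `${authDir}/worker-${slot}.json`;
}
```

In a Cucumber hook this would be called with process.env.CUCUMBER_WORKER_ID, and the resulting path passed as the storageState option of Playwright's browser.newContext(), so that each worker loads only its own session.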

3. Dynamic UI Renders

Product listings were being loaded asynchronously, and tests were selecting by index (nth(0)) before the list had fully populated. The fix was to always wait for a specific, stable element count before proceeding.
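
For a known item count, Playwright's built-in expect(locator).toHaveCount(n) already retries until the count matches. The sketch below shows the same idea as a standalone helper that also requires the count to hold across consecutive reads; the helper name, its options and their defaults are assumptions for illustration only.

```typescript
// Poll `readCount` until it returns `expected` on `stableReads` consecutive reads,
// or throw once `timeoutMs` has elapsed. All option names and defaults are hypothetical.
async function waitForStableCount(
  readCount: () => Promise<number>,
  expected: number,
  opts: { timeoutMs?: number; intervalMs?: number; stableReads?: number } = {},
): Promise<void> {
  const { timeoutMs = 15000, intervalMs = 100, stableReads = 2 } = opts;
  const deadline = Date.now() + timeoutMs;
  let streak = 0; // consecutive reads that matched `expected`
  let last = -1;
  while (true) {
    last = await readCount();
    streak = last === expected ? streak + 1 : 0;
    if (streak >= stableReads) {
      return;
    }
    if (Date.now() >= deadline) {
      throw new Error(`Expected ${expected} items, last saw ${last} after ${timeoutMs} ms`);
    }
    // Sleep briefly before the next read.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With Playwright, readCount could be something like () => page.locator('.product-card').count(), where the selector is a placeholder; only after the helper resolves would the test select nth(0).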

The Result

Within six weeks, flakiness went from endemic to near zero. More importantly, the team started trusting CI again, which is the metric that really matters.

Emmanuel Eko

SDET & QA Architect