PHP: random failures in tests

Apr 1, 2025 - 12:15

Context:

Your tests have been passing so far and, suddenly, they start to fail.

Why?

While it's easy to write basic unit tests, it can be tricky when dealing with dates, complex calculations, or random values.

Errors with dates and intervals

While it's usually a good practice to stick with the native API (regardless of the programming language), I recommend Carbon, especially in unit tests:

it just works
it's more readable
it's more precise
it's only an additional layer, as it extends the built-in DateTime
it simplifies tests
it's already bundled with major frameworks such as Laravel

It helps you avoid odd situations with dates:

mismatch due to server configuration: pipelines (CI/CD) vs. local machines
mismatch due to overflows: assertions fail only in some months, because adding a month to a date such as January 31 rolls over into March
mismatch due to timezone differences and microseconds
mismatch due to time variability: your tests depend on the current date but the current date is not mocked correctly
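
To make the overflow and time-variability cases concrete, here is a minimal sketch that uses only the native DateTimeImmutable API. The isExpired() helper and its parameters are hypothetical and exist only for illustration. Carbon offers comparable conveniences (for instance Carbon::setTestNow() to freeze the current time and addMonthNoOverflow() to avoid the rollover), which this sketch deliberately does not depend on.

```php
<?php

// Overflow: January 31 + 1 month targets the non-existent "February 31",
// which PHP rolls over into March.
$start = new DateTimeImmutable('2025-01-31 12:00:00', new DateTimeZone('UTC'));
$overflowed = $start->modify('+1 month');

// For month-end dates, the native API can clamp to the end of the next month.
$clamped = $start->modify('last day of next month');

echo $overflowed->format('Y-m-d'), PHP_EOL; // 2025-03-03
echo $clamped->format('Y-m-d'), PHP_EOL;    // 2025-02-28

// Time variability: pass "now" in instead of reading the system clock.
// isExpired() is a hypothetical helper used only for illustration.
function isExpired(DateTimeImmutable $expiresAt, DateTimeImmutable $now): bool
{
    return $expiresAt < $now;
}

// The test pins the current time, so the result never depends on the real date.
$now = new DateTimeImmutable('2025-04-01 12:15:00', new DateTimeZone('UTC'));
$expired = isExpired(
    new DateTimeImmutable('2025-03-31 23:59:59', new DateTimeZone('UTC')),
    $now
);
```

Setting the timezone explicitly on every date also removes the dependency on the server configuration, which is one of the mismatches listed above.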

Testable code vs. complex logic

The business logic is the heart of your job.

Such logic often needs to cover various cases, which can be hard to implement.

Because you have to make it work, you will get there eventually, but the code might not be maintainable.

Writing testable code (e.g., following SOLID principles, DRY, composition over inheritance) simplifies tests drastically.

Here are some red flags:

the tests replicate the implementation when they should focus on behavior instead
race conditions: your logic involves concurrent operations that fail randomly
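
The first red flag can be illustrated with a small sketch. The applyDiscount() function is hypothetical, and the expected values are assumed to come from a specification rather than from the code itself.

```php
<?php

// Hypothetical function under test: prices are integer cents to avoid float drift.
function applyDiscount(int $priceInCents, int $percent): int
{
    return $priceInCents - intdiv($priceInCents * $percent, 100);
}

// Red flag: the expectation is recomputed with the same formula as the
// implementation, so a bug in the formula is copied into the test.
$mirrored = applyDiscount(1999, 15) === 1999 - intdiv(1999 * 15, 100);

// Behavior-focused: the expectation is a literal taken from the specification.
$behavior = applyDiscount(2000, 25) === 1500;
```

The mirrored check can never fail, whatever the formula does, which is why it gives no real protection.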

Property-based testing: intermittent failures

Property-based testing is a popular approach.

You can use randomized inputs to cover more cases.

In other words, you don't write values/examples manually. Every time, the computer generates a new set of inputs.

This is pretty cool, but make sure you do the following first:

think about the specification very carefully
fine-tune properties according to your business logic (e.g., unsupported values, scopes)

Besides, be aware it does not replace unit tests (example-based tests), so do not rely solely on property-based testing.

Otherwise, you may encounter the following issues:

wrong or incomplete notion of correctness
overuse leading to slower development
performance issues

Relying too much on randomized inputs can also lead to a false sense of security.

It can be hard to cover all scenarios.
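
One common way to keep randomized runs reproducible is to derive every input from a seed and report that seed on failure. The sketch below is a hand-rolled illustration, not a property-based testing library; clamp() and checkClampProperty() are hypothetical names.

```php
<?php

// Hypothetical function under test: keeps a value inside [min, max].
function clamp(int $value, int $min, int $max): int
{
    return max($min, min($max, $value));
}

// Runs the property against pseudo-random inputs derived from $seed.
// Returns null when the property holds, or a message describing the failure.
function checkClampProperty(int $seed, int $runs = 200): ?string
{
    mt_srand($seed); // same seed => same sequence of inputs

    for ($i = 0; $i < $runs; $i++) {
        $min = mt_rand(-1000, 1000);
        $max = mt_rand($min, 1000); // scope: only valid ranges (min <= max)
        $value = mt_rand(-5000, 5000);

        $result = clamp($value, $min, $max);

        // Property: the result always lies within the bounds.
        if ($result < $min || $result > $max) {
            return "seed=$seed value=$value min=$min max=$max result=$result";
        }
    }

    return null;
}

// In a real suite the seed could come from random_int() and be logged;
// here it is fixed so the run is repeatable.
$failure = checkClampProperty(12345);
```

With the seed in the failure message, an intermittent failure seen in a pipeline can be replayed locally with the same inputs.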

Manual "hacks"

Of course it's bad, but it's not uncommon to find some hacks, especially in legacy projects.

These hacks do the job but aren't maintainable.

For example:

sleep(9);

The code above waits for an arbitrary 9 seconds, in the hope that some real behavior has completed by then.

However, it's not reliable and may lead to random failures in your pipelines.
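
A possible alternative to a fixed sleep is to poll for the condition you are actually waiting for, with an explicit timeout. The waitUntil() helper below is a hypothetical sketch, and the closure only simulates work that becomes ready after a few checks.

```php
<?php

// Hypothetical helper: polls $condition until it returns true or the timeout expires.
function waitUntil(
    callable $condition,
    float $timeoutSeconds = 2.0,
    int $intervalMicroseconds = 10000
): bool {
    $deadline = microtime(true) + $timeoutSeconds;

    do {
        if ($condition()) {
            return true; // stop as soon as the condition holds
        }
        usleep($intervalMicroseconds);
    } while (microtime(true) < $deadline);

    return false; // explicit timeout instead of a silent race
}

// Simulated asynchronous work that becomes "ready" on the third check.
$checks = 0;
$ready = waitUntil(function () use (&$checks): bool {
    return ++$checks >= 3;
});
```

The test then waits only as long as needed and fails with a clear timeout when the condition never becomes true.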

The `Faker` trap

Faker is a very popular library to generate fake data:

$email1 = $faker->email();
$email2 = $faker->email();

In the code above, $email1 and $email2 can occasionally be identical, which makes any assertion that expects them to differ fail randomly.

In this case, you probably don't need Faker at all, particularly when the test relies on the two values being different.

If you want to test a set of inputs, you may use data providers instead.
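
As a rough illustration, the sketch below shapes a fixed set of inputs the way a PHPUnit data provider would (named cases mapping to arguments) and loops over it in plain PHP. normalizeEmail() and emailCases() are hypothetical; in a PHPUnit test class you would attach such a method to a test with the data provider annotation instead of looping manually.

```php
<?php

// Hypothetical function under test.
function normalizeEmail(string $email): string
{
    return strtolower(trim($email));
}

// Shaped like a PHPUnit data provider: case name => [input, expected].
function emailCases(): array
{
    return [
        'already normalized' => ['jane@example.com', 'jane@example.com'],
        'upper case'         => ['Jane@Example.COM', 'jane@example.com'],
        'surrounding spaces' => ['  jane@example.com ', 'jane@example.com'],
    ];
}

// Every run checks exactly the same explicit values, so nothing can collide.
$failures = [];
foreach (emailCases() as $name => [$input, $expected]) {
    if (normalizeEmail($input) !== $expected) {
        $failures[] = $name;
    }
}
```

Named cases also make a failure easier to read, since the report can point at the exact input that broke.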

Array comparisons

You may encounter random errors when testing that two arrays are equal or not equal.

That's often because the two arrays contain the same values in a different order:

$this->assertEquals([2, 1], [1, 2]);

PHPUnit has specific methods to assess equality.

Just pick the right one. For example, assertEqualsCanonicalizing will sort arrays automatically before comparing them.

Be careful, though. That does not mean you should use these methods to make tests pass when they should fail.

Sometimes, you do want your arrays to be exactly the same.
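
The difference between the two kinds of comparison can be sketched in plain PHP. The sameValuesAnyOrder() helper is hypothetical and only mirrors the idea of sorting before comparing; it is not PHPUnit's implementation.

```php
<?php

// Hypothetical helper mirroring the idea behind a canonicalized comparison:
// sort copies of both arrays, then compare them strictly.
function sameValuesAnyOrder(array $a, array $b): bool
{
    sort($a); // arrays are passed by value, so the callers' arrays are untouched
    sort($b);

    return $a === $b;
}

// Same values, different positions: a plain comparison reports a difference.
$plainComparison = [2, 1] == [1, 2];

// Order-insensitive comparison treats both arrays as equal.
$anyOrder = sameValuesAnyOrder([2, 1], [1, 2]);
```

Reach for the order-insensitive form only when the order is not part of the expected behavior.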

Wrong execution order

This one can be a huge pain!

The execution order of your tests should not affect the outcome, but it's not uncommon to find projects where tests only pass when they run in a specific order.

You may want to use the @depends annotation to "fix" the problem, but it's best if you can refactor your tests to be independent instead.
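
Order dependence usually comes from state that leaks between tests. The sketch below reproduces it in plain PHP with a hypothetical Counter class, then shows how resetting the state before each test (the role setUp() plays in PHPUnit) removes the coupling.

```php
<?php

// Hypothetical shared state that leaks between tests.
final class Counter
{
    private static int $value = 0;

    public static function increment(): int
    {
        return ++self::$value;
    }

    public static function reset(): void
    {
        self::$value = 0;
    }
}

// Both "tests" assume they start from a fresh counter.
$tests = [
    'first'  => fn (): bool => Counter::increment() === 1,
    'second' => fn (): bool => Counter::increment() === 1,
];

// Without a reset, whichever test runs second fails.
$withoutReset = array_map(fn (callable $test): bool => $test(), $tests);

// Resetting before each test makes the outcome independent of the order.
$withReset = array_map(function (callable $test): bool {
    Counter::reset();

    return $test();
}, $tests);
```

Running the suite in random order from time to time (PHPUnit has an `--order-by=random` option) is one way to flush out this kind of hidden coupling.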

Troubleshooting: help yourself

There are simple measures you can take to ease the pain with your tests:

keep it simple (which is not stupid at all)
rely on CI/CD pipelines: they can fail too, but they run automatically on pushes, merges, or manual deployments
the AAA pattern improves readability
use randomness judiciously
name your methods carefully
reduce unnecessary complexity, including useless dependencies
refactor your tests on a regular basis
include tests in code reviews
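
As an illustration of the AAA (Arrange, Act, Assert) pattern and careful naming, here is a minimal sketch in plain PHP. The Cart class and the test function are hypothetical; in a PHPUnit test the final comparison would be an assertion call.

```php
<?php

// Hypothetical class under test.
final class Cart
{
    /** @var int[] prices in cents */
    private array $items = [];

    public function add(int $priceInCents): void
    {
        $this->items[] = $priceInCents;
    }

    public function total(): int
    {
        return array_sum($this->items);
    }
}

// The function name states the behavior being checked.
function testTotalIsTheSumOfAllItemPrices(): bool
{
    // Arrange
    $cart = new Cart();
    $cart->add(1200);
    $cart->add(800);

    // Act
    $total = $cart->total();

    // Assert
    return $total === 2000;
}
```

Keeping the three steps visually separate makes it easier to spot a test that arranges too much or checks several behaviors at once.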

Wrap up

If your tests fail randomly, it's usually due to a known issue.

However, there are some traps to avoid, especially when you introduce some randomness on purpose.