Don't test what you don't control | Christian Rackerseder
By Christian Rackerseder
Published 2026-06-18 in Testing, 9 min read
Automated tests are supposed to create trust.
A red test should be actionable. It should tell a team that a change broke behavior the team cares about and can fix.
That signal only works when the test environment is controlled enough to make the failure meaningful. The moment a required test depends on a deployed service the team does not own, the signal becomes ambiguous. Maybe our code is broken, but maybe the identity service is down, staging was redeployed, a test account expired, or someone changed shared test data.
The test is red, but the team cannot fix the reason. That is not a good quality gate. It is organizational noise with a test runner around it.
This is not an argument against integration tests. It is not an argument against end-to-end tests either.
It is an argument against testing systems you do not control as if you control them.
The problem is not integration testing
Integration tests are useful. End-to-end tests are useful. Testing real behavior across real boundaries is useful.
The problem starts when teams call something an integration test, but the test actually depends on half of the company being healthy at the same time. That may still be a valuable test, but it is a different kind of test. It should not have the same responsibility, cadence or ownership as a feature team’s pull request checks.
A pull request test should answer a specific question: did this change break the behavior this team owns?
A company-level system test answers a different question: does this whole system landscape still work together right now?
Those questions are related, but they are not the same. When we confuse them, we create fragile pipelines, noisy teams and quality gates that people slowly stop trusting.
End-to-end ownership has borders
I have written before that DevOps is a skill, not a role. I still believe that.
Teams should understand deployment, operations, monitoring, incident behavior and production impact. They should not throw code over a wall and pretend production is someone else’s problem.
But end-to-end ownership does not mean unlimited ownership. A product team does not own every service, every provider, every shared environment and every dependency in the company. A team owns what it can change, deploy, observe and fix.
That is the border.
Tests should respect that border. The moment a required test crosses into a system the team cannot control, the test has changed category. It is no longer only a feature team test. It is a system test.
That does not make it bad. It makes it different.
Ownership is the real test boundary
We often classify tests by shape: unit test, integration test, end-to-end test. That is useful, but incomplete.
The more important question is: what can make this test fail?
If the answer includes a service your team cannot change, cannot deploy, cannot reset and cannot debug directly, you have crossed an ownership boundary. That boundary matters more than the label of the test.
Before making a test a required gate, ask whether the team can make the dependency deterministic, reset the state, deploy the required version and observe failures without waiting for another team. Also ask whether the pull request author can fix the failure inside the owned codebase or owned runtime.
When the answer is no, be careful. You may still want the test, but you probably do not want it as a required signal for every product change.
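The questions above can be written down as a small checklist. This is only a sketch: the `Dependency` type, its field names and the two example services are hypothetical, and a real team would answer these questions in a conversation, not a script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """One system a test touches. Every field name is a hypothetical label."""

    name: str
    deterministic: bool      # can the team make its behavior deterministic?
    resettable: bool         # can the team reset its state?
    deployable: bool         # can the team deploy the version the test needs?
    observable: bool         # can the team see failures without another team?
    fixable_by_author: bool  # can the PR author fix a failure in owned code or runtime?


def blocking_reasons(dep: Dependency) -> list[str]:
    """Return every checklist question that was answered with 'no'."""
    answers = {
        "cannot be made deterministic": dep.deterministic,
        "state cannot be reset": dep.resettable,
        "required version cannot be deployed": dep.deployable,
        "failures cannot be observed without another team": dep.observable,
        "pull request author cannot fix failures": dep.fixable_by_author,
    }
    return [f"{dep.name}: {reason}" for reason, ok in answers.items() if not ok]


def can_be_required_gate(dependencies: list[Dependency]) -> bool:
    """A test qualifies as a required gate only if no dependency blocks it."""
    return not any(blocking_reasons(dep) for dep in dependencies)


# Hypothetical examples: a service the team owns and one it does not.
own_service = Dependency("profile-service", True, True, True, True, True)
identity = Dependency("identity-service", False, False, False, False, False)
```

A test that touches only `own_service` passes the check. Adding `identity` fails it, and `blocking_reasons(identity)` lists why.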
The wrong kind of red
Imagine a team owns a web application and one backend service. A pull request changes the validation rules for a profile form.
The end-to-end test opens the application, signs in through a real identity service, writes data through the team’s service, calls another team’s notification service and waits for an email provider. The test fails.
Maybe the identity service is slow. Maybe the notification service was redeployed. Maybe the email provider accepted the request but delayed delivery. Maybe staging contains broken shared data.
None of that proves that the profile form change is wrong.
The failure may still describe a real problem in the overall system. But it is the wrong kind of red for that pull request. It does not point clearly at the changed code, it does not create fast feedback, and it turns a product change into investigation work.
That is expensive. Not because investigation is bad, but because the signal was attached to the wrong owner.
Test behavior, not dependency availability
If your code talks to another service, you still need confidence. But confidence does not require every pull request to call the deployed service.
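As a minimal sketch of getting that confidence without a deployed dependency, the example below tests a small client for another team's notification service against a fake transport. Everything specific here is an assumption made for illustration: the `/notifications` path, the payload shape, the `202` success status and the choice to treat `5xx` as "continue without notification" are not taken from any real service.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Response:
    status: int
    body: str = ""


# A transport takes (method, path, body) and returns a Response.
# In production it would wrap a real HTTP client; in tests it is a fake.
Transport = Callable[[str, str, str], Response]


def notify_profile_updated(send: Transport, user_id: str) -> bool:
    """Ask the (hypothetical) notification service to announce a profile update.

    Returns True when the request is accepted and False when the service is
    unavailable, so the caller can keep going without the notification.
    """
    payload = json.dumps({"userId": user_id, "event": "profile.updated"})
    response = send("POST", "/notifications", payload)
    if response.status == 202:
        return True
    if response.status >= 500:
        return False
    raise ValueError(f"unexpected status {response.status}")


def fake_transport(status: int, calls: list) -> Transport:
    """Build a fake that records each request and answers with a fixed status."""
    def send(method: str, path: str, body: str) -> Response:
        calls.append((method, path, json.loads(body)))
        return Response(status)
    return send


def test_sends_the_request_we_own() -> None:
    calls: list = []
    assert notify_profile_updated(fake_transport(202, calls), "u-1") is True
    expected = {"userId": "u-1", "event": "profile.updated"}
    assert calls == [("POST", "/notifications", expected)]


def test_degrades_when_service_is_unavailable() -> None:
    assert notify_profile_updated(fake_transport(503, []), "u-1") is False


def test_rejects_a_response_we_do_not_accept() -> None:
    try:
        notify_profile_updated(fake_transport(400, []), "u-1")
    except ValueError:
        return
    raise AssertionError("expected ValueError for an unexpected status")
```

None of these tests can go red because another team redeployed or a shared environment is unhealthy. They fail only when the request this code builds, or the way it handles a response, changes.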
At the boundary, test the behavior you own. Test the request you create, the response you accept, the error cases you handle, and the way your product behaves when the...