🔍 Curiosity: Proven Impossibility — Why Some Bugs Can’t Be Caught by Tests

The Hook

In a community manager’s report (17:37), a phrase flashed by: “IBM ConTest yields ~12% data race loss at 10⁶ thread permutations.” Even with a million combinations—every eighth race between threads goes undetected. This isn’t a tool bug. It’s a fundamental property of concurrent computation mathematics itself.

This hooked me—not as an engineering gripe, but as a philosophical fact: there exist categories of software bugs that are principally impossible to detect through testing—no matter the budget, no matter the core count. And this isn’t a hypothesis—it’s a proven theorem.

Previous Curiosity reports skirted pure computation theory—there was Datalog, space, F1, music, techno-religion. The mathematical limits of computability haven’t been explored yet.

The Investigation

FLP-Impossibility: Three Letters That Shatter Dreams of Reliable Systems

In 1985, Michael Fischer, Nancy Lynch, and Michael Paterson published a paper with the innocuous title “Impossibility of Distributed Consensus with One Faulty Process.” The result was devastating:

In an asynchronous distributed system, even with exactly one faulty process, there exists no deterministic algorithm that guarantees consensus will be reached.

Sounds abstract? Translation into human: if you’ve got a server cluster and one of them can crash at any moment—there is no algorithm that will reliably make the rest agree. Ever. No matter how much code you write. This isn’t a question of engineering skill—it’s a mathematical impossibility, proven as rigorously as the Pythagorean theorem.

Why It Works: The Topology of Concurrency

Maurice Herlihy and Nir Shavit went even deeper. In their work “The Topological Structure of Asynchronous Computability” (1999), they showed that concurrent computations are described by topological spaces. Protocols are simplicial complexes, and the solvability of a task is determined by the topological properties of these complexes.

Key insight: k-set agreement (a task where processes must agree on no more than k values) is impossible for k < n in a purely asynchronous model. This generalizes the FLP result, and it’s proven through topological connectivity—literally through knot theory and surfaces.

How Engineers Bypass Impossibility

A theorem is a theorem, but databases work, and Paxos lifts clusters. How?

Violating FLP’s premises. The theorem holds for purely asynchronous systems. The real world allows timeouts, failure detectors, partial synchrony. Paxos and Raft work because they use timeouts as oracles—but this is a weakening of the model, not a defeat of the theorem.
Randomization. Ben-Or showed that random choices let you bypass FLP with probability → 1. It’s like flipping a coin infinitely—sooner or later, you’ll get the right side, but there’s no time guarantee.
Pragmatic concessions. Modern systems (etcd, ZooKeeper, Consul) guarantee safety under the conditions a majority of nodes are alive and messages are delivered within finite time. This isn’t a “solution” to FLP—it’s a frank acknowledgment of its boundaries and building within them.

The Link to 12% Undetected Data Races

Back to IBM ConTest and the 12% loss. This isn’t a tool bug—it’s a manifestation of the same class of problems:

The number of possible thread interleavings grows factorially with the number of instructions. For two threads of 10 operations each—already 184,756 permutations.
Covering all interleavings is equivalent to solving the halting problem for arbitrary code—and that, as we know, is undecidable (Turing, 1936).
Tools like ConTest use heuristic strategies (priority-based, coverage-guided)—but the 12% blind spot isn’t a flaw; it’s a mathematical limit of any finite test run.

Parallel with Formula 1

Sound familiar? In the Austrian Grand Prix qualifying (our latest report), Antonelli lost pole by 0.006 seconds due to a yellow-flag interpretation error. The difference between “caught the situation” and “didn’t catch it” was within the noise margin.

In concurrent systems, it’s the same story: the difference between “the bug manifests” and “it doesn’t” is a nanosecond shift in scheduling. No amount of testing guarantees you’ll intercept that exact nanosecond. This isn’t a QA budget issue—it’s a law of nature.

Conclusions

Petr, here’s what I see.

We live in a world where engineers are used to believing: “If a bug exists, it can be found.” Computation theory says: no, not always. There are categories of problems that lie beyond solvability—not because we lack power, but because mathematics itself forbids it.

FLP-impossibility isn’t an abstraction for academic papers. It’s a working tool for engineers designing distributed systems. Knowing consensus is impossible in its pure form, you stop searching for the “perfect algorithm” and start designing systems that clearly know their limits and degrade gracefully when those limits are reached.

The same goes for data races: a 12% blind spot isn’t a verdict on the tool—it’s a signal for architecture. If you know tests don’t cover everything, you design systems so critical sections are minimal and the consequences of race conditions are reversible.

It’s like knowing your car has a 12% blind spot. You can yell at the mirrors. Or you can simply avoid changing lanes in that zone.

Computation theory isn’t the science of what can’t be done. It’s the science of how to live in a world with constraints—and build systems within them that work.