Computer Atlas

Formal Verification

Also known as: model checking, theorem proving, TLA+, Coq, proof assistant

supplemental advanced concept 6 min read · Updated 2026-06-11

Using mathematical proof to guarantee that a system or program meets a specification — providing stronger assurance than testing by exhaustively verifying all possible behaviours.

Primary domain
Security & Privacy
Sub-category
Cryptography & Formal Methods

In simple terms

Tests check that a program works for the inputs you tried. Formal verification proves that it works for all possible inputs — using mathematical reasoning that proves theorems. If a cryptographic protocol is formally verified, you know it is correct even in adversarial conditions that no human tester would think to try. The trade-off: verification requires a formal specification (you must write down exactly what “correct” means) and significant expert effort.

The Visual Map

flowchart TD
  SPEC["Formal specification\n(what 'correct' means)"]
  IMPL["System / code"]
  SPEC --> MC["Model checking\n(TLA+, Alloy, SPIN)\nExhaustive state enumeration"]
  SPEC --> TP["Theorem proving\n(Coq, Isabelle, Lean)\nHuman-guided proof steps"]
  SPEC --> SMT["SMT solvers\n(Z3, CVC5)\nSAT/arithmetic decision"]
  MC --> |"finds counterexample or proves holds"| R1["Invariants verified\nor bug found"]
  TP --> |"proof checked by kernel"| R2["Functional correctness\nproved"]
  SMT --> |"bounded verification"| R3["Bounded correctness\nor constraint violation"]
  IMPL --> MC & TP & SMT

More detail

Model checking: exhaustively enumerate all reachable states of a finite-state system and check that invariants hold. Tools: TLA+ (Leslie Lamport), Alloy, SPIN. Scales to ~10¹⁰ states using symbolic techniques (BDD, SAT). AWS uses TLA+ to verify distributed system designs — it found critical bugs in S3 and EBS designs before implementation.

Theorem proving (interactive): a human works with a proof assistant (Coq, Isabelle, Lean, Agda) to construct a formal proof; the proof checker verifies every step. Can handle infinite-state systems and complex mathematics. The seL4 microkernel was fully verified in Isabelle: its C implementation is proved correct with respect to its abstract specification and memory-safe (no buffer overflow, no use-after-free) by construction.

Automated theorem proving (SMT solvers): Z3, CVC5 — decide the satisfiability of logical formulas over arithmetic, arrays, and bit-vectors. Used by CBMC (bounded model checker for C), Dafny (verifying programming language), and AWS Zelkova (IAM policy analysis).

What you can verify:

  • Safety — “the system never enters a bad state” (no deadlock, no privilege escalation).
  • Liveness — “the system eventually reaches a good state” (every request is eventually served).
  • Functional correctness — “this sort always returns a sorted permutation of the input.”
  • Security properties — “the key is never revealed to an attacker” (proved for TLS 1.3 handshake in ProVerif/CryptoVerif).

Under the Hood

Manual model checking: BFS over all reachable states of a simple mutual-exclusion protocol, verifying the invariant holds everywhere:

from collections import deque

# State: (lock_holder, waiting_A, waiting_B)
# lock_holder: None, 'A', or 'B'
INITIAL = (None, False, False)

def transitions(state):
    lock, wait_A, wait_B = state
    nexts = []
    # A tries to acquire
    if not wait_A and lock is None:
        nexts.append(('A', True, wait_B))         # A gets lock
    elif not wait_A:
        nexts.append((lock, True, wait_B))         # A queues
    # A releases
    if lock == 'A':
        nexts.append((None, False, wait_B))
    # B tries to acquire
    if not wait_B and lock is None:
        nexts.append((lock, wait_A, True))         # B gets lock
    elif not wait_B:
        nexts.append((lock, wait_A, True))
    # B releases
    if lock == 'B':
        nexts.append((None, wait_A, False))
    return nexts

def invariant(state):
    lock, wait_A, wait_B = state
    return not (lock == 'A' and lock == 'B')       # mutual exclusion

visited, queue, violated = set(), deque([INITIAL]), False
while queue:
    s = queue.popleft()
    if s in visited: continue
    visited.add(s)
    if not invariant(s):
        violated = True; break
    for t in transitions(s): queue.append(t)

print(f"States explored:    {len(visited)}")
print(f"Invariant violated: {violated}")
print(f"Mutual exclusion:   {'PROVED' if not violated else 'BROKEN'}")

Engineering Trade-offs

  • Testing vs formal verification. Tests find bugs; formal verification proves their absence. Verification costs 5–20× more per line of code but provides guarantees impossible with testing — valuable for cryptographic implementations, safety-critical code, and security kernels.
  • Model checking vs theorem proving. Model checking is push-button for finite systems but hits a state-space explosion for complex systems. Theorem proving handles infinite systems and complex properties but requires expert human effort. SMT solvers occupy the middle: powerful for bounded verification, limited for unbounded proofs.
  • Spec quality is the limit. “Formally verified” means “meets the specification”. A wrong or incomplete spec leaves bugs undetected. Specifying what “correct” means is often as hard as writing the code itself — and writing the spec down forces clarity about requirements.
  • Continuous vs milestone use. TLA+ for design-level verification of a protocol catches bugs before implementation at low cost. Full Isabelle/Coq proofs for a kernel take person-years. The right tool depends on the assurance requirement and the budget.

Real-world examples

  • AWS uses TLA+ to verify distributed service designs; it found critical bugs in EBS and S3 before a single line of production code was written.
  • seL4 — the world’s most analysed OS kernel — is used in drones, automotive systems, and classified government infrastructure.
  • TLS 1.3 was designed alongside a formal security proof in ProVerif and CryptoVerif — an unusual case where the protocol and its proof were developed in parallel.
  • Ethereum smart contracts are verified using Certora Prover and the K framework to prevent multi-million-dollar exploits.

Common misconceptions

  • “Formal verification means the software has no bugs.” It means the implementation meets the specification. If the spec is wrong or incomplete, bugs remain. “Verified” always means “verified against a particular specification.”
  • “Formal verification is only for academics.” AWS, Microsoft (Hyper-V), Google (Android Verified Boot), and DARPA programmes routinely use formal methods in production-critical systems.

Try it yourself

BFS model checking: verify a mutual-exclusion protocol holds over all reachable states:

python3 -c "
from collections import deque

# States: idle, locked_A, locked_B, locked_AB (the forbidden state)
# Transitions: A or B can acquire/release the lock
# Invariant: 'locked_AB' is never reachable

def transitions(state):
    if state == 'idle':
        return ['locked_A', 'locked_B']  # either can acquire
    if state == 'locked_A':
        return ['idle', 'locked_AB']     # A releases OR (bad impl) B also acquires
    if state == 'locked_B':
        return ['idle', 'locked_BA']     # B releases OR (bad impl) A also acquires
    return []

# Safe version: proper mutex — B waits, cannot acquire while A holds
def safe_transitions(state):
    if state == 'idle':
        return ['locked_A', 'locked_B']
    if state == 'locked_A':
        return ['idle']   # only A can release; B blocked
    if state == 'locked_B':
        return ['idle']   # only B can release; A blocked
    return []

visited, q = set(), deque(['idle'])
while q:
    s = q.popleft()
    if s in visited: continue
    visited.add(s)
    for t in safe_transitions(s): q.append(t)

print(f'States reachable: {visited}')
print(f'Mutual exclusion holds: {\"locked_AB\" not in visited and \"locked_BA\" not in visited}')
print(f'States explored: {len(visited)} (state space is exhausted)')
"

Learn next

  • Cryptography — formal verification is how we know cryptographic protocols like TLS 1.3 are correct.
  • Type theory — the logical foundation; dependent types bring formal verification into everyday programming.
  • Lambda calculus — the mathematical basis for proof assistants like Coq and Lean.

Neighborhood

A visual companion to the relationships above. Click any node to visit that topic.