Learning Regular Languages with the TTT Algorithm

TL;DR: This tutorial is a complete implementation of the TTT algorithm for active automata learning in Python. TTT combines the discrimination tree of Kearns and Vazirani with binary search counterexample analysis from Rivest and Schapire, and adds prefix transformation and discriminator finalization to eliminate all redundant membership queries. The Python interpreter is embedded so that you can work through the implementation steps.

In my previous post, I implemented Angluin’s L* algorithm for learning regular languages from a blackbox oracle. L* uses a flat observation table to track state distinctions, which leads to redundant membership queries: when a counterexample arrives, all its suffixes are added as columns even though most distinguish no new states.

Kearns and Vazirani [1] observed that a discrimination tree is a better data structure for this job, and replaced L*’s observation table with a binary tree of discriminators. Rivest and Schapire [2] independently contributed binary search counterexample analysis, which finds the single relevant suffix in a counterexample in \(O(\log k)\) queries rather than adding all \(k\) suffixes.
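To make the binary search idea concrete, here is a minimal sketch under stated assumptions. The target language (number of `a`s divisible by 3), the two-state hypothesis, and the names `ACCESS`, `DELTA`, `hyp_state`, `alpha` and `find_split` are all hypothetical and chosen only for illustration; they are not the implementation developed later in this post. For a counterexample \(w\) of length \(k\), define \(\alpha(i)\) as the oracle's answer on the access sequence of the hypothesis state reached by \(w[:i]\), followed by \(w[i:]\). Since \(\alpha(0) \neq \alpha(k)\), a binary search finds an index where the answer flips.

```python
def target(w):
    # Hypothetical blackbox: accepts strings whose number of 'a's is divisible by 3.
    return w.count('a') % 3 == 0

# Hypothetical (wrong) hypothesis: it believes the language is "even number of 'a's".
ACCESS = {'q0': '', 'q1': 'a'}           # access sequence of each hypothesis state
DELTA = {('q0', 'a'): 'q1', ('q0', 'b'): 'q0',
         ('q1', 'a'): 'q0', ('q1', 'b'): 'q1'}

def hyp_state(w):
    # Run the hypothesis on w and return the state it ends in.
    q = 'q0'
    for ch in w:
        q = DELTA[(q, ch)]
    return q

def alpha(w, i, member):
    # Replace the prefix w[:i] by the access sequence of the state it reaches,
    # then ask the oracle about the resulting string.
    return member(ACCESS[hyp_state(w[:i])] + w[i:])

def find_split(w, member):
    # Invariant: alpha(lo) == a_lo and alpha(hi) != a_lo.
    lo, hi = 0, len(w)
    a_lo = alpha(w, lo, member)
    assert a_lo != alpha(w, hi, member), 'w is not a counterexample'
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if alpha(w, mid, member) == a_lo:
            lo = mid
        else:
            hi = mid
    # u: access sequence before the flip, a: the symbol, v: the new discriminator.
    return ACCESS[hyp_state(w[:lo])], w[lo], w[lo + 1:]

print(find_split('abab', target))
```

Here `abab` is rejected by the target but accepted by the hypothesis. The search returns `u = 'a'`, `a = 'a'` and `v = 'b'`: the string `u + a` and the access sequence of its hypothesis successor `q0` are told apart by the suffix `v`, so `v` is the single suffix worth adding. The loop evaluates \(\alpha\) only \(O(\log k)\) times.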

TTT by Isberner, Howar and Steffen [3] adds two further refinements: prefix transformation, which keeps access sequences minimal, and discriminator finalization, which keeps the discrimination tree shallow. Together these make TTT provably redundancy-free. That is, it never makes a membership query whose answer could have been derived from earlier queries.

TTT is the algorithm of choice in practical automata learning tools such as LearnLib [4]. ADT [5] extends TTT with adaptive distinguishing sequences, which can reduce resets in hardware settings, though performance differences in software engineering settings are modest.

Definitions

Alphabet \(A\): the set of input symbols the DFA reads.

Membership query: a string passed to the blackbox oracle. The oracle answers yes (accepted) or no (rejected).

Equivalence query: a hypothesis grammar passed to the teacher. The teacher answers yes, or returns a counterexample string where the hypothesis and the target disagree.

PAC oracle: a probabilistic approximation to the equivalence oracle. After \(N\) random tests without finding a counterexample, we declare the hypothesis probably approximately correct.

Discrimination tree (DT): a binary tree whose inner nodes are discriminator suffixes and whose leaves are states. Sifting a string \(w\) through the tree classifies it to a state using one membership query per level.

Access sequence \(acc(q)\): the shortest known string that reaches state \(q\) in the target.

Spanning tree: a mapping from each known state to its access sequence. In this implementation it is a dict from each state to the shortest string known to reach it.

Open transition: a transition from state \(q\) on symbol \(a\) whose target state has no access sequence yet, meaning TTT has not yet determined which state it leads to.

Counterexample decomposition: the process of finding the split point in a counterexample, extracting a new discriminator, and splitting a leaf in the DT.
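The definitions of the discrimination tree and the spanning tree can be illustrated with a small sketch. Everything here is an assumption made for illustration: the target language (number of `a`s divisible by 3), the class names `Leaf` and `Inner`, and the state names `s0`, `s1`, `s2` are hypothetical and are not the data structures built later in this post.

```python
def target(w):
    # Hypothetical blackbox: accepts strings whose number of 'a's is divisible by 3.
    return w.count('a') % 3 == 0

class Leaf:
    def __init__(self, state):
        self.state = state

class Inner:
    def __init__(self, discriminator, false_child, true_child):
        self.discriminator = discriminator
        self.false_child = false_child
        self.true_child = true_child

def sift(node, w, member):
    # Walk from the root to a leaf, asking one membership query per inner node.
    queries = 0
    while isinstance(node, Inner):
        queries += 1
        if member(w + node.discriminator):
            node = node.true_child
        else:
            node = node.false_child
    return node.state, queries

# The empty suffix separates the accepting state s0 from the rest;
# the suffix 'a' then separates s2 (one 'a' short of acceptance) from s1.
DT = Inner('', Inner('a', Leaf('s1'), Leaf('s2')), Leaf('s0'))

# Spanning tree: each known state mapped to its access sequence.
spanning_tree = {'s0': '', 's1': 'a', 's2': 'aa'}

print(sift(DT, 'aab', target))
for state, access in spanning_tree.items():
    # Sifting an access sequence must land on the state it belongs to.
    assert sift(DT, access, target)[0] == state
```

Sifting `aab` asks about `aab` (rejected) and then `aaba` (accepted), so it lands on `s2` after two queries. A string that the target accepts, such as `bbb`, is classified at the root with a single query.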

Contents

Definitions

Prerequisites

The Road from L* to TTT

The DFA Representation

The Oracle

The Discrimination Tree

Sifting

The Spanning Tree

Hypothesis Construction

Incremental Hypothesis Update

Counterexample Decomposition

The Split Point

Prefix Transformation

Splitting a Leaf

Discriminator Finalization

Putting Decomposition Together

Non-Redundancy

A Note on the Equivalence Oracle

DT Coherence After Split

The Main Loop

Examples

Evaluating Model Accuracy

Comparison with L*

References

Artifacts


Prerequisites

We use the Teacher and Oracle from the L* post unchanged. The PAC equivalence oracle in Teacher is a direct drop-in for TTT. The rest of the algorithm is completely independent of how equivalence queries are answered.
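For readers who have not seen the L* post, the idea behind a PAC equivalence check can be sketched as follows. This is not the `Teacher` class from that post: the function `pac_equivalence`, its parameters, the uniform choice of string lengths, and the two example predicates are hypothetical and only illustrate the principle of sampling random strings and comparing answers.

```python
import random

def pac_equivalence(member, hypothesis, alphabet, n_tests=1000, max_len=10, seed=0):
    # Sample n_tests random strings; return the first one on which the
    # blackbox and the hypothesis disagree, or None if none is found.
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    for _ in range(n_tests):
        length = rng.randint(0, max_len)
        w = ''.join(rng.choice(alphabet) for _ in range(length))
        if member(w) != hypothesis(w):
            return w
    return None

def target(w):
    # Hypothetical blackbox: number of 'a's divisible by 3.
    return w.count('a') % 3 == 0

def wrong_hypothesis(w):
    # Hypothetical incorrect hypothesis: even number of 'a's.
    return w.count('a') % 2 == 0

cex = pac_equivalence(target, wrong_hypothesis, 'ab')
print(repr(cex))
print(pac_equivalence(target, target, 'ab'))
```

A returned string is a counterexample to hand to the learner. A result of `None` only means that no disagreement was found within `n_tests` samples, which is what makes the answer probably approximately correct rather than exact.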

Available Packages

These packages come either from my previous posts or from pure Python packages that I have compiled, and are available at the locations below. As before, install them if you need to run the program directly on your machine. To install, simply download the wheel file (`pkg.whl`) and install it using `pip install pkg.whl`.

simplefuzzer-0.0.1-py2.py3-none-any.whl from "The simplest grammar fuzzer in the world".

rxfuzzer-0.0.1-py2.py3-none-any.whl from "Fuzzing With Regular Expressions".

earleyparser-0.0.1-py2.py3-none-any.whl from "Earley Parser".

cfgrandomsample-0.0.1-py2.py3-none-any.whl from "Uniform Random Sampling of Strings from Context-Free Grammar".

cfgremoveepsilon-0.0.1-py2.py3-none-any.whl from "Remove Empty (Epsilon) Rules From a Context-Free Grammar.".

gatleastsinglefault-0.0.1-py2.py3-none-any.whl from "Specializing Context-Free Grammars for Inducing Faults".

hdd-0.0.1-py2.py3-none-any.whl...
