Documentation

Atlas.TheoryOfComputation.code.RegularLanguages

@[reducible, inline]

abbrev FiniteAutomaton (α : Type u_1) (σ : Type u_2) :

Type (max u_1 u_2)

A finite automaton over alphabet α with state space σ, defined as a DFA α σ.

In Sipser's textbook, a finite automaton is a 5-tuple (Q, Σ, δ, q₀, F) where Q is a finite set of states, Σ a finite alphabet, δ : Q × Σ → Q the transition function, q₀ the start state, and F ⊆ Q the accept states. This is exactly the data packaged by Mathlib's DFA.

Instances For

def DFA.language {α : Type u_1} {σ : Type u_2} (M : DFA α σ) :

The language recognized by a DFA M: the set of strings M accepts.

Instances For

def RegularLanguages.IsRegular {α : Type u_1} (L : Language α) :

A language L is regular if some finite automaton recognizes it.

Equivalent to Mathlib's Language.IsRegular. This matches Sipser's definition: L is regular iff L = L(M) for some DFA M.

Instances For

theorem RegularLanguages.regular_union {α : Type u_1} {A B : Language α} (hA : IsRegular A) (hB : IsRegular B) :

IsRegular (A + B)

Closure of regular languages under union.

If A and B are regular languages over the same alphabet, then so is their union A ∪ B (written A + B). The proof takes DFAs M₁ and M₂ recognizing A and B respectively and uses the product DFA M₁.union M₂ whose state space is σ₁ × σ₂.

theorem RegularLanguages.kstar_append {α : Type u_1} {A : Language α} {x y : List α} (hx : x ∈ KStar.kstar A) (hy : y ∈ KStar.kstar A) :

x ++ y ∈ KStar.kstar A

The Kleene star A* is closed under concatenation: if x, y ∈ A* then x ++ y ∈ A*. Combine the witnessing factorizations of x and y.

theorem RegularLanguages.kstar_cons {α : Type u_1} {A : Language α} {w y : List α} (hw : w ∈ A) (hy : y ∈ KStar.kstar A) :

w ++ y ∈ KStar.kstar A

Prepending a word w ∈ A to y ∈ A* stays in A*.

theorem RegularLanguages.kstar_split {α : Type u_1} {A : Language α} {x y : List α} (h : x ++ y ∈ KStar.kstar A) :

x ∈ KStar.kstar A ∧ y ∈ KStar.kstar A ∨ ∃ (p : List α) (c : List α) (d : List α) (r : List α), p ∈ KStar.kstar A ∧ x = p ++ c ∧ c ≠ [] ∧ c ++ d ∈ A ∧ r ∈ KStar.kstar A ∧ y = d ++ r

Splitting lemma for Kleene star.

If x ++ y ∈ A*, then either

both x ∈ A* and y ∈ A* (the split between x and y aligns with a word boundary in some A*-factorization), or
the boundary cuts through a single word of A: there exist p ∈ A*, nonempty c, and d, r with x = p ++ c, c ++ d ∈ A, r ∈ A* and y = d ++ r.

theorem RegularLanguages.kstar_leftQuotient_transfer {α : Type u_1} {A : Language α} {x₁ x₂ y : List α} (hS : A.leftQuotient '' {c : List α | ∃ p ∈ KStar.kstar A, x₁ = p ++ c ∧ c ≠ []} ⊆ A.leftQuotient '' {c : List α | ∃ p ∈ KStar.kstar A, x₂ = p ++ c ∧ c ≠ []}) (hB : x₁ ∈ KStar.kstar A → x₂ ∈ KStar.kstar A) (hy : x₁ ++ y ∈ KStar.kstar A) :

x₂ ++ y ∈ KStar.kstar A

Transfer membership in A* between prefixes that have the same set of "residual left-quotients" of nonempty pending words.

If every left-quotient A / c realized by a nonempty residual c of x₁ (with some p ∈ A* so that x₁ = p ++ c) is also realized by x₂, and x₂ ∈ A* whenever x₁ ∈ A*, then x₁ ++ y ∈ A* implies x₂ ++ y ∈ A*. This is the core step in proving that A* has finitely many left-quotients.

theorem RegularLanguages.leftQuotient_kstar_determined {α : Type u_1} (A : Language α) (x₁ x₂ : List α) (hS : A.leftQuotient '' {c : List α | ∃ p ∈ KStar.kstar A, x₁ = p ++ c ∧ c ≠ []} = A.leftQuotient '' {c : List α | ∃ p ∈ KStar.kstar A, x₂ = p ++ c ∧ c ≠ []}) (hB : (x₁ ∈ KStar.kstar A) = (x₂ ∈ KStar.kstar A)) :

(KStar.kstar A).leftQuotient x₁ = (KStar.kstar A).leftQuotient x₂

If x₁ and x₂ induce the same set of residual A-left-quotients and the same membership in A*, then their left-quotients under A* agree: A* / x₁ = A* / x₂. This is the equivalence relation underlying the Myhill–Nerode-style proof that A* is regular.

theorem RegularLanguages.finite_range_prop {X : Type u_1} (f : X → Prop) :

(Set.range f).Finite

The range of any function f : X → Prop is finite (it has at most two elements, True and False, by propositional extensionality).

theorem RegularLanguages.regular_star {α : Type u_1} {A : Language α} (hA : A.IsRegular) :

(KStar.kstar A).IsRegular

Closure of regular languages under Kleene star.

If A is a regular language then so is A* = A.kstar. The proof uses the Myhill–Nerode characterization: A* has finitely many left-quotients because each one is determined by the pair (set of residual A-left-quotients, membership of x in A*), and both components live in finite sets when A is regular.

theorem RegularLanguages.DFA.pumping_lemma' {α : Type u_1} {σ : Type u_2} (M : DFA α σ) [Fintype σ] {s : List α} (hs : s ∈ M.accepts) (hlen : Fintype.card σ ≤ s.length) :

∃ (x : List α) (y : List α) (z : List α), s = x ++ y ++ z ∧ x.length + y.length ≤ Fintype.card σ ∧ y ≠ [] ∧ {x} * KStar.kstar {y} * {z} ≤ M.accepts

Pumping lemma at the DFA level.

If a DFA M with finite state space σ accepts a string s of length at least |σ|, then s factors as s = x ++ y ++ z with |xy| ≤ |σ|, y ≠ [], and {x} · {y}* · {z} ⊆ L(M).

theorem RegularLanguages.pumping_lemma {α : Type u_1} {A : Language α} (hA : IsRegular A) :

∃ (p : ℕ), 0 < p ∧ ∀ s ∈ A, p ≤ s.length → ∃ (x : List α) (y : List α) (z : List α), s = x ++ y ++ z ∧ x.length + y.length ≤ p ∧ y ≠ [] ∧ {x} * KStar.kstar {y} * {z} ≤ A

Pumping lemma for regular languages.

For every regular language A, there is a pumping length p > 0 such that any s ∈ A with |s| ≥ p factors as s = xyz with

xyⁱz ∈ A for all i ≥ 0 (encoded here as {x} · {y}* · {z} ⊆ A),
y ≠ ε,
|xy| ≤ p.

The pumping length is taken to be the cardinality of the state space of a DFA recognizing A.

def RegularLanguages.concat {α : Type u_1} (A B : Language α) :

Concatenation of two languages: A · B = { ab | a ∈ A, b ∈ B }, realized in Mathlib as A * B.

Instances For

theorem RegularLanguages.leftQuotient_mul_determined {α : Type u_1} (A B : Language α) (x₁ x₂ : List α) (h1 : A.leftQuotient x₁ = A.leftQuotient x₂) (h2 : B.leftQuotient '' {c : List α | ∃ a ∈ A, x₁ = a ++ c} = B.leftQuotient '' {c : List α | ∃ a ∈ A, x₂ = a ++ c}) :

(A * B).leftQuotient x₁ = (A * B).leftQuotient x₂

The left-quotient of A * B by x is determined by the pair (A.leftQuotient x, { B.leftQuotient c | a ∈ A, x = a ++ c }).

If two strings x₁, x₂ agree in both components then they have the same left-quotient under the concatenation A * B. This is the key lemma for showing A * B is regular via Myhill–Nerode.

theorem RegularLanguages.regular_concat {α : Type u_1} {A B : Language α} (hA : A.IsRegular) (hB : B.IsRegular) :

(A * B).IsRegular

Closure of regular languages under concatenation.

If A and B are regular then so is A · B = A * B. The proof uses the Myhill–Nerode characterization, showing that A * B has finitely many left-quotients because each is determined by a pair lying in the product of two finite sets.