Documentation

Atlas.TheoryOfComputation.code.RegularLanguages

@[reducible, inline]
abbrev FiniteAutomaton (α : Type u_1) (σ : Type u_2) :
Type (max u_1 u_2)

A finite automaton over alphabet α with state space σ, defined as a DFA α σ.

In Sipser's textbook, a finite automaton is a 5-tuple (Q, Σ, δ, q₀, F) where Q is a finite set of states, Σ a finite alphabet, δ : Q × Σ → Q the transition function, q₀ the start state, and F ⊆ Q the accept states. This is exactly the data packaged by Mathlib's DFA.

Instances For
    def DFA.language {α : Type u_1} {σ : Type u_2} (M : DFA α σ) :

    The language recognized by a DFA M: the set of strings M accepts.

    Instances For
      def RegularLanguages.IsRegular {α : Type u_1} (L : Language α) :

      A language L is regular if some finite automaton recognizes it.

      Equivalent to Mathlib's Language.IsRegular. This matches Sipser's definition: L is regular iff L = L(M) for some DFA M.

      Instances For
        theorem RegularLanguages.regular_union {α : Type u_1} {A B : Language α} (hA : IsRegular A) (hB : IsRegular B) :
        IsRegular (A + B)

        Closure of regular languages under union.

        If A and B are regular languages over the same alphabet, then so is their union A ∪ B (written A + B). The proof takes DFAs M₁ and M₂ recognizing A and B respectively and uses the product DFA M₁.union M₂ whose state space is σ₁ × σ₂.

        theorem RegularLanguages.kstar_append {α : Type u_1} {A : Language α} {x y : List α} (hx : x KStar.kstar A) (hy : y KStar.kstar A) :

        The Kleene star A* is closed under concatenation: if x, y ∈ A* then x ++ y ∈ A*. Combine the witnessing factorizations of x and y.

        theorem RegularLanguages.kstar_cons {α : Type u_1} {A : Language α} {w y : List α} (hw : w A) (hy : y KStar.kstar A) :

        Prepending a word w ∈ A to y ∈ A* stays in A*.

        theorem RegularLanguages.kstar_split {α : Type u_1} {A : Language α} {x y : List α} (h : x ++ y KStar.kstar A) :
        x KStar.kstar A y KStar.kstar A ∃ (p : List α) (c : List α) (d : List α) (r : List α), p KStar.kstar A x = p ++ c c [] c ++ d A r KStar.kstar A y = d ++ r

        Splitting lemma for Kleene star.

        If x ++ y ∈ A*, then either

        • both x ∈ A* and y ∈ A* (the split between x and y aligns with a word boundary in some A*-factorization), or
        • the boundary cuts through a single word of A: there exist p ∈ A*, nonempty c, and d, r with x = p ++ c, c ++ d ∈ A, r ∈ A* and y = d ++ r.
        theorem RegularLanguages.kstar_leftQuotient_transfer {α : Type u_1} {A : Language α} {x₁ x₂ y : List α} (hS : A.leftQuotient '' {c : List α | pKStar.kstar A, x₁ = p ++ c c []} A.leftQuotient '' {c : List α | pKStar.kstar A, x₂ = p ++ c c []}) (hB : x₁ KStar.kstar Ax₂ KStar.kstar A) (hy : x₁ ++ y KStar.kstar A) :
        x₂ ++ y KStar.kstar A

        Transfer membership in A* between prefixes that have the same set of "residual left-quotients" of nonempty pending words.

        If every left-quotient A / c realized by a nonempty residual c of x₁ (with some p ∈ A* so that x₁ = p ++ c) is also realized by x₂, and x₂ ∈ A* whenever x₁ ∈ A*, then x₁ ++ y ∈ A* implies x₂ ++ y ∈ A*. This is the core step in proving that A* has finitely many left-quotients.

        theorem RegularLanguages.leftQuotient_kstar_determined {α : Type u_1} (A : Language α) (x₁ x₂ : List α) (hS : A.leftQuotient '' {c : List α | pKStar.kstar A, x₁ = p ++ c c []} = A.leftQuotient '' {c : List α | pKStar.kstar A, x₂ = p ++ c c []}) (hB : (x₁ KStar.kstar A) = (x₂ KStar.kstar A)) :

        If x₁ and x₂ induce the same set of residual A-left-quotients and the same membership in A*, then their left-quotients under A* agree: A* / x₁ = A* / x₂. This is the equivalence relation underlying the Myhill–Nerode-style proof that A* is regular.

        theorem RegularLanguages.finite_range_prop {X : Type u_1} (f : XProp) :

        The range of any function f : X → Prop is finite (it has at most two elements, True and False, by propositional extensionality).

        theorem RegularLanguages.regular_star {α : Type u_1} {A : Language α} (hA : A.IsRegular) :

        Closure of regular languages under Kleene star.

        If A is a regular language then so is A* = A.kstar. The proof uses the Myhill–Nerode characterization: A* has finitely many left-quotients because each one is determined by the pair (set of residual A-left-quotients, membership of x in A*), and both components live in finite sets when A is regular.

        theorem RegularLanguages.DFA.pumping_lemma' {α : Type u_1} {σ : Type u_2} (M : DFA α σ) [Fintype σ] {s : List α} (hs : s M.accepts) (hlen : Fintype.card σ s.length) :
        ∃ (x : List α) (y : List α) (z : List α), s = x ++ y ++ z x.length + y.length Fintype.card σ y [] {x} * KStar.kstar {y} * {z} M.accepts

        Pumping lemma at the DFA level.

        If a DFA M with finite state space σ accepts a string s of length at least |σ|, then s factors as s = x ++ y ++ z with |xy| ≤ |σ|, y ≠ [], and {x} · {y}* · {z} ⊆ L(M).

        theorem RegularLanguages.pumping_lemma {α : Type u_1} {A : Language α} (hA : IsRegular A) :
        ∃ (p : ), 0 < p sA, p s.length∃ (x : List α) (y : List α) (z : List α), s = x ++ y ++ z x.length + y.length p y [] {x} * KStar.kstar {y} * {z} A

        Pumping lemma for regular languages.

        For every regular language A, there is a pumping length p > 0 such that any s ∈ A with |s| ≥ p factors as s = xyz with

        1. xyⁱz ∈ A for all i ≥ 0 (encoded here as {x} · {y}* · {z} ⊆ A),
        2. y ≠ ε,
        3. |xy| ≤ p.

        The pumping length is taken to be the cardinality of the state space of a DFA recognizing A.

        def RegularLanguages.concat {α : Type u_1} (A B : Language α) :

        Concatenation of two languages: A · B = { ab | a ∈ A, b ∈ B }, realized in Mathlib as A * B.

        Instances For
          theorem RegularLanguages.leftQuotient_mul_determined {α : Type u_1} (A B : Language α) (x₁ x₂ : List α) (h1 : A.leftQuotient x₁ = A.leftQuotient x₂) (h2 : B.leftQuotient '' {c : List α | aA, x₁ = a ++ c} = B.leftQuotient '' {c : List α | aA, x₂ = a ++ c}) :
          (A * B).leftQuotient x₁ = (A * B).leftQuotient x₂

          The left-quotient of A * B by x is determined by the pair (A.leftQuotient x, { B.leftQuotient c | a ∈ A, x = a ++ c }).

          If two strings x₁, x₂ agree in both components then they have the same left-quotient under the concatenation A * B. This is the key lemma for showing A * B is regular via Myhill–Nerode.

          theorem RegularLanguages.regular_concat {α : Type u_1} {A B : Language α} (hA : A.IsRegular) (hB : B.IsRegular) :
          (A * B).IsRegular

          Closure of regular languages under concatenation.

          If A and B are regular then so is A · B = A * B. The proof uses the Myhill–Nerode characterization, showing that A * B has finitely many left-quotients because each is determined by a pair lying in the product of two finite sets.