Documentation

Atlas.TheoryOfComputation.code.CFGtoPDA

def CFGtoPDA.IsPDARecognizable {T : Type u_pda} (A : Language T) :

A language A ⊆ Σ* is PDA-recognizable if there exist state and stack alphabet types together with a pushdown automaton P = (Q, Σ, Γ, δ, q₀, F) whose accepted language equals A.

Instances For
    inductive CFGtoPDA.QState (T : Type u_1) (N : Type) :
    Type u_1

    States of the PDA constructed from a CFG. The PDA has four kinds of states:

    • start — initial state; pushes the start variable and the bottom-of-stack marker;
    • loop — main loop that either pops a terminal matching the input, applies a rule by replacing a nonterminal on the stack, or transitions to accept when the bottom-of-stack marker is seen;
    • pushing L — intermediate states used to push a list L of grammar symbols onto the stack one symbol per step;
    • accept — the unique accepting state.
    Instances For
      def CFGtoPDA.toPDA {T : Type u_pda} (g : ContextFreeGrammar T) :
      PDA T (QState T g.NT) (Option (Symbol T g.NT))

      The standard PDA construction from a CFG g (Sipser, Lecture 4, "Converting CFGs to PDAs"). The stack alphabet is Option (Symbol T g.NT), where none is the bottom-of-stack marker and some s carries a grammar symbol s. The PDA:

      1. From start, pushes the bottom marker and moves into pushing [S] where S = g.initial.
      2. In loop, pops a nonterminal A and nondeterministically replaces it by the right-hand side of any rule A → r.output (using the pushing states to push symbols one at a time).
      3. In loop, pops a terminal t from the stack only if the next input symbol matches t.
      4. When the bottom marker is exposed in loop, moves to accept.
      Instances For
        theorem CFGtoPDA.toPDA_push_list {T : Type u_pda} {g : ContextFreeGrammar T} (L : List (Symbol T g.NT)) (input : List T) (stack : List (Option (Symbol T g.NT))) :

        From state pushing L on stack stack, the PDA can reach state loop having pushed all symbols of L onto the stack (resulting in L.reverse.map some ++ stack), without consuming any input.

        theorem CFGtoPDA.toPDA_pop_terminal {T : Type u_pda} {g : ContextFreeGrammar T} (t : T) (input : List T) (stack : List (Option (Symbol T g.NT))) :

        In loop, the PDA can pop a terminal t from the top of the stack while consuming a matching input symbol t.

        theorem CFGtoPDA.toPDA_apply_rule {T : Type u_pda} {g : ContextFreeGrammar T} (r : ContextFreeRule T g.NT) (hr : r g.rules) (input : List T) (stack : List (Option (Symbol T g.NT))) :

        In loop, if a nonterminal r.input is on top of the stack and r ∈ g.rules is a production with that left-hand side, the PDA can replace the nonterminal by r.output (reaching state loop with r.output.map some ++ stack on the stack).

        inductive CFGtoPDA.DerivesIn' {T : Type u_pda} (g : ContextFreeGrammar T) :
        List (Symbol T g.NT)List (Symbol T g.NT)Prop

        DerivesIn' g n γ δ means γ ⇒ⁿ δ in g: there is a derivation of length exactly n from γ to δ using the rules of g.

        Instances For
          theorem CFGtoPDA.derivesIn'_zero_eq {T : Type u_pda} {g : ContextFreeGrammar T} {γ δ : List (Symbol T g.NT)} (h : DerivesIn' g 0 γ δ) :
          γ = δ

          A zero-step derivation forces the start and end sentential forms to be equal.

          theorem CFGtoPDA.derivesIn'_snoc {T : Type u_pda} {g : ContextFreeGrammar T} {n : } {γ mid δ : List (Symbol T g.NT)} (h : DerivesIn' g n γ mid) (hp : g.Produces mid δ) :
          DerivesIn' g (n + 1) γ δ

          Append a single derivation step at the end: if γ ⇒ⁿ mid and mid ⇒ δ, then γ ⇒^{n+1} δ.

          theorem CFGtoPDA.derives_to_derivesIn' {T : Type u_pda} {g : ContextFreeGrammar T} {γ δ : List (Symbol T g.NT)} (h : g.Derives γ δ) :
          ∃ (n : ), DerivesIn' g n γ δ

          Any derivation γ ⇒* δ can be witnessed by a derivation of some explicit finite length n.

          theorem CFGtoPDA.produces_cons_terminal {T : Type u_pda} {g : ContextFreeGrammar T} {t : T} {γ' δ : List (Symbol T g.NT)} (h : g.Produces (Symbol.terminal t :: γ') δ) :
          ∃ (δ' : List (Symbol T g.NT)), δ = Symbol.terminal t :: δ' g.Produces γ' δ'

          If a single derivation step is applied to (terminal t) :: γ', the leading terminal is preserved: the result is (terminal t) :: δ' where γ' ⇒ δ'.

          theorem CFGtoPDA.derivesIn'_cons_terminal {T : Type u_pda} {g : ContextFreeGrammar T} {n : } {t : T} {γ' δ : List (Symbol T g.NT)} (h : DerivesIn' g n (Symbol.terminal t :: γ') δ) :
          ∃ (δ' : List (Symbol T g.NT)), δ = Symbol.terminal t :: δ' DerivesIn' g n γ' δ'

          Iterated version of produces_cons_terminal: in an n-step derivation starting from (terminal t) :: γ', the leading terminal is preserved throughout, and the suffix admits a matching n-step derivation.

          theorem CFGtoPDA.derivesIn'_nil_inv {T : Type u_pda} {g : ContextFreeGrammar T} {n : } {δ : List (Symbol T g.NT)} (h : DerivesIn' g n [] δ) :
          n = 0 δ = []

          A derivation starting from the empty sentential form must be a zero-step derivation and end at the empty sentential form.

          theorem CFGtoPDA.leftmost_step_n {T : Type u_pda} {g : ContextFreeGrammar T} (n : ) (A : g.NT) (γ' target : List (Symbol T g.NT)) (h : DerivesIn' g (n + 1) (Symbol.nonterminal A :: γ') target) (h_target : starget, ∃ (t : T), s = Symbol.terminal t) :
          rg.rules, r.input = A mn, DerivesIn' g m (r.output ++ γ') target

          Reordering derivations to apply the leftmost nonterminal first. If A :: γ' ⇒^{n+1} target and target is a string of terminals, then some rule r ∈ g.rules with r.input = A is applied (effectively first) and r.output ++ γ' ⇒^m target for some m ≤ n.

          theorem CFGtoPDA.toPDA_simulates {T : Type u_pda} {g : ContextFreeGrammar T} (n : ) (γ : List (Symbol T g.NT)) (w : List T) (extra : List (Option (Symbol T g.NT))) :

          Forward simulation. If the sentential form γ derives the terminal string w in n steps, then starting in state loop with γ (encoded on the stack above an arbitrary suffix extra) and input w, the PDA can reach state loop having consumed all input and exposed extra on top of the stack.

          theorem CFGtoPDA.derives_cons_same {T : Type u_pda} {g : ContextFreeGrammar T} (s : Symbol T g.NT) {γ δ : List (Symbol T g.NT)} (h : g.Derives γ δ) :
          g.Derives (s :: γ) (s :: δ)

          Prepending the same symbol to both sides preserves derivability: if γ ⇒* δ then s :: γ ⇒* s :: δ.

          theorem CFGtoPDA.no_step_from_accept {T : Type u_pda} {g : ContextFreeGrammar T} {w' : List T} {stk' : List (Option (Symbol T g.NT))} {cfg' : PDA.Config} (h : (toPDA g).Step (QState.accept, w', stk') cfg') :

          The accept state of the constructed PDA has no outgoing transitions: no Step originates from a configuration whose state is accept.

          theorem CFGtoPDA.reaches_from_accept {T : Type u_pda} {g : ContextFreeGrammar T} {w' : List T} {stk' : List (Option (Symbol T g.NT))} {cfg' : PDA.Config} (h : (toPDA g).Reaches (QState.accept, w', stk') cfg') :
          cfg' = (QState.accept, w', stk')

          The accept state is a sink: any configuration reachable from (accept, w', stk') is (accept, w', stk') itself.

          inductive CFGtoPDA.StepCount {T : Type u_pda} {σ : Type u_1} {γ₁ : Type u_2} (M : PDA T σ γ₁) :

          StepCount M n c₁ c₂ means the PDA M reaches configuration c₂ from c₁ in exactly n steps. This is the step-counted refinement of M.Reaches.

          Instances For
            theorem CFGtoPDA.StepCount.to_reaches {T : Type u_pda} {σ : Type u_1} {γ₁ : Type u_2} {M : PDA T σ γ₁} {n : } {c1 c2 : PDA.Config} (h : StepCount M n c1 c2) :
            M.Reaches c1 c2

            A bounded StepCount derivation gives an unbounded Reaches derivation.

            theorem CFGtoPDA.StepCount.snoc {T : Type u_pda} {σ : Type u_1} {γ₁ : Type u_2} {M : PDA T σ γ₁} {n : } {c1 c2 c3 : PDA.Config} (h1 : StepCount M n c1 c2) (h2 : M.Step c2 c3) :
            StepCount M (n + 1) c1 c3

            Append a single step at the end of a StepCount sequence: if c₁ →ⁿ c₂ and c₂ → c₃ then c₁ →^{n+1} c₃.

            theorem CFGtoPDA.stepCount_of_reaches {T : Type u_pda} {σ : Type u_1} {γ₁ : Type u_2} {M : PDA T σ γ₁} {c1 c2 : PDA.Config} (h : M.Reaches c1 c2) :
            ∃ (n : ), StepCount M n c1 c2

            Every Reaches derivation can be witnessed by an explicit StepCount of some length n.

            theorem CFGtoPDA.step_inv_aux {T : Type u_pda} {g : ContextFreeGrammar T} {q_src : QState T g.NT} {w_src : List T} {stk_src : List (Option (Symbol T g.NT))} {cfg' : PDA.Config} (h : (toPDA g).Step (q_src, w_src, stk_src) cfg') :
            ∃ (q' : QState T g.NT) (a : Option T) (pop : Option (Option (Symbol T g.NT))) (push : Option (Option (Symbol T g.NT))) (inp : List T) (stk : List (Option (Symbol T g.NT))), w_src = a.toList ++ inp stk_src = pop.toList ++ stk cfg' = (q', inp, push.toList ++ stk) (q', push) (toPDA g).step q_src a pop

            Inversion lemma for a single PDA step: any transition out of (q_src, w_src, stk_src) arises from some choice of next state q', optional consumed input symbol a, optional popped stack symbol pop, optional pushed stack symbol push, and tails inp, stk of the input and stack respectively, with (q', push) ∈ δ q_src a pop.

            theorem CFGtoPDA.pushing_stepCount_split {T : Type u_pda} {g : ContextFreeGrammar T} (n : ) {L : List (Symbol T g.NT)} {w_arg : List T} {stk_arg : List (Option (Symbol T g.NT))} {cfg' : PDA.Config} (h : StepCount (toPDA g) n (QState.pushing L, w_arg, stk_arg) cfg') (hn : n L.length + 1) :
            StepCount (toPDA g) (n - (L.length + 1)) (QState.loop, w_arg, List.map some L.reverse ++ stk_arg) cfg'

            Pushing a list L of length |L| consumes exactly |L| + 1 steps (one push per symbol plus one final transition from pushing [] to loop). If a StepCount from pushing L already takes at least |L| + 1 steps, then after that prefix the PDA is in (loop, w_arg, L.reverse.map some ++ stk_arg), and the remaining n - (|L|+1) steps reach the same target.

            theorem CFGtoPDA.pushing_needs_steps {T : Type u_pda} {g : ContextFreeGrammar T} (k : ) {L : List (Symbol T g.NT)} {w_arg : List T} {stk_arg : List (Option (Symbol T g.NT))} {cfg' : PDA.Config} (h : StepCount (toPDA g) k (QState.pushing L, w_arg, stk_arg) cfg') (hk : k < L.length + 1) :
            ∃ (L' : List (Symbol T g.NT)), cfg'.1 = QState.pushing L'

            Lower bound on the time to leave the pushing family of states. If fewer than |L| + 1 steps have been taken starting from pushing L, the PDA is still in some pushing L' state.

            Backward simulation (loop-to-loop). If from (loop, w, γ.map some ++ [⊥]) the PDA reaches (loop, [], [⊥]) in exactly n steps (where ⊥ = none is the bottom-of-stack marker), then in the grammar γ ⇒* w (with w viewed as a sequence of terminals).

            This is the core invariant for the converse direction: a successful run of the PDA from loop back to loop while consuming w and shrinking the stack from γ to empty corresponds to a derivation γ ⇒* w.

            theorem CFGtoPDA.stack_invariant_step {T : Type u_pda} {g : ContextFreeGrammar T} {c1 c2 : PDA.Config} (hs : (toPDA g).Step c1 c2) {L : List (Symbol T g.NT)} (hstk : c1.2.2 = List.map some L ++ [none]) (hna1 : c1.1 QState.accept) (hna2 : c2.1 QState.accept) (hns1 : c1.1 QState.start) :
            ∃ (L' : List (Symbol T g.NT)), c2.2.2 = List.map some L' ++ [none]

            Stack-shape invariant for one step. As long as the PDA is not transitioning into or out of the special start / accept states, a stack of the form L.map some ++ [none] (a list of grammar symbols on top of the bottom marker) is preserved as L'.map some ++ [none] after one step, for some new list L'.

            theorem CFGtoPDA.no_target_start {T : Type u_pda} {g : ContextFreeGrammar T} {c1 c2 : PDA.Config} (h : (toPDA g).Step c1 c2) :

            No transition in the constructed PDA targets the start state: once the PDA has left start, it never returns.

            theorem CFGtoPDA.stack_invariant_reaches {T : Type u_pda} {g : ContextFreeGrammar T} {c1 c2 : PDA.Config} (h : (toPDA g).Reaches c1 c2) {L : List (Symbol T g.NT)} (hstk : c1.2.2 = List.map some L ++ [none]) (hna1 : c1.1 QState.accept) (hna2 : c2.1 QState.accept) (hns1 : c1.1 QState.start) :
            ∃ (L' : List (Symbol T g.NT)), c2.2.2 = List.map some L' ++ [none]

            Reachable-version of stack_invariant_step: along any run between two non-accept configurations (where the start configuration is also not start), the stack shape L.map some ++ [⊥] is preserved, possibly with a different list L'.

            Backward direction, auxiliary form. Any accepting run of the constructed PDA on w — starting in (start, w, []) and ending in (accept, [], stk) for some final stack stk — yields a CFG derivation [S] ⇒* w from the start variable S = g.initial.

            Backward direction. If the PDA toPDA g accepts w, then g derives w from its start variable, i.e. w ∈ L(g).

            Correctness of the CFG → PDA construction. The PDA produced by toPDA g recognizes exactly the language of g: L(toPDA g) = L(g).

            Theorem (Converting CFGs to PDAs), Sipser Lecture 4. If A is a context-free language, then some pushdown automaton recognizes A.