Atlas.HighDimensionalStatistics.code.Chapter5.Thm_5

source

noncomputable def FanoInequality.mixtureMeasure {Ω : Type u_1} [MeasurableSpace Ω] {M : ℕ} (P : Fin M → MeasureTheory.Measure Ω) :

MeasureTheory.Measure Ω

The mixture (average) measure: P̄ = (1/M) · Σⱼ Pⱼ.

This is the marginal distribution of X when the hypothesis index Z is drawn uniformly from {1, ..., M} and X | Z=j ∼ Pⱼ.

Instances For

source

theorem FanoInequality.mixtureMeasure_eq_infoTheory {Ω : Type u_1} [MeasurableSpace Ω] {M : ℕ} (P : Fin M → MeasureTheory.Measure Ω) :

mixtureMeasure P = InfoTheory.mixtureMeasure M P

Bridge lemma: the local mixtureMeasure equals InfoTheory.mixtureMeasure. The difference is purely syntactic: (M)⁻¹ vs 1 / M in ENNReal, which are definitionally equal via one_div.

source

theorem FanoInequality.fano_lemma {Ω : Type u_1} [MeasurableSpace Ω] (M : ℕ) (hM : 3 ≤ M) (P : Fin M → MeasureTheory.Measure Ω) [∀ (j : Fin M), MeasureTheory.IsProbabilityMeasure (P j)] (hac : ∀ (j k : Fin M), (P j).AbsolutelyContinuous (P k)) (hfin : ∀ (j k : Fin M), InformationTheory.klDiv (P j) (P k) ≠ ⊤) (ψ : Ω → Fin M) :

Measurable ψ → ∃ (j : Fin M), ((P j) {ω : Ω | ψ ω ≠ j}).toReal ≥ 1 - (1 / ↑M * ∑ j : Fin M, (InformationTheory.klDiv (P j) (mixtureMeasure P)).toReal + Real.log 2) / Real.log (↑M - 1)

Fano's lemma (information-theoretic core).

Derived from InfoTheory.fano_lemma. The local version adds the hac hypothesis (unused by the axiom but required by calling code) and universally quantifies over ψ in the conclusion.

source

theorem FanoInequality.kl_mixture_le_avg_pairwise {Ω : Type u_1} [MeasurableSpace Ω] (M : ℕ) (hM : 2 ≤ M) (P : Fin M → MeasureTheory.Measure Ω) [∀ (j : Fin M), MeasureTheory.IsProbabilityMeasure (P j)] (hac : ∀ (j k : Fin M), (P j).AbsolutelyContinuous (P k)) (hfin : ∀ (j k : Fin M), InformationTheory.klDiv (P j) (P k) ≠ ⊤) :

1 / ↑M * ∑ j : Fin M, (InformationTheory.klDiv (P j) (mixtureMeasure P)).toReal ≤ 1 / ↑M ^ 2 * ∑ j : Fin M, ∑ k : Fin M, (InformationTheory.klDiv (P j) (P k)).toReal

KL to mixture ≤ average pairwise KL (Jensen's inequality for KL).

Derived from InfoTheory.kl_mixture_le_avg_pairwise.

source

theorem FanoInequality.fano_avg_numerator_bound {Ω : Type u_1} [MeasurableSpace Ω] (M : ℕ) (hM : 2 ≤ M) (P : Fin M → MeasureTheory.Measure Ω) [∀ (j : Fin M), MeasureTheory.IsProbabilityMeasure (P j)] (hac : ∀ (j k : Fin M), (P j).AbsolutelyContinuous (P k)) (hfin : ∀ (j k : Fin M), InformationTheory.klDiv (P j) (P k) ≠ ⊤) (ψ : Ω → Fin M) (hψ : Measurable ψ) :

(1 / ↑M * ∑ j : Fin M, ((P j) {ω : Ω | ψ ω ≠ j}).toReal) * Real.log (↑M - 1) + Real.log 2 ≥ Real.log ↑M - 1 / ↑M ^ 2 * ∑ j : Fin M, ∑ k : Fin M, (InformationTheory.klDiv (P j) (P k)).toReal

Fano numerator bound (average error form): For M ≥ 2 and any test ψ,

pe_avg · log(M-1) + log 2 ≥ log M − (1/M²) Σⱼ Σₖ KL(Pⱼ ∥ Pₖ)

where pe_avg = (1/M) Σⱼ Pⱼ[ψ ≠ j] is the average testing error.

This is the "numerator form" of Fano's inequality that avoids dividing by log(M-1), making it valid for all M ≥ 2 (including M = 2 where log(M-1) = log 1 = 0).

Proof: Chain fano_entropy_core_bound (which gives pe_avg · log(M-1) + log 2 ≥ log M − (1/M) Σⱼ KL(Pⱼ ∥ P̄)) with kl_mixture_le_avg_pairwise (which gives (1/M) Σⱼ KL(Pⱼ ∥ P̄) ≤ (1/M²) Σⱼ Σₖ KL(Pⱼ ∥ Pₖ)).

Reference: Rigollet, proof of Theorem 5.10.

source

theorem FanoInequality.fano_inequality_general {Ω : Type u_1} [MeasurableSpace Ω] (M : ℕ) (hM : 2 ≤ M) (P : Fin M → MeasureTheory.Measure Ω) [∀ (j : Fin M), MeasureTheory.IsProbabilityMeasure (P j)] (hac : ∀ (j k : Fin M), (P j).AbsolutelyContinuous (P k)) (hfin : ∀ (j k : Fin M), InformationTheory.klDiv (P j) (P k) ≠ ⊤) (ψ : Ω → Fin M) :

Measurable ψ → ∃ (j : Fin M), ((P j) {ω : Ω | ψ ω ≠ j}).toReal * Real.log (↑M - 1) + Real.log 2 ≥ Real.log ↑M - 1 / ↑M ^ 2 * ∑ j : Fin M, ∑ k : Fin M, (InformationTheory.klDiv (P j) (P k)).toReal

Fano's inequality, numerator form (M ≥ 2).

Let P₁, ..., P_M, M ≥ 2 be probability distributions such that Pⱼ ≪ Pₖ. Then for any measurable test ψ : Ω → Fin M, there exists j such that:

Pⱼ[ψ ≠ j] · log(M-1) + log 2 ≥ log M − (1/M²) Σⱼₖ KL(Pⱼ ∥ Pₖ)

This is equivalent to the standard Fano bound Pⱼ[ψ ≠ j] ≥ 1 − (avg_KL + log 2) / log(M-1) when M ≥ 3 (so log(M-1) > 0), but it avoids division by log(M-1), making it well-defined and correct for M = 2 as well.

The proof applies fano_avg_numerator_bound (for the average error) and then uses a pigeonhole argument: some component error ≥ the average, and since log(M-1) ≥ 0, the numerator form is preserved.

Reference: Rigollet, Theorem 5.10 (restated in numerator form).

source

theorem FanoInequality.fano_inequality {Ω : Type u_1} [MeasurableSpace Ω] (M : ℕ) (hM : 3 ≤ M) (P : Fin M → MeasureTheory.Measure Ω) [∀ (j : Fin M), MeasureTheory.IsProbabilityMeasure (P j)] (hac : ∀ (j k : Fin M), (P j).AbsolutelyContinuous (P k)) (hfin : ∀ (j k : Fin M), InformationTheory.klDiv (P j) (P k) ≠ ⊤) (ψ : Ω → Fin M) :

Measurable ψ → ∃ (j : Fin M), ((P j) {ω : Ω | ψ ω ≠ j}).toReal ≥ 1 - (1 / ↑M ^ 2 * ∑ j : Fin M, ∑ k : Fin M, (InformationTheory.klDiv (P j) (P k)).toReal + Real.log 2) / Real.log (↑M - 1)

Theorem 5.10 (Fano's inequality, denominator form).

Let P₁, ..., P_M, M ≥ 2 be probability distributions on a measurable space (Ω, 𝓐) such that Pⱼ ≪ Pₖ for all j, k. Then for any M-ary test ψ : Ω → Fin M,

max_{1≤j≤M} Pⱼ[ψ(X) ≠ j] ≥ 1 − ((1/M²) Σ_{j,k} KL(Pⱼ ∥ Pₖ) + log 2) / log(M − 1)

The proof combines two steps:

fano_lemma: bounds error using average KL to the mixture P̄
kl_mixture_le_avg_pairwise: bounds KL to mixture by average pairwise KL

The combination works by monotonicity: since A ≤ B (from step 2), we have 1 − (A + log 2)/log(M−1) ≥ 1 − (B + log 2)/log(M−1) when log(M−1) ≥ 0.

Requires M ≥ 3 because dividing by log(M-1) = 0 when M = 2 produces a false bound in Lean's convention where x / 0 = 0. For M ≥ 2, use fano_inequality_general.

Documentation

Atlas.HighDimensionalStatistics.code.Chapter5.Thm_5_10