Documentation

Atlas.HighDimensionalStatistics.code.Chapter5.Problem_5_2

Problem 5.2 — Exercise #

From: Rigollet, "High Dimensional Statistics", Chapter 5 Problem Set

Exercise for the reader. This problem is left as an exercise in the textbook and does not need to be fully proven in the formalization. The sorry below is intentional — it marks a non-trivial statistical argument (Neyman-Pearson + AM-GM + Pinsker's inequality chain) that is outside the scope of the core chapter formalization.

Statement: For probability measures P₀, P₁ and any measurable test ψ, max(P₀(ψ≠0), P₁(ψ≠1)) ≥ (1/4)·exp(-KL(P₀,P₁)).

noncomputable def Chapter5.Problem52.klDivReal {Ω : Type u_1} [MeasurableSpace Ω] (P Q : MeasureTheory.Measure Ω) :

Real-valued KL divergence. Defined as (klDiv P Q).toReal, where klDiv is Mathlib's InformationTheory.klDiv taking values in ℝ≥0∞. For probability measures P absolutely continuous w.r.t. Q with integrable log-likelihood ratio, this equals ∫ log(dP/dQ) dP.

Instances For

    KL divergence (real-valued) is nonneg, since it is the toReal of an ENNReal value.

    theorem Chapter5.Problem52.problem_5_2 {Ω : Type u_1} {x✝ : MeasurableSpace Ω} (P₀ P₁ : MeasureTheory.Measure Ω) (hP₀ : MeasureTheory.IsProbabilityMeasure P₀) (hP₁ : MeasureTheory.IsProbabilityMeasure P₁) (ψ : ΩBool) ( : Measurable ψ) :
    max (P₀ {ω : Ω | ψ ω = true}).toReal (P₁ {ω : Ω | ψ ω = false}).toReal 1 / 4 * Real.exp (-klDivReal P₀ P₁)

    Problem 5.2. For probability measures P₀, P₁ and any measurable test ψ, max(P₀(ψ≠0), P₁(ψ≠1)) ≥ (1/4)·exp(-KL(P₀,P₁)).

    Hint (from book): Use Neyman-Pearson (Lemma 5.3) and the inequality min(a,b) ≤ √(ab) (AM-GM) combined with the variational characterization of TV distance and Pinsker's inequality. Specifically: P₀(ψ=1) + P₁(ψ=0) ≥ 1 - TV(P₀,P₁) ≥ 1 - √(KL(P₀,P₁)) Then use the sharper bound max(P₀(ψ≠0), P₁(ψ≠1)) ≥ (1/4)·exp(-KL) which follows from the relation between the Hellinger affinity and KL.