Documentation

Atlas.HighDimensionalStatistics.code.Chapter5.DefProp_5_4_Helpers

theorem TotalVariation.measure_diff_le_of_positive_set {Ω : Type u_1} [MeasurableSpace Ω] (P₀ P₁ μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure P₀] [MeasureTheory.IsProbabilityMeasure P₁] [MeasureTheory.SigmaFinite μ] (hac₀ : P₀.AbsolutelyContinuous μ) (hac₁ : P₁.AbsolutelyContinuous μ) (T : Set Ω) (hT : MeasurableSet T) :
(P₀ T).toReal - (P₁ T).toReal (P₀ {ω : Ω | P₁.rnDeriv μ ω < P₀.rnDeriv μ ω}).toReal - (P₁ {ω : Ω | P₁.rnDeriv μ ω < P₀.rnDeriv μ ω}).toReal

Key sub-lemma: for any measurable T, P₀(T) - P₁(T) ≤ P₀(R) - P₁(R) where R = {dP₁/dμ < dP₀/dμ} is the positive set of the signed measure P₀ - P₁. The proof splits T and R along their intersection and uses that (p₀-p₁) is nonneg on R and nonpos on Rᶜ.

theorem TotalVariation.tvDist_eq_diff_on_positive_set {Ω : Type u_1} [MeasurableSpace Ω] (P₀ P₁ μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure P₀] [MeasureTheory.IsProbabilityMeasure P₁] [MeasureTheory.SigmaFinite μ] (hac₀ : P₀.AbsolutelyContinuous μ) (hac₁ : P₁.AbsolutelyContinuous μ) :
Chapter5.TVNP.tvDist P₀ P₁ = (P₀ {ω : Ω | P₁.rnDeriv μ ω < P₀.rnDeriv μ ω}).toReal - (P₁ {ω : Ω | P₁.rnDeriv μ ω < P₀.rnDeriv μ ω}).toReal

The TV supremum is achieved at the set R = {p₁ < p₀}. TV(P₀, P₁) = P₀(R) - P₁(R) where R = {dP₁/dμ < dP₀/dμ}.

Definition-Proposition 5.4 (characterization (ii)): TV(P₀, P₁) = sup_{A measurable} |∫_A (p₀ - p₁) dμ| where p₀, p₁ are Radon-Nikodym derivatives w.r.t. a dominating measure μ.

Definition-Proposition 5.4 (characterization (iii)): TV(P₀, P₁) = (1/2) ∫ |p₀ - p₁| dμ (L¹ characterization).

Definition-Proposition 5.4 (characterization (iv)): TV(P₀, P₁) = 1 - ∫ min(p₀, p₁) dμ.

theorem TotalVariation.tvDist_le_testing_error {Ω : Type u_1} {x✝ : MeasurableSpace Ω} (P₀ P₁ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure P₀] [MeasureTheory.IsProbabilityMeasure P₁] (ψ : ΩBool) ( : Measurable ψ) :
(P₀ {ω : Ω | ψ ω = true}).toReal + (P₁ {ω : Ω | ψ ω = false}).toReal 1 - Chapter5.TVNP.tvDist P₀ P₁

Definition-Proposition 5.4 (characterization (v), inequality direction): For any test ψ, P₀(ψ=1) + P₁(ψ=0) ≥ 1 - TV(P₀, P₁).

Definition-Proposition 5.4 (characterization (v), equality direction): The LR test achieves P₀(ψ*=1) + P₁(ψ*=0) = 1 - TV(P₀, P₁).

theorem TotalVariation.defProp_5_4 {Ω : Type u_1} [MeasurableSpace Ω] (P₀ P₁ μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure P₀] [MeasureTheory.IsProbabilityMeasure P₁] [MeasureTheory.SigmaFinite μ] (hac₀ : P₀.AbsolutelyContinuous μ) (hac₁ : P₁.AbsolutelyContinuous μ) :
Chapter5.TVNP.tvDist P₀ P₁ = sSup {x : | ∃ (S : Set Ω), MeasurableSet S x = | (ω : Ω) in S, (P₀.rnDeriv μ ω).toReal - (P₁.rnDeriv μ ω).toReal μ|} Chapter5.TVNP.tvDist P₀ P₁ = 1 / 2 * (x : Ω), |(P₀.rnDeriv μ x).toReal - (P₁.rnDeriv μ x).toReal| μ Chapter5.TVNP.tvDist P₀ P₁ = 1 - (x : Ω), min (P₀.rnDeriv μ x).toReal (P₁.rnDeriv μ x).toReal μ (P₀ {ω : Ω | Chapter5.TVNP.lrTest P₁ P₀ μ ω = true}).toReal + (P₁ {ω : Ω | Chapter5.TVNP.lrTest P₁ P₀ μ ω = false}).toReal = 1 - Chapter5.TVNP.tvDist P₀ P₁

Definition-Proposition 5.4 (combined statement): All five characterizations of the total variation distance are equivalent:

(i) TV(P₀, P₁) = sup_{A measurable} |P₀(A) - P₁(A)| (definition) (ii) TV(P₀, P₁) = sup_{A measurable} |∫_A (p₀ - p₁) dμ| (iii) TV(P₀, P₁) = (1/2) ∫ |p₀ - p₁| dμ (iv) TV(P₀, P₁) = 1 - ∫ min(p₀, p₁) dμ (v) TV(P₀, P₁) = 1 - inf_ψ [P₀(ψ=1) + P₁(ψ=0)]

The first characterization (i) is the definition of Chapter5.TVNP.tvDist. Characterization (v) is expressed via the LR test achieving the infimum.