Documentation

Atlas.HighDimensionalStatistics.code.Chapter1.Problem_1_8

Problem 1.8: Median-of-Means Estimator #

Informal statement (Rigollet, Problem 1.8): Let X₁, …, Xₙ be n independent random variables with E[Xᵢ] = μ and Var(Xᵢ) ≤ σ². Fix δ ∈ (0,1) and factor n = K · G where G = ⌈8 log(1/δ)⌉. For g = 1,…,G, let X̄_g = (1/K) ∑_{i in group g} Xᵢ be the group mean.

Parts #

  1. Chebyshev on group means: Show P[X̄_g − μ > 2σ/√K] ≤ 1/4. Key idea: Var(X̄_g) = σ²/K by independence. Apply Chebyshev's inequality: P[|X̄_g − μ| ≥ 2σ/√K] ≤ Var(X̄_g)/(2σ/√K)² = (σ²/K)/(4σ²/K) = 1/4.

  2. Median reduces to binomial: Let μ̂ = median of {X̄₁,…,X̄_G}. Show P[μ̂ − μ > 2σ/√K] ≤ P[B ≥ G/2] where B ~ Bin(G, 1/4). Key idea: If the median exceeds μ + t, then at least half the group means exceed μ + t. Each exceeds with probability ≤ 1/4 (by Part 1), and by independence the count is dominated by Bin(G, 1/4).

  3. Full guarantee: Conclude P[μ̂ − μ > 4σ√(2 log(1/δ)/n)] ≤ δ. Key idea: Substituting K = n/G and G = 8 log(1/δ), the threshold 2σ/√K = 4σ√(2 log(1/δ)/n). Hoeffding's inequality for the binomial gives P[Bin(G,1/4) ≥ G/2] ≤ exp(−G/8) ≤ δ.

  4. Comparison: The book asks to compare with Hoeffding (Thm 1.7) and Chebyshev (Lemma 1.3). The median-of-means achieves sub-Gaussian-type concentration (O(σ√(log(1/δ)/n))) using only a second-moment assumption (bounded variance), without needing sub-Gaussian tails. It cannot conclude μ̂ − μ ~ subG(σ̄²/n) because the bound holds only for one-sided deviations and the median is not a linear statistic.

Lean formalization #

theorem Problem_1_8.problem_1_8_part1 {Ω : Type u_1} [MeasurableSpace Ω] {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {X_bar : Ω} {m σ_sq : } {K : } (hK : 0 < K) (hσ_sq : 0 σ_sq) (_hMeas : MeasureTheory.AEStronglyMeasurable X_bar P) (hMemLp : MeasureTheory.MemLp X_bar 2 P) (hE : (ω : Ω), X_bar ω P = m) (hVar : ProbabilityTheory.variance X_bar P σ_sq / K) :
P {ω : Ω | X_bar ω - m > 2 * (σ_sq / K)} ENNReal.ofReal (1 / 4)

Problem 1.8, Part 1: Chebyshev bound on a group mean.

If X̄ is a random variable with E[X̄] = m and Var(X̄) ≤ σ²/K, then P[X̄ - m > 2σ/√K] ≤ 1/4.

This follows from Chebyshev's inequality applied with threshold t = 2σ/√K: P[|X̄ - m| ≥ t] ≤ Var(X̄)/t² ≤ (σ²/K)/(4σ²/K) = 1/4. Since {X̄ - m > t} ⊆ {|X̄ - m| ≥ t}, the bound follows.

Here X̄ represents a single group mean X̄_g = (1/K) Σᵢ Xᵢ over a group of K independent random variables each with mean m and variance ≤ σ². The variance of the group mean is Var(X̄_g) = σ²/K by independence.

theorem Problem_1_8.problem_1_8_part2 {Ω : Type u_1} [MeasurableSpace Ω] {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {G : } (hG : 0 < G) {X_bar : Fin GΩ} {μ_hat : Ω} {m t : } (hMedian : ∀ (ω : Ω) (s : ), μ_hat ω > s → (G + 1) / 2 {g : Fin G | X_bar g ω > s}.card) (hBound : ∀ (g : Fin G), P {ω : Ω | X_bar g ω - m > t} ENNReal.ofReal (1 / 4)) :
P {ω : Ω | μ_hat ω - m > t} P {ω : Ω | (G + 1) / 2 {g : Fin G | X_bar g ω - m > t}.card}

Problem 1.8, Part 2: Median bound via binomial comparison.

Given G group means X̄₁,...,X̄_G and their median μ̂, if each satisfies P[X̄_g - m > t] ≤ 1/4, then P[μ̂ - m > t] ≤ P[#{g : X̄_g - m > t} ≥ ⌈G/2⌉].

The RHS equals P[B ≥ G/2] where B ~ Bin(G, 1/4) when the group means are independent, since B counts the number of group means exceeding m + t.

The key property of the median used is: if the median exceeds m + t, then at least half of the G group means must also exceed m + t. Therefore the event {μ̂ - m > t} is contained in the event that at least ⌈G/2⌉ group means exceed the threshold, giving the desired probability bound.

theorem Problem_1_8.problem_1_8_part3 {Ω : Type u_1} [MeasurableSpace Ω] {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {n K G : } (hn : n = K * G) (hK : 0 < K) (hG : 0 < G) {X : Fin nΩ} {μ_hat : Ω} {m σ_sq δ : } (hδ_pos : 0 < δ) (hδ_le : δ 1) (hσ_sq : 0 σ_sq) (hIndep : ProbabilityTheory.iIndepFun X P) (hMeas : ∀ (i : Fin n), Measurable (X i)) (hMemLp : ∀ (i : Fin n), MeasureTheory.MemLp (X i) 2 P) (hE : ∀ (i : Fin n), (ω : Ω), X i ω P = m) (hVar : ∀ (i : Fin n), ProbabilityTheory.variance (X i) P σ_sq) (hG_bound : G 8 * Real.log (1 / δ)) (hμ_hat : ∀ (ω : Ω) (s : ), μ_hat ω > s → (G + 1) / 2 {g : Fin G | 1 / K * j : Fin K, X g * K + j, ω > s}.card) :
P {ω : Ω | μ_hat ω - m > 4 * (2 * σ_sq * Real.log (1 / δ) / n)} ENNReal.ofReal δ

Problem 1.8, Part 3: Full median-of-means guarantee.

Let X₁,...,Xₙ be independent with E[Xᵢ] = m and Var(Xᵢ) ≤ σ². Factor n = K · G with G ≥ 8 log(1/δ). Let μ̂ be the median of group means. Then: P[μ̂ - m > 4σ√(2 log(1/δ)/n)] ≤ δ.

This combines Parts 1 and 2 with the substitution:

  • Each group has K = n/G variables, so Var(X̄_g) ≤ σ²/K.
  • The threshold 2σ/√K = 2σ√(G/n) = 2σ√(8 log(1/δ)/n) = 4σ√(2 log(1/δ)/n).
  • Hoeffding's inequality gives P[Bin(G,1/4) ≥ G/2] ≤ exp(-2G·(1/4)²) = exp(-G/8).
  • Since G ≥ 8 log(1/δ), we have exp(-G/8) ≤ exp(-log(1/δ)) = δ.