Problem 1.8: Median-of-Means Estimator #
Informal statement (Rigollet, Problem 1.8): Let X₁, …, Xₙ be n independent random variables with E[Xᵢ] = μ and Var(Xᵢ) ≤ σ². Fix δ ∈ (0,1) and factor n = K · G where G = ⌈8 log(1/δ)⌉. For g = 1,…,G, let X̄_g = (1/K) ∑_{i in group g} Xᵢ be the group mean.
Parts #
Chebyshev on group means: Show P[X̄_g − μ > 2σ/√K] ≤ 1/4. Key idea: Var(X̄_g) = σ²/K by independence. Apply Chebyshev's inequality: P[|X̄_g − μ| ≥ 2σ/√K] ≤ Var(X̄_g)/(2σ/√K)² = (σ²/K)/(4σ²/K) = 1/4.
Median reduces to binomial: Let μ̂ = median of {X̄₁,…,X̄_G}. Show P[μ̂ − μ > 2σ/√K] ≤ P[B ≥ G/2] where B ~ Bin(G, 1/4). Key idea: If the median exceeds μ + t, then at least half the group means exceed μ + t. Each exceeds with probability ≤ 1/4 (by Part 1), and by independence the count is dominated by Bin(G, 1/4).
Full guarantee: Conclude P[μ̂ − μ > 4σ√(2 log(1/δ)/n)] ≤ δ. Key idea: Substituting K = n/G and G = 8 log(1/δ), the threshold 2σ/√K = 4σ√(2 log(1/δ)/n). Hoeffding's inequality for the binomial gives P[Bin(G,1/4) ≥ G/2] ≤ exp(−G/8) ≤ δ.
Comparison: The book asks to compare with Hoeffding (Thm 1.7) and Chebyshev (Lemma 1.3). The median-of-means achieves sub-Gaussian-type concentration (O(σ√(log(1/δ)/n))) using only a second-moment assumption (bounded variance), without needing sub-Gaussian tails. It cannot conclude μ̂ − μ ~ subG(σ̄²/n) because the bound holds only for one-sided deviations and the median is not a linear statistic.
Lean formalization #
- Part 1 is fully proved using Chebyshev's inequality from Mathlib.
- Parts 2 and 3 are stated with
sorry— the median property is encoded viahMedian/hμ_hathypotheses asserting that if the median exceeds a threshold, at least ⌈G/2⌉ group means do too.
Problem 1.8, Part 1: Chebyshev bound on a group mean.
If X̄ is a random variable with E[X̄] = m and Var(X̄) ≤ σ²/K, then P[X̄ - m > 2σ/√K] ≤ 1/4.
This follows from Chebyshev's inequality applied with threshold t = 2σ/√K: P[|X̄ - m| ≥ t] ≤ Var(X̄)/t² ≤ (σ²/K)/(4σ²/K) = 1/4. Since {X̄ - m > t} ⊆ {|X̄ - m| ≥ t}, the bound follows.
Here X̄ represents a single group mean X̄_g = (1/K) Σᵢ Xᵢ over a group of K independent random variables each with mean m and variance ≤ σ². The variance of the group mean is Var(X̄_g) = σ²/K by independence.
Problem 1.8, Part 2: Median bound via binomial comparison.
Given G group means X̄₁,...,X̄_G and their median μ̂, if each satisfies P[X̄_g - m > t] ≤ 1/4, then P[μ̂ - m > t] ≤ P[#{g : X̄_g - m > t} ≥ ⌈G/2⌉].
The RHS equals P[B ≥ G/2] where B ~ Bin(G, 1/4) when the group means are independent, since B counts the number of group means exceeding m + t.
The key property of the median used is: if the median exceeds m + t, then at least half of the G group means must also exceed m + t. Therefore the event {μ̂ - m > t} is contained in the event that at least ⌈G/2⌉ group means exceed the threshold, giving the desired probability bound.
Problem 1.8, Part 3: Full median-of-means guarantee.
Let X₁,...,Xₙ be independent with E[Xᵢ] = m and Var(Xᵢ) ≤ σ². Factor n = K · G with G ≥ 8 log(1/δ). Let μ̂ be the median of group means. Then: P[μ̂ - m > 4σ√(2 log(1/δ)/n)] ≤ δ.
This combines Parts 1 and 2 with the substitution:
- Each group has K = n/G variables, so Var(X̄_g) ≤ σ²/K.
- The threshold 2σ/√K = 2σ√(G/n) = 2σ√(8 log(1/δ)/n) = 4σ√(2 log(1/δ)/n).
- Hoeffding's inequality gives P[Bin(G,1/4) ≥ G/2] ≤ exp(-2G·(1/4)²) = exp(-G/8).
- Since G ≥ 8 log(1/δ), we have exp(-G/8) ≤ exp(-log(1/δ)) = δ.