Documentation

Atlas.HighDimensionalStatistics.code.Chapter2.Cor_2_9

Corollary 2.9: Expected MSE Bound for ℓ₀-Constrained Least Squares #

This file formalizes Corollary 2.9 from Rigollet's High-Dimensional Statistics.

Main result #

Under the assumptions of Theorem 2.6, the expected MSE of the ℓ₀-constrained least squares estimator satisfies:

E[MSE(X θ̂_{B₀(k)}^LS)] ≲ σ² k / n · log(ed/k)

Proof outline #

The proof uses the layer-cake formula to convert the tail bound from Theorem 2.6 (equation 2.6) into an expectation bound:

  1. E[Z] = ∫₀^∞ P(Z > u) du ≤ H + ∫₀^∞ P(Z > u + H) du for any H ≥ 0
  2. Apply the Thm 2.6 tail bound: P(Z > n·u) ≤ C_tail · exp(-α·u) where α = n/(32σ²) and C_tail = Σ_{j=1}^{2k} C(d,j) · 6^{2k}
  3. Integrate: E[Z] ≤ H + C_tail · exp(-α·H) / α
  4. Choose H = log(C_tail)/α so that C_tail · exp(-α·H) = 1
  5. Conclude: E[Z] ≤ (log(C_tail) + 1) / α ≲ σ²k log(ed/k) / n

References #

Verification status #

All theorems in this file are fully proved (no sorry, no axiom). All transitive dependencies (including Thm_2_6 and Chapter 1 results) are also fully proved.

theorem layer_cake_choose_H (C_tail : ) (hC : 1 C_tail) (α : ) ( : 0 < α) (expected_val : ) (htail_int : ∀ (H : ), 0 Hexpected_val H + C_tail * Real.exp (-(α * H)) * (1 / α)) :
expected_val (Real.log C_tail + 1) / α

Layer-cake bound with exponential tail. If for every H ≥ 0, the expected value is bounded by H + C·exp(-αH)/α, then choosing H = log(C)/α (when C ≥ 1, α > 0) yields the bound (log(C) + 1)/α.

theorem cor_2_9_expected_mse (n d k : ) (_hn : 0 < n) (_hk : 1 k) (_hkd : 2 * k d) (σsq : ) (_hσ : 0 < σsq) (α : ) ( : 0 < α) (_hα_def : α = n / (32 * σsq)) (C_tail : ) (hC : 1 C_tail) (expected_mse : ) (htail_int : ∀ (H : ), 0 Hexpected_mse H + C_tail * Real.exp (-(α * H)) * (1 / α)) :
expected_mse (Real.log C_tail + 1) / α

Corollary 2.9 (Expected MSE bound for ℓ₀-constrained LS).

Under Theorem 2.6 assumptions, E[MSE(X θ̂)] ≲ σ²k log(ed/k)/n.

From Theorem 2.6 (equation 2.6): P(|X(θ̂ - θ*)|² > n·u) ≤ C_tail · exp(-n·u/(32σ²)) where C_tail = Σ_{j=1}^{2k} C(d,j) · 6^{2k}.

The layer-cake formula yields E[MSE] ≤ H + C_tail·exp(-αH)/α for any H ≥ 0, where α = n/(32σ²). Choosing H = log(C_tail)/α gives: E[MSE] ≤ (log(C_tail) + 1) / α

Since C_tail ≤ (ed/(2k))^{2k} · 6^{2k} (by Lemma 2.7), we have log(C_tail) ≲ 2k · log(ed/k), and 1/α = 32σ²/n, so E[MSE] ≲ σ²k log(ed/k) / n.

theorem layer_cake_exp_tail_to_integral_bound {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (f : Ω) (hf_nn : ∀ (ω : Ω), 0 f ω) (C α : ) (hC : 0 C) ( : 0 < α) (htail : ∀ (t : ), 0 < tμ {ω : Ω | f ω > t} ENNReal.ofReal (C * Real.exp (-(α * t)))) (H : ) :
0 H (ω : Ω), f ω μ H + C * Real.exp (-(α * H)) * (1 / α)

Layer-cake formula with exponential tail bound (generic).

For a nonneg random variable f on a probability space, if the tail satisfies P(f > t) ≤ C · exp(-α·t) for all t > 0, then by the layer-cake formula E[f] = ∫₀^∞ P(f > t) dt, splitting at H ≥ 0: E[f] ≤ H + C · exp(-α·H) / α

The proof uses the layer-cake formula (Cavalieri's principle) from Mathlib (MeasureTheory.Integrable.integral_eq_integral_meas_lt), splits the integral at H into [0,H] and (H,∞), bounds each part using the probability bound g(t) ≤ 1 and the exponential tail bound respectively, and evaluates the exponential integral using integral_exp_mul_Ioi.

theorem layer_cake_expected_mse_from_tail_bound {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n d : } (hn : 0 < n) (_hd : 0 < d) (X : Matrix (Fin n) (Fin d) ) (θstar : Fin d) (ε : ΩFin n) (θhat : ΩFin d) (k : ) (hk : 1 k) (hkd : 2 * k d) (σ : ) ( : 0 < σ) (hhat_sparse : ∀ (ω : Ω), {j : Fin d | θhat ω j 0}.card k) (hstar_sparse : {j : Fin d | θstar j 0}.card k) (hLS : ∀ (ω : Ω) (θ : Fin d), {j : Fin d | θ j 0}.card k → (X.mulVec θstar + ε ω - X.mulVec (θhat ω)) ⬝ᵥ (X.mulVec θstar + ε ω - X.mulVec (θhat ω)) (X.mulVec θstar + ε ω - X.mulVec θ) ⬝ᵥ (X.mulVec θstar + ε ω - X.mulVec θ)) (hsubG : ∀ (v : Fin n), v ⬝ᵥ v 1∀ (s : ), MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (s * ε ω ⬝ᵥ v)) μ (ω : Ω), Real.exp (s * ε ω ⬝ᵥ v) μ Real.exp (s ^ 2 * σ ^ 2 / 2)) (C_tail : ) (hC_tail : C_tail = (d.choose (2 * k)) * 6 ^ (2 * k)) (hC_ge_1 : 1 C_tail) (α : ) (hα_def : α = n / (32 * σ ^ 2)) (hα_pos : 0 < α) (H : ) :
0 H (ω : Ω), 1 / n * X.mulVec (θhat ω - θstar) ⬝ᵥ X.mulVec (θhat ω - θstar) μ H + C_tail * Real.exp (-(α * H)) * (1 / α)

Layer-cake integration with tail bound (Corollary 2.9 proof step).

This is the core measure-theoretic step in the proof of Corollary 2.9. Given the tail bound from Theorem 2.6: P(|X(θ̂ - θ*)|² > 4t) ≤ C_tail · exp(-t/(8σ²)) the layer-cake formula E[Z] = ∫₀^∞ P(Z > u) du yields, for any H ≥ 0: E[(1/n)|X(θ̂ - θ*)|²] ≤ H + C_tail · exp(-α·H) · (1/α) where α = n/(32σ²) and C_tail = C(d,2k) · 6^{2k}.

The proof applies the layer-cake formula (via layer_cake_exp_tail_to_integral_bound) to Z = (1/n)|X(θ̂-θ*)|², using sparse_ls_tail_bound from Theorem 2.6 to supply the exponential tail bound P(Z > u) ≤ C_tail · exp(-α·u).

theorem log_tail_const_bound (d k : ) (hk : 1 k) (hkd : 2 * k d) (C_tail : ) (_hC_tail : C_tail = (d.choose (2 * k)) * 6 ^ (2 * k)) (hC_ge_1 : 1 C_tail) :
C₀ > 0, Real.log C_tail + 1 C₀ * k * Real.log (Real.exp 1 * d / k)

Algebraic bound on log of binomial-times-power tail constant.

For C_tail = C(d,2k) · 6^{2k} with 1 ≤ k and 2k ≤ d, we have: (log C_tail + 1) ≤ C₀ · k · log(e·d/k) for some universal constant C₀.

This follows from the standard bound C(d,j) ≤ (ed/j)^j (Lemma 2.7), giving log C_tail ≤ 2k·log(ed/(2k)) + 2k·log 6 = 2k·log(3ed/k), and the fact that log(ed/k) ≥ 1 when ed/k ≥ e (which holds since d ≥ 2k ≥ 2).

theorem cor_2_9_expected_mse_bound {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n d : } (hn : 0 < n) (hd : 0 < d) (X : Matrix (Fin n) (Fin d) ) (θstar : Fin d) (ε : ΩFin n) (θhat : ΩFin d) (k : ) (hk : 1 k) (hkd : 2 * k d) (σ : ) ( : 0 < σ) (hhat_sparse : ∀ (ω : Ω), {j : Fin d | θhat ω j 0}.card k) (hstar_sparse : {j : Fin d | θstar j 0}.card k) (hLS : ∀ (ω : Ω) (θ : Fin d), {j : Fin d | θ j 0}.card k → (X.mulVec θstar + ε ω - X.mulVec (θhat ω)) ⬝ᵥ (X.mulVec θstar + ε ω - X.mulVec (θhat ω)) (X.mulVec θstar + ε ω - X.mulVec θ) ⬝ᵥ (X.mulVec θstar + ε ω - X.mulVec θ)) (hsubG : ∀ (v : Fin n), v ⬝ᵥ v 1∀ (s : ), MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (s * ε ω ⬝ᵥ v)) μ (ω : Ω), Real.exp (s * ε ω ⬝ᵥ v) μ Real.exp (s ^ 2 * σ ^ 2 / 2)) :
C > 0, (ω : Ω), 1 / n * X.mulVec (θhat ω - θstar) ⬝ᵥ X.mulVec (θhat ω - θstar) μ C * σ ^ 2 * (k / n) * Real.log (Real.exp 1 * d / k)

Corollary 2.9 (Expected MSE bound for ℓ₀-constrained LS).

Under the assumptions of Theorem 2.6, the expected MSE satisfies: E[MSE(X θ̂)] ≲ σ²k log(ed/k)/n

This follows from integrating the tail bound (2.6) using the layer-cake formula.

Rigollet, Chapter 2, Corollary 2.9.