Corollary 2.9: Expected MSE Bound for ℓ₀-Constrained Least Squares #

This file formalizes Corollary 2.9 from Rigollet's High-Dimensional Statistics.

Main result #

Under the assumptions of Theorem 2.6, the expected MSE of the ℓ₀-constrained least squares estimator satisfies:

E[MSE(X θ̂_{B₀(k)}^LS)] ≲ σ² k / n · log(ed/k)

Proof outline #

The proof uses the layer-cake formula to convert the tail bound from Theorem 2.6 (equation 2.6) into an expectation bound:

E[Z] = ∫₀^∞ P(Z > u) du ≤ H + ∫₀^∞ P(Z > u + H) du for any H ≥ 0
Apply the Thm 2.6 tail bound: P(Z > n·u) ≤ C_tail · exp(-α·u) where α = n/(32σ²) and C_tail = Σ_{j=1}^{2k} C(d,j) · 6^{2k}
Integrate: E[Z] ≤ H + C_tail · exp(-α·H) / α
Choose H = log(C_tail)/α so that C_tail · exp(-α·H) = 1
Conclude: E[Z] ≤ (log(C_tail) + 1) / α ≲ σ²k log(ed/k) / n

References #

[Rigollet, High-Dimensional Statistics], Corollary 2.9

Verification status #

All theorems in this file are fully proved (no sorry, no axiom). All transitive dependencies (including Thm_2_6 and Chapter 1 results) are also fully proved.

source

theorem layer_cake_choose_H (C_tail : ℝ) (hC : 1 ≤ C_tail) (α : ℝ) (hα : 0 < α) (expected_val : ℝ) (htail_int : ∀ (H : ℝ), 0 ≤ H → expected_val ≤ H + C_tail * Real.exp (-(α * H)) * (1 / α)) :

expected_val ≤ (Real.log C_tail + 1) / α

Layer-cake bound with exponential tail. If for every H ≥ 0, the expected value is bounded by H + C·exp(-αH)/α, then choosing H = log(C)/α (when C ≥ 1, α > 0) yields the bound (log(C) + 1)/α.

source

theorem cor_2_9_expected_mse (n d k : ℕ) (_hn : 0 < ↑n) (_hk : 1 ≤ k) (_hkd : 2 * k ≤ d) (σsq : ℝ) (_hσ : 0 < σsq) (α : ℝ) (hα : 0 < α) (_hα_def : α = ↑n / (32 * σsq)) (C_tail : ℝ) (hC : 1 ≤ C_tail) (expected_mse : ℝ) (htail_int : ∀ (H : ℝ), 0 ≤ H → expected_mse ≤ H + C_tail * Real.exp (-(α * H)) * (1 / α)) :

expected_mse ≤ (Real.log C_tail + 1) / α

Corollary 2.9 (Expected MSE bound for ℓ₀-constrained LS).

Under Theorem 2.6 assumptions, E[MSE(X θ̂)] ≲ σ²k log(ed/k)/n.

From Theorem 2.6 (equation 2.6): P(|X(θ̂ - θ*)|² > n·u) ≤ C_tail · exp(-n·u/(32σ²)) where C_tail = Σ_{j=1}^{2k} C(d,j) · 6^{2k}.

The layer-cake formula yields E[MSE] ≤ H + C_tail·exp(-αH)/α for any H ≥ 0, where α = n/(32σ²). Choosing H = log(C_tail)/α gives: E[MSE] ≤ (log(C_tail) + 1) / α

Since C_tail ≤ (ed/(2k))^{2k} · 6^{2k} (by Lemma 2.7), we have log(C_tail) ≲ 2k · log(ed/k), and 1/α = 32σ²/n, so E[MSE] ≲ σ²k log(ed/k) / n.

source

theorem layer_cake_exp_tail_to_integral_bound {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (f : Ω → ℝ) (hf_nn : ∀ (ω : Ω), 0 ≤ f ω) (C α : ℝ) (hC : 0 ≤ C) (hα : 0 < α) (htail : ∀ (t : ℝ), 0 < t → μ {ω : Ω | f ω > t} ≤ ENNReal.ofReal (C * Real.exp (-(α * t)))) (H : ℝ) :

0 ≤ H → ∫ (ω : Ω), f ω ∂μ ≤ H + C * Real.exp (-(α * H)) * (1 / α)

Layer-cake formula with exponential tail bound (generic).

For a nonneg random variable f on a probability space, if the tail satisfies P(f > t) ≤ C · exp(-α·t) for all t > 0, then by the layer-cake formula E[f] = ∫₀^∞ P(f > t) dt, splitting at H ≥ 0: E[f] ≤ H + C · exp(-α·H) / α

The proof uses the layer-cake formula (Cavalieri's principle) from Mathlib (MeasureTheory.Integrable.integral_eq_integral_meas_lt), splits the integral at H into [0,H] and (H,∞), bounds each part using the probability bound g(t) ≤ 1 and the exponential tail bound respectively, and evaluates the exponential integral using integral_exp_mul_Ioi.

source

theorem layer_cake_expected_mse_from_tail_bound {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n d : ℕ} (hn : 0 < n) (_hd : 0 < d) (X : Matrix (Fin n) (Fin d) ℝ) (θstar : Fin d → ℝ) (ε : Ω → Fin n → ℝ) (θhat : Ω → Fin d → ℝ) (k : ℕ) (hk : 1 ≤ k) (hkd : 2 * k ≤ d) (σ : ℝ) (hσ : 0 < σ) (hhat_sparse : ∀ (ω : Ω), {j : Fin d | θhat ω j ≠ 0}.card ≤ k) (hstar_sparse : {j : Fin d | θstar j ≠ 0}.card ≤ k) (hLS : ∀ (ω : Ω) (θ : Fin d → ℝ), {j : Fin d | θ j ≠ 0}.card ≤ k → (X.mulVec θstar + ε ω - X.mulVec (θhat ω)) ⬝ᵥ (X.mulVec θstar + ε ω - X.mulVec (θhat ω)) ≤ (X.mulVec θstar + ε ω - X.mulVec θ) ⬝ᵥ (X.mulVec θstar + ε ω - X.mulVec θ)) (hsubG : ∀ (v : Fin n → ℝ), v ⬝ᵥ v ≤ 1 → ∀ (s : ℝ), MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (s * ε ω ⬝ᵥ v)) μ ∧ ∫ (ω : Ω), Real.exp (s * ε ω ⬝ᵥ v) ∂μ ≤ Real.exp (s ^ 2 * σ ^ 2 / 2)) (C_tail : ℝ) (hC_tail : C_tail = ↑(d.choose (2 * k)) * 6 ^ (2 * k)) (hC_ge_1 : 1 ≤ C_tail) (α : ℝ) (hα_def : α = ↑n / (32 * σ ^ 2)) (hα_pos : 0 < α) (H : ℝ) :

0 ≤ H → ∫ (ω : Ω), 1 / ↑n * X.mulVec (θhat ω - θstar) ⬝ᵥ X.mulVec (θhat ω - θstar) ∂μ ≤ H + C_tail * Real.exp (-(α * H)) * (1 / α)

Layer-cake integration with tail bound (Corollary 2.9 proof step).

This is the core measure-theoretic step in the proof of Corollary 2.9. Given the tail bound from Theorem 2.6: P(|X(θ̂ - θ*)|² > 4t) ≤ C_tail · exp(-t/(8σ²)) the layer-cake formula E[Z] = ∫₀^∞ P(Z > u) du yields, for any H ≥ 0: E[(1/n)|X(θ̂ - θ*)|²] ≤ H + C_tail · exp(-α·H) · (1/α) where α = n/(32σ²) and C_tail = C(d,2k) · 6^{2k}.

The proof applies the layer-cake formula (via layer_cake_exp_tail_to_integral_bound) to Z = (1/n)|X(θ̂-θ*)|², using sparse_ls_tail_bound from Theorem 2.6 to supply the exponential tail bound P(Z > u) ≤ C_tail · exp(-α·u).

source

theorem log_tail_const_bound (d k : ℕ) (hk : 1 ≤ k) (hkd : 2 * k ≤ d) (C_tail : ℝ) (_hC_tail : C_tail = ↑(d.choose (2 * k)) * 6 ^ (2 * k)) (hC_ge_1 : 1 ≤ C_tail) :

∃ C₀ > 0, Real.log C_tail + 1 ≤ C₀ * ↑k * Real.log (Real.exp 1 * ↑d / ↑k)

Algebraic bound on log of binomial-times-power tail constant.

For C_tail = C(d,2k) · 6^{2k} with 1 ≤ k and 2k ≤ d, we have: (log C_tail + 1) ≤ C₀ · k · log(e·d/k) for some universal constant C₀.

This follows from the standard bound C(d,j) ≤ (ed/j)^j (Lemma 2.7), giving log C_tail ≤ 2k·log(ed/(2k)) + 2k·log 6 = 2k·log(3ed/k), and the fact that log(ed/k) ≥ 1 when ed/k ≥ e (which holds since d ≥ 2k ≥ 2).

source

theorem cor_2_9_expected_mse_bound {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n d : ℕ} (hn : 0 < n) (hd : 0 < d) (X : Matrix (Fin n) (Fin d) ℝ) (θstar : Fin d → ℝ) (ε : Ω → Fin n → ℝ) (θhat : Ω → Fin d → ℝ) (k : ℕ) (hk : 1 ≤ k) (hkd : 2 * k ≤ d) (σ : ℝ) (hσ : 0 < σ) (hhat_sparse : ∀ (ω : Ω), {j : Fin d | θhat ω j ≠ 0}.card ≤ k) (hstar_sparse : {j : Fin d | θstar j ≠ 0}.card ≤ k) (hLS : ∀ (ω : Ω) (θ : Fin d → ℝ), {j : Fin d | θ j ≠ 0}.card ≤ k → (X.mulVec θstar + ε ω - X.mulVec (θhat ω)) ⬝ᵥ (X.mulVec θstar + ε ω - X.mulVec (θhat ω)) ≤ (X.mulVec θstar + ε ω - X.mulVec θ) ⬝ᵥ (X.mulVec θstar + ε ω - X.mulVec θ)) (hsubG : ∀ (v : Fin n → ℝ), v ⬝ᵥ v ≤ 1 → ∀ (s : ℝ), MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (s * ε ω ⬝ᵥ v)) μ ∧ ∫ (ω : Ω), Real.exp (s * ε ω ⬝ᵥ v) ∂μ ≤ Real.exp (s ^ 2 * σ ^ 2 / 2)) :

∃ C > 0, ∫ (ω : Ω), 1 / ↑n * X.mulVec (θhat ω - θstar) ⬝ᵥ X.mulVec (θhat ω - θstar) ∂μ ≤ C * σ ^ 2 * (↑k / ↑n) * Real.log (Real.exp 1 * ↑d / ↑k)

Corollary 2.9 (Expected MSE bound for ℓ₀-constrained LS).

Under the assumptions of Theorem 2.6, the expected MSE satisfies: E[MSE(X θ̂)] ≲ σ²k log(ed/k)/n

This follows from integrating the tail bound (2.6) using the layer-cake formula.

Rigollet, Chapter 2, Corollary 2.9.

Documentation

Atlas.HighDimensionalStatistics.code.Chapter2.Cor_2_9

Corollary 2.9: Expected MSE Bound for ℓ₀-Constrained Least Squares #

Main result #

Proof outline #

References #

Verification status #