Theorem 3.3: Oracle Inequality for Least Squares (Sub-Gaussian version) #

This file provides the sub-Gaussian version of Theorem 3.3 from High-Dimensional Statistics by Philippe Rigollet (MIT 18.657, 2015).

Statement #

Assume the general regression model Y = f + ε where ε ~ subG_n(σ²). Then for any δ ∈ (0,1), with probability at least 1 − δ, the least squares estimator θ̂_LS satisfies:

MSE(Φθ̂, f) ≤ inf_θ MSE(Φθ, f) + 64 σ²(r + log(1/δ)) / n

where MSE(Φθ, f) = (1/n)‖Φθ − f‖², r = rank(ΦᵀΦ), and the infimum is over all θ ∈ ℝ^M.

Proof approach #

The concentration bound on |Φ(θ̂ − θ̄)|² is derived from the sub-Gaussian assumption via subG_squared_norm_high_prob_bound (which uses Theorem 1.19). This requires the fundamental inequality |Φ(θ̂−θ̄)|² ≤ 2εᵀΦ(θ̂−θ̄), which is precisely thm_3_3_pythagorean_step from the misspecified case. The deterministic oracle inequality (thm_3_3_oracle_inequality_det) is then applied pointwise on the concentration event.

source

theorem Rigollet.Chapter3.thm_3_3_subgaussian_infimum {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n M : ℕ} (hn : 0 < ↑n) (Φ : Matrix (Fin n) (Fin M) ℝ) (f : Fin n → ℝ) (ε : Ω → Fin n → ℝ) (θhat : Ω → Fin M → ℝ) (θbar : Fin M → ℝ) (hProj : ∀ (v : Fin M → ℝ), (f - Φ.mulVec θbar) ⬝ᵥ Φ.mulVec v = 0) (hLS : ∀ (ω : Ω) (θ : Fin M → ℝ), (f + ε ω - Φ.mulVec (θhat ω)) ⬝ᵥ (f + ε ω - Φ.mulVec (θhat ω)) ≤ (f + ε ω - Φ.mulVec θ) ⬝ᵥ (f + ε ω - Φ.mulVec θ)) (σ : ℝ) (hσ : 0 < σ) (r : ℕ) (hr : r = (Φ.transpose * Φ).rank) (δ : ℝ) (hδ_pos : 0 < δ) (hδ_le : δ ≤ 1) (hsubG : ∀ (v : Fin n → ℝ), v ⬝ᵥ v ≤ 1 → ∀ (s : ℝ), ∫⁻ (ω : Ω), ENNReal.ofReal (Real.exp (s * ε ω ⬝ᵥ v)) ∂μ ≤ ENNReal.ofReal (Real.exp (s ^ 2 * σ ^ 2 / 2))) (hε_meas : ∀ (j : Fin n), Measurable fun (ω : Ω) => ε ω j) :

μ {ω : Ω | MSE' (Φ.mulVec (θhat ω)) f ≤ (⨅ (θ : Fin M → ℝ), MSE' (Φ.mulVec θ) f) + 64 * σ ^ 2 * (↑r + Real.log (1 / δ)) / ↑n} ≥ ENNReal.ofReal (1 - δ)

Theorem 3.3 (Sub-Gaussian version).

Under the general regression model Y = f + ε with ε ~ subG_n(σ²), the least squares estimator θ̂^LS satisfies, with probability at least 1 − δ:

MSE(Φθ̂, f) ≤ inf_θ MSE(Φθ, f) + 64 σ²(r + log(1/δ)) / n

where r = rank(ΦᵀΦ).

The concentration bound is derived internally from the sub-Gaussian noise assumption via subG_squared_norm_high_prob_bound, using the Pythagorean step to establish the fundamental inequality.

Documentation

Atlas.HighDimensionalStatistics.code.Chapter3.Thm_3_3_SubGaussian

Theorem 3.3: Oracle Inequality for Least Squares (Sub-Gaussian version) #

Statement #

Proof approach #