Documentation

Atlas.HighDimensionalStatistics.code.Chapter3.Thm_3_3_SubGaussian

Theorem 3.3: Oracle Inequality for Least Squares (Sub-Gaussian version) #

This file provides the sub-Gaussian version of Theorem 3.3 from High-Dimensional Statistics by Philippe Rigollet (MIT 18.657, 2015).

Statement #

Assume the general regression model Y = f + ε where ε ~ subG_n(σ²). Then for any δ ∈ (0,1), with probability at least 1 − δ, the least squares estimator θ̂_LS satisfies:

MSE(Φθ̂, f) ≤ inf_θ MSE(Φθ, f) + 64 σ²(r + log(1/δ)) / n

where MSE(Φθ, f) = (1/n)‖Φθ − f‖², r = rank(ΦᵀΦ), and the infimum is over all θ ∈ ℝ^M.

Proof approach #

The concentration bound on |Φ(θ̂ − θ̄)|² is derived from the sub-Gaussian assumption via subG_squared_norm_high_prob_bound (which uses Theorem 1.19). This requires the fundamental inequality |Φ(θ̂−θ̄)|² ≤ 2εᵀΦ(θ̂−θ̄), which is precisely thm_3_3_pythagorean_step from the misspecified case. The deterministic oracle inequality (thm_3_3_oracle_inequality_det) is then applied pointwise on the concentration event.

theorem Rigollet.Chapter3.thm_3_3_subgaussian_infimum {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n M : } (hn : 0 < n) (Φ : Matrix (Fin n) (Fin M) ) (f : Fin n) (ε : ΩFin n) (θhat : ΩFin M) (θbar : Fin M) (hProj : ∀ (v : Fin M), (f - Φ.mulVec θbar) ⬝ᵥ Φ.mulVec v = 0) (hLS : ∀ (ω : Ω) (θ : Fin M), (f + ε ω - Φ.mulVec (θhat ω)) ⬝ᵥ (f + ε ω - Φ.mulVec (θhat ω)) (f + ε ω - Φ.mulVec θ) ⬝ᵥ (f + ε ω - Φ.mulVec θ)) (σ : ) ( : 0 < σ) (r : ) (hr : r = (Φ.transpose * Φ).rank) (δ : ) (hδ_pos : 0 < δ) (hδ_le : δ 1) (hsubG : ∀ (v : Fin n), v ⬝ᵥ v 1∀ (s : ), ∫⁻ (ω : Ω), ENNReal.ofReal (Real.exp (s * ε ω ⬝ᵥ v)) μ ENNReal.ofReal (Real.exp (s ^ 2 * σ ^ 2 / 2))) (hε_meas : ∀ (j : Fin n), Measurable fun (ω : Ω) => ε ω j) :
μ {ω : Ω | MSE' (Φ.mulVec (θhat ω)) f (⨅ (θ : Fin M), MSE' (Φ.mulVec θ) f) + 64 * σ ^ 2 * (r + Real.log (1 / δ)) / n} ENNReal.ofReal (1 - δ)

Theorem 3.3 (Sub-Gaussian version).

Under the general regression model Y = f + ε with ε ~ subG_n(σ²), the least squares estimator θ̂^LS satisfies, with probability at least 1 − δ:

MSE(Φθ̂, f) ≤ inf_θ MSE(Φθ, f) + 64 σ²(r + log(1/δ)) / n

where r = rank(ΦᵀΦ).

The concentration bound is derived internally from the sub-Gaussian noise assumption via subG_squared_norm_high_prob_bound, using the Pythagorean step to establish the fundamental inequality.