Theorem 3.3: Oracle Inequality for Least Squares (Sub-Gaussian version)
This file provides the sub-Gaussian version of Theorem 3.3 from High-Dimensional Statistics by Philippe Rigollet (MIT 18.657, 2015).
Statement
Assume the general regression model Y = f + ε where ε ~ subG_n(σ²). Then for any δ ∈ (0,1), with probability at least 1 − δ, the least squares estimator θ̂_LS satisfies:
MSE(Φθ̂, f) ≤ inf_θ MSE(Φθ, f) + 64 σ²(r + log(1/δ)) / n
where MSE(Φθ, f) = (1/n)‖Φθ − f‖², r = rank(ΦᵀΦ), and the infimum is over all θ ∈ ℝ^M.
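The statement can be sanity-checked numerically. The sketch below is an illustration only, not part of the formal development: it assumes Gaussian noise N(0, σ²) (one special case of sub-Gaussian noise with variance proxy σ²), a random design with one deliberately redundant column so that r < M, and an arbitrary sine-shaped f outside the column span. All function names (`orthonormal_basis`, `project`, `mse`, `simulate`) are hypothetical helpers introduced here.

```python
import math
import random


def orthonormal_basis(columns, tol=1e-10):
    """Modified Gram-Schmidt; linearly dependent columns are dropped,
    so len(result) equals the rank r of the design."""
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:
            c = sum(a * b for a, b in zip(q, v))
            v = [a - c * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > tol:
            basis.append([a / norm for a in v])
    return basis


def project(basis, y):
    """Orthogonal projection of y onto the span of the orthonormal basis.
    Applied to Y this is the least squares fit; applied to f it is the oracle."""
    out = [0.0] * len(y)
    for q in basis:
        c = sum(a * b for a, b in zip(q, y))
        out = [o + c * b for o, b in zip(out, q)]
    return out


def mse(u, v):
    """(1/n) * squared Euclidean distance between u and v."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def simulate(n=200, M=5, sigma=1.0, delta=0.05, trials=200, seed=0):
    rng = random.Random(seed)
    # Design columns; the last one is the sum of the first two, so rank = M - 1.
    columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(M - 1)]
    columns.append([a + b for a, b in zip(columns[0], columns[1])])
    basis = orthonormal_basis(columns)
    r = len(basis)

    # Arbitrary regression function, generally not in the column span.
    f = [math.sin(0.1 * i) for i in range(n)]
    oracle = mse(project(basis, f), f)  # inf over theta of MSE(Phi theta, f)
    bound = 64 * sigma ** 2 * (r + math.log(1 / delta)) / n

    # Largest excess risk MSE(fit, f) - oracle observed over all trials.
    worst = 0.0
    for _ in range(trials):
        y = [fi + rng.gauss(0, sigma) for fi in f]
        worst = max(worst, mse(project(basis, y), f) - oracle)
    return r, worst, bound
```

Calling `simulate()` returns the detected rank, the largest observed excess risk, and the remainder term 64 σ²(r + log(1/δ)) / n. A simulation of this kind can only fail to contradict the bound; it does not replace the proof.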
Proof approach
The proof combines three ingredients. First, thm_3_3_pythagorean_step from
the misspecified case gives the fundamental inequality
‖Φ(θ̂ − θ̄)‖² ≤ 2εᵀΦ(θ̂ − θ̄). Second, the sub-Gaussian assumption turns this
into a high-probability bound on ‖Φ(θ̂ − θ̄)‖² via
subG_squared_norm_high_prob_bound (which uses Theorem 1.19). Third, the
deterministic oracle inequality (thm_3_3_oracle_inequality_det) is applied
pointwise on the concentration event.
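The chain of inequalities behind this argument can be written out as follows. This is a sketch in the notation of the statement, assuming (as in Rigollet's notes; it is not defined in this file) that θ̄ is an oracle parameter, meaning Φθ̄ is the orthogonal projection of f onto the column span of Φ. The constant 64 is taken from the statement and is not re-derived here.

```latex
\begin{align*}
\|Y - \Phi\hat\theta\|^2 &\le \|Y - \Phi\bar\theta\|^2
  && \text{(least squares optimality)} \\
\|\Phi\hat\theta - f\|^2 - \|\Phi\bar\theta - f\|^2
  &\le 2\,\varepsilon^\top \Phi(\hat\theta - \bar\theta)
  && \text{(substitute } Y = f + \varepsilon \text{ and expand)} \\
\|\Phi\hat\theta - f\|^2
  &= \|\Phi(\hat\theta - \bar\theta)\|^2 + \|\Phi\bar\theta - f\|^2
  && \text{(Pythagoras, since } f - \Phi\bar\theta \perp \operatorname{col}(\Phi) \text{)} \\
\|\Phi(\hat\theta - \bar\theta)\|^2
  &\le 2\,\varepsilon^\top \Phi(\hat\theta - \bar\theta)
  && \text{(fundamental inequality)} \\
\|\Phi(\hat\theta - \bar\theta)\|^2
  &\le 64\,\sigma^2 \bigl(r + \log(1/\delta)\bigr)
  && \text{(with probability } \ge 1 - \delta \text{)} \\
\mathrm{MSE}(\Phi\hat\theta, f)
  &\le \inf_{\theta} \mathrm{MSE}(\Phi\theta, f)
     + \frac{64\,\sigma^2 \bigl(r + \log(1/\delta)\bigr)}{n}
  && \text{(divide the Pythagorean identity by } n \text{)}
\end{align*}
```

The last line uses that MSE(Φθ̄, f) equals the infimum over θ, because Φθ̄ is the closest point to f in the column span of Φ.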
Theorem 3.3 (Sub-Gaussian version).
Under the general regression model Y = f + ε with ε ~ subG_n(σ²), for any δ ∈ (0,1) the least squares estimator θ̂_LS satisfies, with probability at least 1 − δ:
MSE(Φθ̂, f) ≤ inf_θ MSE(Φθ, f) + 64 σ²(r + log(1/δ)) / n
where r = rank(ΦᵀΦ).
The concentration bound is derived internally from the sub-Gaussian noise
assumption via subG_squared_norm_high_prob_bound, using the Pythagorean
step to establish the fundamental inequality.