Problem 2.1: Ridge Regression
From Rigollet Chapter 2, Problem 2.1.
Consider the linear regression model with fixed design with d ≤ n. The ridge regression estimator for parameter τ > 0 is defined as:
θ̂_ridge = argmin_θ { (1/n)|Y - Xθ|₂² + τ|θ|₂² }
(a) Show that for any τ > 0, θ̂_ridge is uniquely defined and give its closed form: θ̂_ridge = (XᵀX + nτI)⁻¹ XᵀY
(b) Compute the bias of θ̂_ridge and show that it is bounded in absolute value by |θ*|₂.
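Problem 2.1(a): The ridge estimator is the unique minimizer of the ridge objective. For any τ > 0, the matrix XᵀX + nτI is positive definite (hence invertible), since vᵀ(XᵀX + nτI)v = |Xv|₂² + nτ|v|₂² > 0 for every v ≠ 0. The objective (1/n)|Y - Xθ|₂² + τ|θ|₂² has Hessian (2/n)(XᵀX + nτI), so it is strictly convex and has at most one minimizer. Setting its gradient (2/n)Xᵀ(Xθ - Y) + 2τθ to zero gives (XᵀX + nτI)θ = XᵀY, so θ̂ = (XᵀX + nτI)⁻¹ XᵀY is uniquely defined and minimizes the objective over all θ ∈ ℝᵈ.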
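The closed form in part (a) can be sanity-checked numerically. The sketch below is only an illustration, not part of the proof: the simulated data, the seed, the sizes n = 50, d = 5, τ = 0.1, and the helper names `ridge_closed_form` and `ridge_objective` are all assumptions made for the example. It checks that the gradient of the objective vanishes at θ̂ and that randomly perturbed points have a strictly larger objective value.

```python
import numpy as np

def ridge_closed_form(X, Y, tau):
    """Return (X^T X + n*tau*I)^{-1} X^T Y (hypothetical helper name)."""
    n, d = X.shape
    A = X.T @ X + n * tau * np.eye(d)  # positive definite for tau > 0
    return np.linalg.solve(A, X.T @ Y)

def ridge_objective(X, Y, theta, tau):
    """Return (1/n)|Y - X theta|_2^2 + tau |theta|_2^2."""
    n = X.shape[0]
    resid = Y - X @ theta
    return (resid @ resid) / n + tau * (theta @ theta)

# Simulated fixed design; all values here are illustrative assumptions.
rng = np.random.default_rng(0)
n, d, tau = 50, 5, 0.1
X = rng.normal(size=(n, d))
Y = X @ rng.normal(size=d) + rng.normal(size=n)

theta_hat = ridge_closed_form(X, Y, tau)

# Gradient of the objective: (2/n) X^T (X theta - Y) + 2 tau theta.
grad = (2.0 / n) * (X.T @ (X @ theta_hat - Y)) + 2.0 * tau * theta_hat
grad_norm = np.linalg.norm(grad)

# Strict convexity: every perturbed point should have a larger objective.
base = ridge_objective(X, Y, theta_hat, tau)
perturbed = [
    ridge_objective(X, Y, theta_hat + 0.1 * rng.normal(size=d), tau)
    for _ in range(100)
]
is_minimum = all(value > base for value in perturbed)

print("gradient norm at theta_hat:", grad_norm)
print("all perturbations worse:", is_minimum)
```

Using `np.linalg.solve` rather than forming the inverse explicitly is a numerical design choice; it computes the same estimator.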
Problem 2.1(b): The bias of the ridge estimator is bounded by |θ*|₂.
In the fixed design model Y = Xθ* + ε with E[ε] = 0, the matrix X is deterministic, so E[θ̂] = (XᵀX + nτI)⁻¹ Xᵀ E[Y] = (XᵀX + nτI)⁻¹ XᵀX θ*. Hence bias = E[θ̂] - θ* = ((XᵀX + nτI)⁻¹ XᵀX - I) θ* = -nτ (XᵀX + nτI)⁻¹ θ*.
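The bias satisfies ‖E[θ̂] - θ*‖₂ ≤ ‖θ*‖₂. To see this, let λ₁, …, λ_d ≥ 0 be the eigenvalues of XᵀX. The matrix (XᵀX + nτI)⁻¹ XᵀX is symmetric, because both factors are diagonalized by the same orthonormal basis, and its eigenvalues are λᵢ/(λᵢ + nτ) ∈ [0,1). So I - (XᵀX + nτI)⁻¹ XᵀX is symmetric with eigenvalues nτ/(λᵢ + nτ) ∈ (0,1]. Its operator norm is therefore at most 1, which gives ‖E[θ̂] - θ*‖₂ = ‖(I - (XᵀX + nτI)⁻¹ XᵀX) θ*‖₂ ≤ ‖θ*‖₂.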
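The bias bound can also be checked numerically. The following is a minimal sketch under assumed values (a random design with n = 40, d = 4, τ = 0.5, and an arbitrary θ*), none of which come from the problem statement. It computes the bias from the formula above, compares it with the equivalent form -nτ (XᵀX + nτI)⁻¹ θ*, and confirms that the eigenvalues nτ/(λᵢ + nτ) lie in (0,1].

```python
import numpy as np

# Illustrative fixed design and true parameter; all values are assumptions.
rng = np.random.default_rng(1)
n, d, tau = 40, 4, 0.5
X = rng.normal(size=(n, d))
theta_star = rng.normal(size=d)

G = X.T @ X                     # Gram matrix X^T X
A = G + n * tau * np.eye(d)     # X^T X + n*tau*I

# Bias E[theta_hat] - theta* = (A^{-1} G - I) theta*.
bias = np.linalg.solve(A, G @ theta_star) - theta_star

# Equivalent form: -n*tau * A^{-1} theta*.
bias_alt = -n * tau * np.linalg.solve(A, theta_star)

# Eigenvalues of I - A^{-1} G are n*tau / (lambda_i + n*tau).
lam = np.linalg.eigvalsh(G)
shrink = n * tau / (lam + n * tau)

bias_norm = np.linalg.norm(bias)
bound = np.linalg.norm(theta_star)

print("two bias formulas agree:", np.allclose(bias, bias_alt))
print("eigenvalues of I - A^{-1}G:", shrink)
print("bias norm <= |theta*|_2:", bias_norm <= bound)
```

In this sketch the eigenvalues move toward 1 as τ grows, so the bias norm approaches ‖θ*‖₂, and they move toward 0 as τ shrinks, which matches the usual reading of τ as a shrinkage parameter.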