Atlas.HighDimensionalStatistics.code.Chapter4.Problem_4

Local Definitions #

Squared Frobenius norm: ‖M‖_F² = Σᵢⱼ Mᵢⱼ².

Instances For

noncomputable def Chapter4.Problem42.opNorm {m p : ℕ} (A : Matrix (Fin m) (Fin p) ℝ) :

Operator norm of a matrix (abstract placeholder). In the book this is the largest singular value ‖A‖_op = σ₁(A).

Instances For

theorem Chapter4.Problem42.problem_4_2_part1 {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n d T : ℕ} (hn : 0 < ↑n) (X : Matrix (Fin n) (Fin d) ℝ) (Θstar : Matrix (Fin d) (Fin T) ℝ) (σ : ℝ) (hσ : 0 < σ) (E Y : Ω → Matrix (Fin n) (Fin T) ℝ) (hY : ∀ (ω : Ω), Y ω = X * Θstar + E ω) (Mhat : ℝ → Ω → Matrix (Fin n) (Fin T) ℝ) (τ₀ : ℝ) (hτ₀_pos : 0 < τ₀) (hτ₀_bound : τ₀ ^ 2 ≤ σ ^ 2 * ↑(max d T)) (hLemma42 : μ {ω : Ω | opNorm (E ω) ≤ τ₀} ≥ ENNReal.ofReal 0.99) (C₀ : ℝ) (hC₀ : 0 < C₀) (hThm43 : ∀ (ω : Ω), opNorm (E ω) ≤ τ₀ → frobSq (Mhat τ₀ ω - X * Θstar) ≤ C₀ * ↑Θstar.rank * τ₀ ^ 2) (hMeas : MeasurableSet {ω : Ω | opNorm (E ω) ≤ τ₀}) :

∃ (C : ℝ) (τ : ℝ), 0 < C ∧ 0 < τ ∧ μ {ω : Ω | 1 / ↑n * frobSq (Mhat τ ω - X * Θstar) ≤ C * σ ^ 2 * ↑Θstar.rank * ↑(max d T) / ↑n} ≥ ENNReal.ofReal 0.99

Problem 4.2, Part 1 — SVD thresholding bound with probability ≥ 0.99.

The proof combines Lemma 4.2 (operator norm bound on noise) with Theorem 4.3 (deterministic SVT approximation bound). Given:

τ₀ is a threshold with τ₀² ≤ σ²(d ∨ T)
Lemma 4.2: μ {ω | ‖E(ω)‖_op ≤ τ₀} ≥ 0.99
Theorem 4.3 (deterministic): when ‖E(ω)‖_op ≤ τ₀, ‖M̂(τ₀,ω) − XΘ*‖_F² ≤ C₀ · rank(Θ*) · τ₀²

We combine these to show the probabilistic MSE bound.

theorem Chapter4.Problem42.problem_4_2_part2 {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n d T : ℕ} (hn : 0 < ↑n) (X : Matrix (Fin n) (Fin d) ℝ) (Θstar : Matrix (Fin d) (Fin T) ℝ) (σ : ℝ) (hσ : 0 < σ) (E Y : Ω → Matrix (Fin n) (Fin T) ℝ) (hY : ∀ (ω : Ω), Y ω = X * Θstar + E ω) (Mhat : ℝ → Ω → Matrix (Fin n) (Fin T) ℝ) (P : Matrix (Fin n) (Fin n) ℝ) (hP_proj : P * P = P) (hP_col : ∀ (Θ : Matrix (Fin d) (Fin T) ℝ), P * (X * Θ) = X * Θ) (hP_contract : ∀ (A : Matrix (Fin n) (Fin T) ℝ), frobSq (P * A - X * Θstar) ≤ frobSq (A - X * Θstar)) (ThetaHat : ℝ → Ω → Matrix (Fin d) (Fin T) ℝ) (hPM : ∀ (τ : ℝ) (ω : Ω), P * Mhat τ ω = X * ThetaHat τ ω) (τ₀ : ℝ) (hτ₀_pos : 0 < τ₀) (hτ₀_bound : τ₀ ^ 2 ≤ σ ^ 2 * ↑(max d T)) (hLemma42 : μ {ω : Ω | opNorm (E ω) ≤ τ₀} ≥ ENNReal.ofReal 0.99) (C₀ : ℝ) (hC₀ : 0 < C₀) (hThm43 : ∀ (ω : Ω), opNorm (E ω) ≤ τ₀ → frobSq (Mhat τ₀ ω - X * Θstar) ≤ C₀ * ↑Θstar.rank * τ₀ ^ 2) (hMeas : MeasurableSet {ω : Ω | opNorm (E ω) ≤ τ₀}) :

∃ (Q : Matrix (Fin n) (Fin n) ℝ) (Th : ℝ → Ω → Matrix (Fin d) (Fin T) ℝ), (∀ (τ : ℝ) (ω : Ω), Q * Mhat τ ω = X * Th τ ω) ∧ ∃ (C : ℝ) (τ : ℝ), 0 < C ∧ 0 < τ ∧ μ {ω : Ω | 1 / ↑n * frobSq (X * Th τ ω - X * Θstar) ≤ C * σ ^ 2 * ↑Θstar.rank * ↑(max d T) / ↑n} ≥ ENNReal.ofReal 0.99

Problem 4.2, Part 2 — There exists an n×n projection matrix P onto col(X) such that PM̂ = XΘ̂ for some Θ̂, and the same MSE rate holds.

The key ideas:

P is the orthogonal projector onto the column space of X (so P² = P, P(XΘ) = XΘ)
P is a contraction in Frobenius norm: ‖PA − PB‖_F ≤ ‖A − B‖_F
PM̂ lies in col(X), so PM̂ = XΘ̂ for some Θ̂
The MSE of XΘ̂ = PM̂ is no worse than the MSE of M̂ by the contraction property