Documentation

Atlas.HighDimensionalStatistics.code.Chapter4.Thm_4_3

Frobenius norm setup #

noncomputable def SingularValueThresholding.frobeniusNormSq {d T : } (A : Matrix (Fin d) (Fin T) ) :

The squared Frobenius norm of a matrix.

Instances For

    Algebraic helpers for singular value thresholding #

    theorem SingularValueThresholding.svt_S_implies_large {σ σhat τ : } (hperturb : |σhat - σ| τ) (hj : |σhat| > 2 * τ) :
    |σ| > τ

    If |σ̂| > 2τ and |σ̂ - σ| ≤ τ then |σ| > τ. This shows S ⊂ {j : |λ_j| > τ} (consequence of Weyl's inequality).

    theorem SingularValueThresholding.svt_Sc_implies_small {σ σhat τ : } (hperturb : |σhat - σ| τ) (hj : |σhat| 2 * τ) :
    |σ| 3 * τ

    If |σ̂| ≤ 2τ and |σ̂ - σ| ≤ τ then |σ| ≤ 3τ. This shows S^c ⊂ {j : |λ_j| ≤ 3τ}.

    theorem SingularValueThresholding.svt_term_bound {a b τ : } ( : 0 τ) (hab : |a - b| τ) :
    ((if |a| > 2 * τ then a else 0) - b) ^ 2 if b 0 then 9 * τ ^ 2 else 0

    Pointwise SVT error bound. For each singular value, the thresholding error ((thresholded σ̂) - σ)² is at most 9τ² if σ ≠ 0, and 0 if σ = 0.

    The four cases:

    • S ∩ supp(Θ*): |σ̂| > 2τ, σ ≠ 0 → (σ̂ - σ)² ≤ τ² ≤ 9τ²
    • S ∩ supp^c: |σ̂| > 2τ, σ = 0 → impossible (|σ̂| = |σ̂ - 0| ≤ τ < 2τ)
    • S^c ∩ supp: |σ̂| ≤ 2τ, σ ≠ 0 → σ² ≤ (3τ)² = 9τ²
    • S^c ∩ supp^c: |σ̂| ≤ 2τ, σ = 0 → (0 - 0)² = 0

    Core algebraic bound #

    theorem SingularValueThresholding.svt_singular_value_bound {r : } (σ σhat : Fin r) (τ : ) ( : 0 τ) (hperturb : ∀ (j : Fin r), |σhat j - σ j| τ) :
    j : Fin r, ((if |σhat j| > 2 * τ then σhat j else 0) - σ j) ^ 2 9 * {j : Fin r | σ j 0}.card * τ ^ 2

    Core algebraic bound for SVT (constant 9). Given singular value sequences σ (true) and σ̂ (observed) with pointwise perturbation |σ̂_j - σ_j| ≤ τ, the sum of squared SVT errors satisfies:

    Σⱼ ((thresholded σ̂_j) - σ_j)² ≤ 9 · #{j : σ_j ≠ 0} · τ²

    The constant 9 = 3² comes from the S^c ∩ supp bound where |σ_j| ≤ 3τ.

    Theorem 4.3: SVT Frobenius error bound #

    theorem SingularValueThresholding.theorem_4_3_svt_bound {r : } (σ σhat : Fin r) (τ : ) ( : 0 τ) (hperturb : ∀ (j : Fin r), |σhat j - σ j| τ) (rankΘstar : ) (hrank : {j : Fin r | σ j 0}.card rankΘstar) :
    j : Fin r, ((if |σhat j| > 2 * τ then σhat j else 0) - σ j) ^ 2 144 * rankΘstar * τ ^ 2

    Theorem 4.3 (algebraic form, constant 144). Given singular value sequences σ (true, from Θ*) and σ̂ (observed, from y = Θ* + F) with Weyl's perturbation bound |σ̂_j - σ_j| ≤ τ, the SVT estimator with threshold 2τ satisfies:

    Σⱼ ((thresholded σ̂_j) - σ_j)² ≤ 144 · rank(Θ*) · τ²

    where rank(Θ*) ≥ #{j : σ_j ≠ 0}.

    This is the bound stated in Theorem 4.3 of Rigollet. The constant 144 dominates the tighter constant 9 obtained from the direct pointwise analysis; in the book, the 144 arises from the matrix-level decomposition via the oracle Θ̄ and the inequality ‖A‖_F² ≤ rank(A) · ‖A‖_op².

    theorem SingularValueThresholding.theorem_4_3_inequality {d T : } {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (model : SubGaussianMatrixModel Ω d T μ) (τ : ) ( : 0 τ) (S_star : SVD d T) (_hS_star : S_star.IsDecompOf model.Θstar) (S_obs : SVD d T) (hr_eq : S_obs.r = S_star.r) (hWeyl : ∀ (j : Fin S_star.r), |S_obs.σval (Fin.cast j) - S_star.σval j| τ) (ω : Ω) (_hω : matrixOpNorm (model.F ω) τ) (_hobs : S_obs.IsDecompOf (model.observed ω)) :
    j : Fin S_star.r, ((if |S_obs.σval (Fin.cast j)| > 2 * τ then S_obs.σval (Fin.cast j) else 0) - S_star.σval j) ^ 2 144 * {j : Fin S_star.r | S_star.σval j 0}.card * τ ^ 2

    Theorem 4.3 (matrix-level statement). Consider the sub-Gaussian matrix model (4.2) with ORT assumption. Given:

    • SVD decompositions S_star of Θ* and S_obs of y = Θ* + F
    • Weyl's perturbation bound: |σ̂_j - σ_j| ≤ τ for all j (which follows from ‖F‖_op ≤ τ, guaranteed by Lemma 4.2 with probability 1-δ)

    The SVT estimator Θ̂^SVT = S_obs.svtMatrix τ satisfies the singular value bound:

    Σⱼ ((thresholded σ̂_j) - σ_j)² ≤ 144 · rank_bound · τ²

    Under orthonormal SVD vectors, the left-hand side equals ‖Θ̂^SVT - Θ*‖_F².

    The threshold τ from equation (4.3) is: τ = 4σ√(log(12)(d∨T)/n) + 2σ√(2log(1/δ)/n)

    giving the rate: σ² · rank(Θ*) · (d∨T + log(1/δ))/n.

    theorem SingularValueThresholding.theorem_4_3_rate {r : } (σ σhat : Fin r) (τ : ) ( : 0 τ) (hperturb : ∀ (j : Fin r), |σhat j - σ j| τ) {σsq : } (_hσ : 0 σsq) {d_max_T : } (hτ_bound : τ ^ 2 σsq * (d_max_T + 1)) (rankΘstar : ) (hrank : {j : Fin r | σ j 0}.card rankΘstar) :
    j : Fin r, ((if |σhat j| > 2 * τ then σhat j else 0) - σ j) ^ 2 144 * rankΘstar * (σsq * (d_max_T + 1))

    Theorem 4.3 (rate form). The SVT bound combined with the threshold choice (4.3) gives the rate:

    Σⱼ ((thresholded σ̂_j) - σ_j)² ≤ 144 · rank(Θ*) · τ²

    where τ² ≲ σ² · (d ∨ T + log(1/δ)) / n, yielding

    ‖Θ̂^SVT - Θ*‖_F² ≲ σ² · rank(Θ*) · (d ∨ T + log(1/δ)) / n

    Frobenius norm identities #

    theorem SingularValueThresholding.frobenius_norm_sq_eq_sum_entries {d T : } (A : Matrix (Fin d) (Fin T) ) :
    A ^ 2 = i : Fin d, j : Fin T, A i j ^ 2

    Bridge lemma: the squared Frobenius norm equals the entry-wise sum of squares.

    theorem SingularValueThresholding.frobenius_norm_sq_eq_sum_sq_sv_diff {d T r : } (u : Fin rFin d) (v : Fin rFin T) (hu_ortho : ∀ (i j : Fin r), u i ⬝ᵥ u j = if i = j then 1 else 0) (hv_ortho : ∀ (i j : Fin r), v i ⬝ᵥ v j = if i = j then 1 else 0) (α β : Fin r) :
    frobeniusNormSq (j : Fin r, α j Matrix.vecMulVec (u j) (v j) - j : Fin r, β j Matrix.vecMulVec (u j) (v j)) = j : Fin r, (α j - β j) ^ 2

    Frobenius norm and singular values (standard identity). For an orthonormal SVD A = Σⱼ σⱼ uⱼ vⱼᵀ with orthonormal {uⱼ} and {vⱼ}, and a matrix B = Σⱼ σ'ⱼ uⱼ vⱼᵀ expressed in the same orthonormal basis, the squared Frobenius norm of the difference equals the sum of squared differences of the coefficients:

    ‖A - B‖_F² = Σⱼ (σⱼ - σ'ⱼ)²

    This is the standard Parseval-type identity for orthonormal expansions. In the SVT context, both Θ̂^SVT and Θ* are expressed in the SVD basis of y, so this identity connects the algebraic singular-value bound to the Frobenius norm statement in the book.

    The proof requires showing that orthonormal rank-1 matrices {uⱼ vⱼᵀ} form an orthonormal system in the Frobenius inner product space. This is a standard linear algebra fact not directly available in Mathlib as a single lemma.

    theorem SingularValueThresholding.svt_frobenius_eq_sv_sum {d T : } (S_star S_obs : SVD d T) (hr_eq : S_obs.r = S_star.r) (hu_ortho : ∀ (i j : Fin S_star.r), S_obs.u (Fin.cast i) ⬝ᵥ S_obs.u (Fin.cast j) = if i = j then 1 else 0) (hv_ortho : ∀ (i j : Fin S_star.r), S_obs.v (Fin.cast i) ⬝ᵥ S_obs.v (Fin.cast j) = if i = j then 1 else 0) (hstar_basis : S_star.toMatrix = j : Fin S_star.r, S_star.σval j Matrix.vecMulVec (S_obs.u (Fin.cast j)) (S_obs.v (Fin.cast j))) (τ : ) :
    frobeniusNormSq (S_obs.svtMatrix τ - S_star.toMatrix) = j : Fin S_star.r, ((if |S_obs.σval (Fin.cast j)| > 2 * τ then S_obs.σval (Fin.cast j) else 0) - S_star.σval j) ^ 2

    Corollary: SVT Frobenius norm equals singular value sum. For an SVD decomposition with orthonormal singular vectors, the squared Frobenius norm ‖Θ̂^SVT - Θ*‖_F² equals Σⱼ ((thresholded σ̂ⱼ) - σⱼ)². This bridges the algebraic bound and the Frobenius norm statement.

    theorem SingularValueThresholding.frobenius_norm_sq_eq_sum {d T : } (A : Matrix (Fin d) (Fin T) ) :
    A ^ 2 = i : Fin d, j : Fin T, A i j ^ 2
    theorem SingularValueThresholding.ort_mulVec_norm_sq {d n : } (X : Matrix (Fin n) (Fin d) ) (hORT : X.transpose * X = n 1) (a : Fin d) :
    i : Fin n, X.mulVec a i ^ 2 = n * k : Fin d, a k ^ 2
    theorem SingularValueThresholding.ort_frobenius_equality {d T n : } (X : Matrix (Fin n) (Fin d) ) (hORT : X.transpose * X = n 1) (hn : n 0) (Θhat Θstar : Matrix (Fin d) (Fin T) ) :
    1 / n * frobeniusNormSq (X * Θhat - X * Θstar) = frobeniusNormSq (Θhat - Θstar)

    ORT equality (Assumption ORT). Under the orthonormal design assumption 𝕏ᵀ𝕏/n = I_d, we have for any estimator Θ̂:

    (1/n) ‖𝕏Θ̂ - 𝕏Θ*‖_F² = ‖Θ̂ - Θ*‖_F²

    This follows because 𝕏(Θ̂ - Θ*) has Frobenius norm: ‖𝕏(Θ̂-Θ*)‖_F² = tr((Θ̂-Θ*)ᵀ 𝕏ᵀ𝕏 (Θ̂-Θ*)) = n · tr((Θ̂-Θ*)ᵀ(Θ̂-Θ*)) = n · ‖Θ̂-Θ*‖_F²

    In the book, this is the identity stated in the first equality of Theorem 4.3. The ORT assumption reduces the prediction error to the estimation error.

    Theorem 4.3: Frobenius norm form (book statement) #

    theorem SingularValueThresholding.theorem_4_3_frobenius {d T : } {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (model : SubGaussianMatrixModel Ω d T μ) (τ : ) ( : 0 τ) (S_star : SVD d T) (_hS_star : S_star.IsDecompOf model.Θstar) (S_obs : SVD d T) (hr_eq : S_obs.r = S_star.r) (hWeyl : ∀ (j : Fin S_star.r), |S_obs.σval (Fin.cast j) - S_star.σval j| τ) (hu_ortho : ∀ (i j : Fin S_star.r), S_obs.u (Fin.cast i) ⬝ᵥ S_obs.u (Fin.cast j) = if i = j then 1 else 0) (hv_ortho : ∀ (i j : Fin S_star.r), S_obs.v (Fin.cast i) ⬝ᵥ S_obs.v (Fin.cast j) = if i = j then 1 else 0) (hstar_basis : S_star.toMatrix = j : Fin S_star.r, S_star.σval j Matrix.vecMulVec (S_obs.u (Fin.cast j)) (S_obs.v (Fin.cast j))) (ω : Ω) (_hω : matrixOpNorm (model.F ω) τ) (_hobs : S_obs.IsDecompOf (model.observed ω)) :
    frobeniusNormSq (S_obs.svtMatrix τ - S_star.toMatrix) 144 * {j : Fin S_star.r | S_star.σval j 0}.card * τ ^ 2

    Theorem 4.3 (Frobenius norm form). Under the sub-Gaussian matrix model (4.2) with ORT assumption, the SVT estimator satisfies:

    ‖Θ̂^SVT - Θ*‖_F² ≤ 144 · rank(Θ*) · τ²

    This is exactly the inequality stated in Theorem 4.3 of Rigollet's book. The proof combines:

    1. The SVT singular value bound (theorem_4_3_svt_bound)
    2. The Frobenius norm identity (svt_frobenius_eq_sv_sum)

    The first equality in the book's statement, (1/n)‖𝕏Θ̂^SVT - 𝕏Θ*‖_F² = ‖Θ̂^SVT - Θ*‖_F², follows from ort_frobenius_equality.

    theorem SingularValueThresholding.theorem_4_3 {d T n : } {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (model : SubGaussianMatrixModel Ω d T μ) (τ : ) ( : 0 τ) (S_star : SVD d T) (hS_star : S_star.IsDecompOf model.Θstar) (S_obs : SVD d T) (hr_eq : S_obs.r = S_star.r) (hWeyl : ∀ (j : Fin S_star.r), |S_obs.σval (Fin.cast j) - S_star.σval j| τ) (hu_ortho : ∀ (i j : Fin S_star.r), S_obs.u (Fin.cast i) ⬝ᵥ S_obs.u (Fin.cast j) = if i = j then 1 else 0) (hv_ortho : ∀ (i j : Fin S_star.r), S_obs.v (Fin.cast i) ⬝ᵥ S_obs.v (Fin.cast j) = if i = j then 1 else 0) (hstar_basis : S_star.toMatrix = j : Fin S_star.r, S_star.σval j Matrix.vecMulVec (S_obs.u (Fin.cast j)) (S_obs.v (Fin.cast j))) (ω : Ω) ( : matrixOpNorm (model.F ω) τ) (hobs : S_obs.IsDecompOf (model.observed ω)) (X : Matrix (Fin n) (Fin d) ) (hORT : X.transpose * X = n 1) (hn_pos : 0 < n) :
    1 / n * frobeniusNormSq (X * S_obs.svtMatrix τ - X * S_star.toMatrix) = frobeniusNormSq (S_obs.svtMatrix τ - S_star.toMatrix) frobeniusNormSq (S_obs.svtMatrix τ - S_star.toMatrix) 144 * {j : Fin S_star.r | S_star.σval j 0}.card * τ ^ 2

    Theorem 4.3 (Rigollet). Under the sub-Gaussian matrix model (4.2) with ORT assumption 𝕏ᵀ𝕏 = n·I_d, the SVT estimator with threshold 2τ satisfies:

    (1/n)‖𝕏Θ̂ˢⱽᵀ - 𝕏Θ*‖²_F = ‖Θ̂ˢⱽᵀ - Θ*‖²_F ≤ 144·rank(Θ*)·τ²

    This bundles both parts of the book statement:

    1. The ORT equality (1/n)‖𝕏Θ̂ - 𝕏Θ*‖²_F = ‖Θ̂ - Θ*‖²_F
    2. The Frobenius norm bound ‖Θ̂ˢⱽᵀ - Θ*‖²_F ≤ 144·rank(Θ*)·τ²

    The proof composes ort_frobenius_equality (the ORT identity) with theorem_4_3_frobenius (the Frobenius norm bound).

    Probabilistic wrapper (Theorem 4.3 full statement) #

    noncomputable def SingularValueThresholding.svtThreshold (σ : ) (n d T : ) (δ : ) :

    The SVT threshold from equation (4.3): τ = 4σ√(log(12)·(d∨T)/n) + 2σ√(2·log(1/δ)/n)

    Instances For
      theorem SingularValueThresholding.theorem_4_3_probabilistic {d T : } {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (model : SubGaussianMatrixModel Ω d T μ) (n : ) (σ δ : ) (_hδ_pos : 0 < δ) (_hδ_lt : δ < 1) (S_star : SVD d T) (_hS_star : S_star.IsDecompOf model.Θstar) (S_obs : ΩSVD d T) (hr_eq : ∀ (ω : Ω), (S_obs ω).r = S_star.r) (_hobs : ∀ (ω : Ω), (S_obs ω).IsDecompOf (model.observed ω)) (hWeyl : ∀ (ω : Ω), matrixOpNorm (model.F ω) svtThreshold σ n d T δ∀ (j : Fin S_star.r), |(S_obs ω).σval (Fin.cast j) - S_star.σval j| svtThreshold σ n d T δ) (hConc : μ {ω : Ω | matrixOpNorm (model.F ω) svtThreshold σ n d T δ} ENNReal.ofReal (1 - δ)) :
      μ {ω : Ω | j : Fin S_star.r, ((if |(S_obs ω).σval (Fin.cast j)| > 2 * svtThreshold σ n d T δ then (S_obs ω).σval (Fin.cast j) else 0) - S_star.σval j) ^ 2 144 * {j : Fin S_star.r | S_star.σval j 0}.card * svtThreshold σ n d T δ ^ 2} ENNReal.ofReal (1 - δ)

      Theorem 4.3 (full probabilistic statement). Under the sub-Gaussian matrix model (4.2), the SVT estimator with threshold τ = svtThreshold σ n d T δ satisfies

      Σⱼ ((thresholded σ̂_j) - σ_j)² ≤ 144 · rank_bound · τ²

      with probability at least 1 - δ. This wraps theorem_4_3 with the concentration event from Lemma 4.2.