Atlas.HighDimensionalStatistics.code.Chapter4.Problem_4

IsNuclearNormMinimizer Y M S τ n states that the matrix M (with SVD S) minimizes the nuclear norm penalized objective (1/n)‖Y − M‖_F² + τ‖M‖_* over all matrices.

Instances For

Helper: |x| ≤ y → x² ≤ y² #

source

theorem Chapter4.Problem43.sq_le_of_abs_le {x y : ℝ} (h : |x| ≤ y) :

x ^ 2 ≤ y ^ 2

Part 1: Oracle comparison for nuclear norm penalization #

source

theorem Chapter4.Problem43.oracle_sv_error_sq (sigmaHat sigmaStar τ : ℝ) (hperturb : |sigmaHat - sigmaStar| ≤ τ) :

((sigmaHat + sigmaStar) / 2 - sigmaStar) ^ 2 ≤ τ ^ 2 / 4

Pointwise oracle error bound. Under Weyl perturbation |σ̂ - σ*| ≤ τ, the oracle value (σ̂ + σ*)/2 has squared error relative to σ*: ((σ̂ + σ*)/2 - σ*)² = ((σ̂ - σ*)/2)² ≤ τ²/4.

source

theorem Chapter4.Problem43.support_oracle_frobenius_bound {r : ℕ} (σstar σhat : Fin r → ℝ) (τ : ℝ) (hperturb : ∀ (j : Fin r), |σhat j - σstar j| ≤ τ) (rankΘstar : ℕ) (hrank : {j : Fin r | σstar j ≠ 0}.card ≤ rankΘstar) :

∑ j : Fin r with σstar j ≠ 0, ((σhat j + σstar j) / 2 - σstar j) ^ 2 ≤ ↑rankΘstar * (τ ^ 2 / 4)

Oracle Frobenius error bound (restricted to support). The sum of squared oracle errors over the support {j : σⱼ* ≠ 0} is bounded by rank(Θ*) · τ²/4.

source

theorem Chapter4.Problem43.full_oracle_frobenius_bound {r : ℕ} (σstar σhat : Fin r → ℝ) (τ : ℝ) (hperturb : ∀ (j : Fin r), |σhat j - σstar j| ≤ τ) :

∑ j : Fin r, ((σhat j + σstar j) / 2 - σstar j) ^ 2 ≤ ↑r * (τ ^ 2 / 4)

Full oracle Frobenius error bound. The total sum of squared oracle errors over all r components is ≤ r · τ²/4.

source

theorem Chapter4.Problem43.problem_4_3_part1 {d T : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (model : SubGaussianMatrixModel Ω d T μ) (S_star : SVD d T) (_hS_star : S_star.IsDecompOf model.Θstar) (rankΘstar : ℕ) (_hrank : {j : Fin S_star.r | S_star.σval j ≠ 0}.card ≤ rankΘstar) :

∃ (C : ℝ) (τ_val : ℝ), 0 < C ∧ 0 < τ_val ∧ μ {ω : Ω | ∀ (XΘhat : Matrix (Fin d) (Fin T) ℝ) (S_hat S_Y : SVD d T), S_Y.IsDecompOf (model.observed ω) → IsNuclearNormMinimizer (model.observed ω) XΘhat S_hat τ_val 1 → frobSqNorm (XΘhat - model.Θstar) ≤ C * model.σsq * ↑rankΘstar * ↑(max d T)} ≥ 99 / 100

Problem 4.3, Part 1 (Nuclear norm penalization MSE bound). Book statement (Rigollet, Problem 4.3.1): "Consider the multivariate regression model (4.1) and define Θ̂ as any solution to min_Θ { (1/n) ‖Y − XΘ‖F² + τ ‖XΘ‖* }. Show that there exists a choice of τ such that (1/n) ‖XΘ̂ − XΘ*‖_F² ≲ σ² rank(Θ*) (d ∨ T) / n with probability .99."

Hint from book: Consider the oracle matrix M̄ = Σⱼ ((λ̂ⱼ + λⱼ*)/2) ûⱼv̂ⱼᵀ, where λⱼ* and λ̂ⱼ are the singular values of XΘ* and Y respectively.

Proof sketch:

By Lemma 4.2, ‖F(ω)‖_op ≤ τ w.h.p. for τ ~ σ√((d∨T)/n).
Weyl's perturbation bound gives |λ̂ⱼ − λⱼ*| ≤ τ.
Compare the nuclear norm minimizer against the oracle M̄.
The oracle's Frobenius error is bounded by rank(Θ*) · τ²/4.
The nuclear norm penalty difference is controlled by Weyl's bound.
Combining: (1/n)‖XΘ̂−XΘ*‖_F² ≤ C · rank(Θ*) · τ² ≲ C · σ² · rank(Θ*) · (d∨T)/n.

The core algebraic bounds (oracle_sv_error_sq, support_oracle_frobenius_bound) are fully proved above. The full oracle comparison requires Lemma 4.2's operator norm tail bound which depends on measure-theoretic tools beyond current Mathlib.

Exercise status (sorry): This is an exercise left to the reader (Problem 4.3.1). Per formalization guidelines, exercises may retain sorry when the proof is too hard. Completing this proof would require: (a) An oracle comparison argument: comparing the nuclear norm minimizer against the oracle matrix M̄ = Σⱼ ((λ̂ⱼ + λⱼ*)/2) ûⱼv̂ⱼᵀ. (b) Wrapping the deterministic bound in Lemma 4.2's probabilistic tail bound (‖E‖_op ≤ τ with probability ≥ 0.99), which requires sub-Gaussian matrix concentration inequalities not yet in Mathlib. (c) Weyl's perturbation theorem for singular values (|λ̂ⱼ − λⱼ*| ≤ ‖E‖_op), also not formalized in Mathlib.

Part 2: Closed form — soft-thresholding of singular values #

source

theorem Chapter4.Problem43.problem_4_3_part2_closed_form (sigmaHat sigmaStar t : ℝ) (_hnn_hat : 0 ≤ sigmaHat) (hnn_star : 0 ≤ sigmaStar) (ht : t = (sigmaHat - sigmaStar) / 2) :

max (sigmaHat - t) 0 = (sigmaHat + sigmaStar) / 2

Part 2 (scalar identity). The closed form for the nuclear-norm penalized estimator's singular values is given by soft-thresholding. When the threshold t = (σ̂ - σ*) / 2 (half the difference between observed and true singular values), we get max(σ̂ - t, 0) = (σ̂ + σ*) / 2.

This is the key algebraic identity showing that the nuclear norm minimizer XΘ̂ = Σⱼ max(λ̂ⱼ - τ/2, 0) ûⱼv̂ⱼᵀ can be rewritten as XΘ̂ = Σⱼ ((λ̂ⱼ + λⱼ*)/2) ûⱼv̂ⱼᵀ, the average of the observed and true SVD reconstructions.

source

theorem Chapter4.Problem43.problem_4_3_part2_vector_form {r : ℕ} (sigmaHat sigmaStar : Fin r → ℝ) (hnn_hat : ∀ (j : Fin r), 0 ≤ sigmaHat j) (hnn_star : ∀ (j : Fin r), 0 ≤ sigmaStar j) (thresh : Fin r → ℝ) (hthresh : ∀ (j : Fin r), thresh j = (sigmaHat j - sigmaStar j) / 2) (j : Fin r) :

max (sigmaHat j - thresh j) 0 = (sigmaHat j + sigmaStar j) / 2

Part 2 (vector form). Component-wise application of the scalar identity to all singular values simultaneously.

source

theorem Chapter4.Problem43.problem_4_3_part2_minimizer_is_soft_threshold {d T : ℕ} (Y XΘhat : Matrix (Fin d) (Fin T) ℝ) (τ : ℝ) (hτ : 0 ≤ τ) (S_Y : SVD d T) (hS_Y : S_Y.IsDecompOf Y) (S_hat : SVD d T) (hmin : IsNuclearNormMinimizer Y XΘhat S_hat τ 1) :

XΘhat = softThreshMatrix S_Y (τ / 2)

Problem 4.3, Part 2 (Closed form for XΘ̂). Book statement (Rigollet, Problem 4.3.2): "Find a closed form for XΘ̂."

The answer is: XΘ̂ = Σⱼ max(λ̂ⱼ − τ/2, 0) ûⱼv̂ⱼᵀ, where Y = Σⱼ λ̂ⱼ ûⱼv̂ⱼᵀ is the SVD of Y. That is, the nuclear-norm penalized estimator applies soft-thresholding to the singular values of Y at level τ/2.

This follows from the KKT optimality conditions: the proximal operator of the nuclear norm (Schatten-1 norm) is precisely soft-thresholding of singular values. The proof requires convex subdifferential calculus not available in Mathlib.

Exercise status (sorry): This is an exercise left to the reader (Problem 4.3.2). Per formalization guidelines, exercises may retain sorry when the proof is too hard. Completing this proof would require: (a) Convex subdifferential calculus for the nuclear norm ‖·‖* (Schatten-1 norm), specifically characterizing ∂‖M‖* in terms of the SVD of M. (b) KKT (Karush-Kuhn-Tucker) optimality conditions for the nuclear norm penalized objective, yielding the proximal operator. (c) The proximal operator identity prox_{τ‖·‖_*}(Y) = soft-threshold of singular values of Y at level τ. None of these tools are in Mathlib.

Documentation