Documentation

Atlas.HighDimensionalStatistics.code.Chapter4.Problem_4_3

Frobenius norm squared #

The squared Frobenius norm of a matrix: ‖A‖_F² = Σᵢ Σⱼ (Aᵢⱼ)².

Instances For

    Nuclear norm (sum of singular values) #

    def Chapter4.Problem43.nuclearNorm {d T : } (S : SVD d T) :

    The nuclear norm (trace norm / Schatten-1 norm) of a matrix given its SVD, defined as the sum of singular values: ‖A‖_* = Σⱼ σⱼ.

    Instances For

      Soft-thresholding operator #

      Soft-thresholding: softThreshold x τ = max(x - τ, 0). Applied to singular values, this gives the proximal operator for the nuclear norm.

      Instances For
        def Chapter4.Problem43.softThreshMatrix {d T : } (S : SVD d T) (τ : ) :
        Matrix (Fin d) (Fin T)

        The soft-thresholded SVD matrix: Σⱼ max(σⱼ - τ, 0) · uⱼ · vⱼᵀ. This is the closed form of the nuclear norm proximal operator.

        Instances For

          Nuclear norm penalized objective and minimizer #

          def Chapter4.Problem43.IsNuclearNormMinimizer {d T : } (Y M : Matrix (Fin d) (Fin T) ) (S : SVD d T) (τ : ) (n : ) :

          IsNuclearNormMinimizer Y M S τ n states that the matrix M (with SVD S) minimizes the nuclear norm penalized objective (1/n)‖Y − M‖_F² + τ‖M‖_* over all matrices.

          Instances For

            Helper: |x| ≤ y → x² ≤ y² #

            theorem Chapter4.Problem43.sq_le_of_abs_le {x y : } (h : |x| y) :
            x ^ 2 y ^ 2

            Part 1: Oracle comparison for nuclear norm penalization #

            theorem Chapter4.Problem43.oracle_sv_error_sq (sigmaHat sigmaStar τ : ) (hperturb : |sigmaHat - sigmaStar| τ) :
            ((sigmaHat + sigmaStar) / 2 - sigmaStar) ^ 2 τ ^ 2 / 4

            Pointwise oracle error bound. Under Weyl perturbation |σ̂ - σ*| ≤ τ, the oracle value (σ̂ + σ*)/2 has squared error relative to σ*: ((σ̂ + σ*)/2 - σ*)² = ((σ̂ - σ*)/2)² ≤ τ²/4.

            theorem Chapter4.Problem43.support_oracle_frobenius_bound {r : } (σstar σhat : Fin r) (τ : ) (hperturb : ∀ (j : Fin r), |σhat j - σstar j| τ) (rankΘstar : ) (hrank : {j : Fin r | σstar j 0}.card rankΘstar) :
            j : Fin r with σstar j 0, ((σhat j + σstar j) / 2 - σstar j) ^ 2 rankΘstar * (τ ^ 2 / 4)

            Oracle Frobenius error bound (restricted to support). The sum of squared oracle errors over the support {j : σⱼ* ≠ 0} is bounded by rank(Θ*) · τ²/4.

            theorem Chapter4.Problem43.full_oracle_frobenius_bound {r : } (σstar σhat : Fin r) (τ : ) (hperturb : ∀ (j : Fin r), |σhat j - σstar j| τ) :
            j : Fin r, ((σhat j + σstar j) / 2 - σstar j) ^ 2 r * (τ ^ 2 / 4)

            Full oracle Frobenius error bound. The total sum of squared oracle errors over all r components is ≤ r · τ²/4.

            theorem Chapter4.Problem43.problem_4_3_part1 {d T : } {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (model : SubGaussianMatrixModel Ω d T μ) (S_star : SVD d T) (_hS_star : S_star.IsDecompOf model.Θstar) (rankΘstar : ) (_hrank : {j : Fin S_star.r | S_star.σval j 0}.card rankΘstar) :
            ∃ (C : ) (τ_val : ), 0 < C 0 < τ_val μ {ω : Ω | ∀ (XΘhat : Matrix (Fin d) (Fin T) ) (S_hat S_Y : SVD d T), S_Y.IsDecompOf (model.observed ω)IsNuclearNormMinimizer (model.observed ω) XΘhat S_hat τ_val 1frobSqNorm (XΘhat - model.Θstar) C * model.σsq * rankΘstar * (max d T)} 99 / 100

            Problem 4.3, Part 1 (Nuclear norm penalization MSE bound). Book statement (Rigollet, Problem 4.3.1): "Consider the multivariate regression model (4.1) and define Θ̂ as any solution to min_Θ { (1/n) ‖Y − XΘ‖F² + τ ‖XΘ‖* }. Show that there exists a choice of τ such that (1/n) ‖XΘ̂ − XΘ*‖_F² ≲ σ² rank(Θ*) (d ∨ T) / n with probability .99."

            Hint from book: Consider the oracle matrix M̄ = Σⱼ ((λ̂ⱼ + λⱼ*)/2) ûⱼv̂ⱼᵀ, where λⱼ* and λ̂ⱼ are the singular values of XΘ* and Y respectively.

            Proof sketch:

            1. By Lemma 4.2, ‖F(ω)‖_op ≤ τ w.h.p. for τ ~ σ√((d∨T)/n).
            2. Weyl's perturbation bound gives |λ̂ⱼ − λⱼ*| ≤ τ.
            3. Compare the nuclear norm minimizer against the oracle M̄.
            4. The oracle's Frobenius error is bounded by rank(Θ*) · τ²/4.
            5. The nuclear norm penalty difference is controlled by Weyl's bound.
            6. Combining: (1/n)‖XΘ̂−XΘ*‖_F² ≤ C · rank(Θ*) · τ² ≲ C · σ² · rank(Θ*) · (d∨T)/n.

            The core algebraic bounds (oracle_sv_error_sq, support_oracle_frobenius_bound) are fully proved above. The full oracle comparison requires Lemma 4.2's operator norm tail bound which depends on measure-theoretic tools beyond current Mathlib.

            Exercise status (sorry): This is an exercise left to the reader (Problem 4.3.1). Per formalization guidelines, exercises may retain sorry when the proof is too hard. Completing this proof would require: (a) An oracle comparison argument: comparing the nuclear norm minimizer against the oracle matrix M̄ = Σⱼ ((λ̂ⱼ + λⱼ*)/2) ûⱼv̂ⱼᵀ. (b) Wrapping the deterministic bound in Lemma 4.2's probabilistic tail bound (‖E‖_op ≤ τ with probability ≥ 0.99), which requires sub-Gaussian matrix concentration inequalities not yet in Mathlib. (c) Weyl's perturbation theorem for singular values (|λ̂ⱼ − λⱼ*| ≤ ‖E‖_op), also not formalized in Mathlib.

            Part 2: Closed form — soft-thresholding of singular values #

            theorem Chapter4.Problem43.problem_4_3_part2_closed_form (sigmaHat sigmaStar t : ) (_hnn_hat : 0 sigmaHat) (hnn_star : 0 sigmaStar) (ht : t = (sigmaHat - sigmaStar) / 2) :
            max (sigmaHat - t) 0 = (sigmaHat + sigmaStar) / 2

            Part 2 (scalar identity). The closed form for the nuclear-norm penalized estimator's singular values is given by soft-thresholding. When the threshold t = (σ̂ - σ*) / 2 (half the difference between observed and true singular values), we get max(σ̂ - t, 0) = (σ̂ + σ*) / 2.

            This is the key algebraic identity showing that the nuclear norm minimizer XΘ̂ = Σⱼ max(λ̂ⱼ - τ/2, 0) ûⱼv̂ⱼᵀ can be rewritten as XΘ̂ = Σⱼ ((λ̂ⱼ + λⱼ*)/2) ûⱼv̂ⱼᵀ, the average of the observed and true SVD reconstructions.

            theorem Chapter4.Problem43.problem_4_3_part2_vector_form {r : } (sigmaHat sigmaStar : Fin r) (hnn_hat : ∀ (j : Fin r), 0 sigmaHat j) (hnn_star : ∀ (j : Fin r), 0 sigmaStar j) (thresh : Fin r) (hthresh : ∀ (j : Fin r), thresh j = (sigmaHat j - sigmaStar j) / 2) (j : Fin r) :
            max (sigmaHat j - thresh j) 0 = (sigmaHat j + sigmaStar j) / 2

            Part 2 (vector form). Component-wise application of the scalar identity to all singular values simultaneously.

            theorem Chapter4.Problem43.problem_4_3_part2_minimizer_is_soft_threshold {d T : } (Y XΘhat : Matrix (Fin d) (Fin T) ) (τ : ) ( : 0 τ) (S_Y : SVD d T) (hS_Y : S_Y.IsDecompOf Y) (S_hat : SVD d T) (hmin : IsNuclearNormMinimizer Y XΘhat S_hat τ 1) :
            XΘhat = softThreshMatrix S_Y (τ / 2)

            Problem 4.3, Part 2 (Closed form for XΘ̂). Book statement (Rigollet, Problem 4.3.2): "Find a closed form for XΘ̂."

            The answer is: XΘ̂ = Σⱼ max(λ̂ⱼ − τ/2, 0) ûⱼv̂ⱼᵀ, where Y = Σⱼ λ̂ⱼ ûⱼv̂ⱼᵀ is the SVD of Y. That is, the nuclear-norm penalized estimator applies soft-thresholding to the singular values of Y at level τ/2.

            This follows from the KKT optimality conditions: the proximal operator of the nuclear norm (Schatten-1 norm) is precisely soft-thresholding of singular values. The proof requires convex subdifferential calculus not available in Mathlib.

            Exercise status (sorry): This is an exercise left to the reader (Problem 4.3.2). Per formalization guidelines, exercises may retain sorry when the proof is too hard. Completing this proof would require: (a) Convex subdifferential calculus for the nuclear norm ‖·‖* (Schatten-1 norm), specifically characterizing ∂‖M‖* in terms of the SVD of M. (b) KKT (Karush-Kuhn-Tucker) optimality conditions for the nuclear norm penalized objective, yielding the proximal operator. (c) The proximal operator identity prox_{τ‖·‖_*}(Y) = soft-threshold of singular values of Y at level τ. None of these tools are in Mathlib.