Schatten p-norm #
The Schatten p-norm of a matrix is the ℓ_p norm of its singular value vector. Special cases:
- p = 1: nuclear norm ‖A‖_* = Σⱼ σⱼ
- p = 2: Frobenius norm ‖A‖_F = (Σⱼ σⱼ²)^{1/2}
- p = ∞: operator norm ‖A‖_op = max_j σⱼ = σ₁
The nuclear norm (Schatten 1-norm): ‖A‖_* = Σⱼ σⱼ
Instances For
Helper lemmas for the Frobenius inner product #
Operator norm bound for bilinear forms #
The key ingredient: for unit vectors u, v, dotProduct u (B.mulVec v) ≤ ‖B‖_op by Cauchy-Schwarz and the operator norm bound.
Hölder's inequality for Schatten norms #
Hölder-Schatten inequalities #
The SVD structure in our formalization does not encode orthonormality of the singular vectors. The proofs below therefore include explicit hypotheses that the left/right singular vectors are L²-unit vectors. In a standard SVD, this is automatically satisfied.
The underlying proof strategy is:
- Substitute A = UΣVᵀ (the SVD decomposition)
- Distribute the Frobenius inner product over the SVD sum
- Bound each term σⱼ · ⟨uⱼ, B vⱼ⟩ using Cauchy-Schwarz + operator norm
- Sum to get ≤ (Σ σⱼ) · ‖B‖op = ‖A‖* · ‖B‖_op
References:
- Horn & Johnson, Matrix Analysis, 2nd ed., Theorem 7.4.1.1 (von Neumann)
- Bhatia, Matrix Analysis, Chapter IV, Theorem IV.2.5
- Rigollet, High Dimensional Statistics, Section 4.1 (stated without proof)
Helper infrastructure for the general Hölder-Schatten inequality #
Derive Real.HolderConjugate p q from the hypotheses 1 ≤ p, 1 ≤ q,
and 1/p + 1/q = 1.
Von Neumann's trace inequality.
For matrices A, B with SVDs S_A, S_B:
|⟨A, B⟩| ≤ Σⱼ σⱼ(A) · σⱼ(B)
where the singular values are extended to the common index Fin (min d T).
This fundamental result in matrix analysis is stated without proof in Rigollet's text and proved in:
- Horn & Johnson, Matrix Analysis, 2nd ed., Theorem 7.4.1.1
- Bhatia, Matrix Analysis, Chapter IV, Theorem IV.2.5
We introduce it as an axiom since the proof is not given in the textbook.
Hölder's inequality for Schatten norms (special case p=1, q=∞).
⟨A, B⟩ ≤ ‖A‖_* · ‖B‖_op
where ‖A‖_* is the nuclear (trace/Schatten-1) norm and ‖B‖_op is the operator (spectral/Schatten-∞) norm.
The hypotheses hu and hv assert that the SVD singular vectors are
L²-unit vectors, which is always the case for a proper SVD.
Proof: Let A = UΣVᵀ be the SVD. Then ⟨A, B⟩ = Σⱼ σⱼ · uⱼᵀBvⱼ ≤ Σⱼ σⱼ · ‖B‖op = ‖A‖* · ‖B‖_op, using uⱼᵀBvⱼ ≤ ‖B‖_op (by Cauchy-Schwarz and the operator norm bound).
Hölder's inequality for Schatten norms (general form).
For p, q ∈ [1, ∞] with 1/p + 1/q = 1: |⟨A, B⟩| ≤ ‖A‖_p · ‖B‖_q
where ‖·‖_p denotes the Schatten p-norm.
Proof sketch (not formalized): By von Neumann's trace inequality, |⟨A, B⟩| ≤ Σⱼ σⱼ(A) · σⱼ(B). Then apply the classical Hölder inequality for finite sequences to the singular value vectors: Σⱼ σⱼ(A) · σⱼ(B) ≤ (Σⱼ σⱼ(A)^p)^{1/p} · (Σⱼ σⱼ(B)^q)^{1/q} = ‖A‖_p · ‖B‖_q.
This general form uses von Neumann's trace inequality (introduced as a theorem since its proof is not given in the textbook) together with Mathlib's Hölder inequality for finite nonnegative sequences.