Atlas.HighDimensionalStatistics.code.Chapter4.Cor_4

Local definitions (inlined to avoid cross-file import chains) #

def Cor49.IsSpikedCovariance {d : ℕ} (S : Matrix (Fin d) (Fin d) ℝ) (θ : ℝ) (v : EuclideanSpace ℝ (Fin d)) :

Definition 4.7 (Spiked covariance model). A matrix S satisfies the spiked covariance model with parameter θ > 0 and spike v (a unit vector) if S = θ • (v · vᵀ) + I_d.

Instances For

source

noncomputable def Cor49.matrixOpNorm {d : ℕ} (A : Matrix (Fin d) (Fin d) ℝ) :

ℝ

The operator norm of a matrix: ‖A‖_op = sup_{‖x‖₂=1} ‖Ax‖₂, defined via the continuous linear map norm on Euclidean spaces (L2/spectral norm).

Instances For

source

noncomputable def Cor49.empiricalCovariance {d n : ℕ} {Ω : Type u_1} (X : Fin n → Ω → EuclideanSpace ℝ (Fin d)) (ω : Ω) :

Matrix (Fin d) (Fin d) ℝ

The empirical covariance matrix Σ̂ = (1/n) Σᵢ XᵢXᵢᵀ. Given n i.i.d. samples Xᵢ : Ω → ℝ^d (indexed by Fin n), this returns the d × d matrix (1/n) Σᵢ Xᵢ Xᵢᵀ at each ω.

Instances For

source

def Cor49.IsLargestEigenvector {d : ℕ} (M : Matrix (Fin d) (Fin d) ℝ) (w : EuclideanSpace ℝ (Fin d)) :

Prop

IsLargestEigenvector M w means w is a unit eigenvector of M corresponding to the largest eigenvalue: it has unit norm and maximizes the Rayleigh quotient wᵀ M w over all unit vectors.

Instances For

source

theorem Cor49.IsLargestEigenvector.norm_eq_one {d : ℕ} {M : Matrix (Fin d) (Fin d) ℝ} {w : EuclideanSpace ℝ (Fin d)} (h : IsLargestEigenvector M w) :

‖w‖ = 1

A largest eigenvector has unit norm.

source

def Cor49.IsSubGaussianVec {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) (X : Ω → EuclideanSpace ℝ (Fin d)) (σsq : ℝ) :

Prop

A random vector X : Ω → ℝ^d is sub-Gaussian with variance proxy σ² if for every unit vector u ∈ S^{d-1}, the scalar projection u⊤X is sub-Gaussian with variance proxy σ². Formally, this means the MGF of u⊤X is bounded: E[exp(s·u⊤X)] ≤ exp(σ²s²/2) for all s ∈ ℝ.

Instances For

source

noncomputable def Cor49.covRate (d₀ n : ℕ) (delta : ℝ) :

ℝ

The rate function appearing in the bound: max(√(t/n), t/n) where t = d + log(1/δ). This captures both the "slow" regime (t/n dominates) and the "fast" regime (√(t/n) dominates).

Instances For

Theorem 4.6 (operator norm concentration) #

The proof of Theorem 4.6 follows the book (Rigollet, §4.3):

WLOG reduction to Σ = I_d via whitening Y = Σ^{-1/2}X
ε-net argument (Lemma 4.2): ‖Σ̂_Y - I‖op ≤ 2 max{x,y∈N} |xᵀ(Σ̂_Y - I)y|
Polarization: (xᵀY)(yᵀY) = ((x+y)ᵀY)²/4 - ((x-y)ᵀY)²/4
Sub-exponential lifting (Lemma 1.12): products of sub-Gaussians → sub-exponential
Bernstein's inequality (Theorem 1.13): concentration for sub-exponential averages
Union bound over ≤ 144^d ε-net pairs → equation (4.7): P(‖Σ̂_Y - I‖_op > t) ≤ 144^d · exp[-n/2 · min((t/32)², t/32)]
Setting t = C · covRate d n δ to make failure probability ≤ δ.

The proof is structured in three layers:

Axiomatized sub-lemmas for the deepest sub-steps that reference Chapter 1 results and require Mathlib infrastructure not currently available (matrix square roots, ε-net covering numbers, sub-exponential Bernstein):
- eps_net_opnorm_reduction_L2: Lemmas 1.18 + 4.2
- per_pair_concentration_L2: Lemma 1.12 + Theorem 1.13
- whitened_vectors_exist_L2: Matrix square root whitening
Proved intermediate theorems built from the axiomatized sub-lemmas:
- eq_4_7_identity_L2: equation (4.7) via union bound
- whitening_reduction_L2: general → identity case reduction
Fully proved assembly (combining whitening with eq 4.7, then algebraic rate simplification).

source

noncomputable def Cor49.opNorm_concentration_C₁ :

ℝ

Universal constant from Theorem 4.6 (operator norm concentration).

Instances For

source

theorem Cor49.opNorm_concentration_C₁_pos :

0 < opNorm_concentration_C₁

The constant from Theorem 4.6 is positive.

Helper lemmas for sub-steps of Theorem 4.6 #

These theorems capture the intermediate results whose proofs are given in the book (Rigollet §4.3, using Lemmas 1.12, 1.18, 4.2, Theorem 1.13) but require Mathlib infrastructure not currently available:

ε-net covering numbers for the L2 unit sphere (Lemma 1.18)
ε-net reduction for operator norm (Lemma 4.2)
Sub-exponential moment bounds for products of sub-Gaussians (Lemma 1.12)
Bernstein's inequality for sub-exponential averages (Theorem 1.13)
Integration of matrix-transformed random vectors for the whitening step

Note: While Mathlib now has CFC.sqrt for matrix square roots (via the continuous functional calculus), the whitening step additionally requires showing that Σ^{-1/2}X preserves sub-Gaussian properties and transforms the covariance to identity, which involves Bochner integration of matrix-transformed random vectors — infrastructure not yet available.

These are axiomatized because the sub-steps require the above infrastructure.

Sub-lemmas for the proof of equation (4.7) #

These sub-lemmas correspond to the external references used in the proof of Theorem 4.6 in the book:

eps_net_opnorm_reduction_L2: Lemmas 1.18 + 4.2 (ε-net covering + reduction) — PROVED
per_pair_concentration_L2: Lemma 1.12 + Theorem 1.13 (sub-exponential Bernstein) — proved from helper axiom bernstein_bilinear_concentration
whitened_vectors_exist_L2: Whitening Y = Σ^{-1/2}X — proved from helper axiom matrix_square_root_whitening

The helper axioms correspond to specific external references in the book's proof:

bernstein_bilinear_concentration (axiom) captures the application of Lemma 1.12 + Theorem 1.13 from Chapter 1 (formalized in Rigollet.Chapter1.Lemma_1_12 and Rigollet.Chapter1.Thm_1_13) but requires type bridging infrastructure
matrix_square_root_whitening captures the spectral theorem application for constructing Σ^{-1/2} (standard linear algebra, not fully available in Mathlib)

source

theorem Cor49.eps_net_opnorm_reduction_L2 {d : ℕ} (hd : 0 < d) :

∃ (N : Finset (EuclideanSpace ℝ (Fin d))), ↑N.card ≤ 12 ^ d ∧ N.Nonempty ∧ (∀ x ∈ N, ‖x‖ ≤ 1) ∧ ∀ (A : Matrix (Fin d) (Fin d) ℝ) (t : ℝ), matrixOpNorm A > t → ∃ x ∈ N, ∃ y ∈ N, |(EuclideanSpace.equiv (Fin d) ℝ) x ⬝ᵥ A.mulVec ((EuclideanSpace.equiv (Fin d) ℝ) y)| > t / 2

ε-net reduction for L2/spectral operator norm (Lemmas 1.18 + 4.2). There exists a 1/4-net N for S^{d-1} with |N| ≤ 12^d such that ‖A‖_op > t implies some pair x,y ∈ N has |xᵀAy| > t/2.

Uses matrixOpNorm = ‖(toEuclideanLin A).toContinuousLinearMap‖ (the spectral/L2 operator norm) and works over EuclideanSpace ℝ (Fin d).

Proved using lemma_1_18_covering_number_euclidean_ball (Lemma 1.18) and lemma_4_2_eps_net_reduction (Lemma 4.2).

Type bridge: EuclideanSpace ℝ (Fin d) ↔ Fin d → ℝ #

The definitions in `Thm_4_6` use `Fin d → ℝ` while those in `Cor49` use
`EuclideanSpace ℝ (Fin d) = WithLp 2 (Fin d → ℝ)`.  The coordinate
access `Y i ω j` is the same in both worlds, so the `empiricalCovariance`
and `IsSubGaussianVec` definitions agree definitionally.

source

def Cor49.toPlainVec {d n : ℕ} {Ω : Type u_1} (Y : Fin n → Ω → EuclideanSpace ℝ (Fin d)) :

Fin n → Ω → Fin d → ℝ

Strip the WithLp wrapper to convert EuclideanSpace vectors to plain vectors.

Instances For

source

theorem Cor49.empiricalCovariance_eq_thm46 {d n : ℕ} {Ω : Type u_1} (Y : Fin n → Ω → EuclideanSpace ℝ (Fin d)) (ω : Ω) :

empiricalCovariance Y ω = Rigollet.Chapter4.Thm_4_6.empiricalCovariance (toPlainVec Y) ω

The Cor49 and Thm_4_6 empirical covariances agree (definitionally).

source

theorem Cor49.isSubGaussianVec_bridge {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) (Z : Ω → EuclideanSpace ℝ (Fin d)) :

IsSubGaussianVec μ Z 1 → Rigollet.Chapter4.Thm_4_6.IsSubGaussianVec μ (fun (ω : Ω) (j : Fin d) => (Z ω).ofLp j) 1

Bridge between Cor49 and Thm_4_6 sub-Gaussian definitions. After aligning both definitions to use EuclideanSpace ℝ (Fin d) for the test vector, the only remaining difference is the random vector type: Cor49 uses Ω → EuclideanSpace ℝ (Fin d), Thm_4_6 uses Ω → Fin d → ℝ. The MGF bounds are identical (both use ∑ j, u j * X ω j), so the bridge reduces to showing that AEMeasurable transfers through WithLp.equiv.

source

theorem Cor49.norm_plain_le_euclidean {d : ℕ} (x : EuclideanSpace ℝ (Fin d)) (hx : ‖x‖ ≤ 1) :

‖fun (j : Fin d) => x.ofLp j‖ ≤ 1

The sup norm of a plain vector is ≤ the L2 norm of the EuclideanSpace vector.

source

theorem Cor49.subGaussianVec_aestronglyMeasurable {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (X : Ω → EuclideanSpace ℝ (Fin d)) (σsq : ℝ) (hSubG : IsSubGaussianVec μ X σsq) :

MeasureTheory.AEStronglyMeasurable X μ

Sub-Gaussian random vectors are AEStronglyMeasurable. This is a standard regularity assumption on random vectors; the book assumes random vectors are measurable without explicit proof.

source

theorem Cor49.iIndepFun_to_pairwise_uncorr {d n : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (Y : Fin n → Ω → EuclideanSpace ℝ (Fin d)) (hIndep : ProbabilityTheory.iIndepFun Y μ) (hMeas : ∀ (i : Fin n), MeasureTheory.AEStronglyMeasurable (Y i) μ) (i j : Fin n) :

i ≠ j → ∀ (a b : Fin d), ∫ (ω : Ω), (Y i ω).ofLp a * (Y j ω).ofLp b ∂μ = (∫ (ω : Ω), (Y i ω).ofLp a ∂μ) * ∫ (ω : Ω), (Y j ω).ofLp b ∂μ

Independence (iIndepFun) implies pairwise uncorrelation of coordinates. This is a standard mathematical fact: if X_i are mutually independent random vectors, then for i ≠ j, E[X_i(a) * X_j(b)] = E[X_i(a)] * E[X_j(b)]. Requires AEStronglyMeasurable random vectors (standard for measurable random variables).

source

theorem Cor49.bernstein_bilinear_concentration {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (Y : Fin n → Ω → EuclideanSpace ℝ (Fin d)) (hcov_id : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), (Y i ω).ofLp j * (Y i ω).ofLp k ∂μ = if j = k then 1 else 0) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (Y i) 1) (hIndep : ProbabilityTheory.iIndepFun Y μ) (x y : EuclideanSpace ℝ (Fin d)) (hx : ‖x‖ ≤ 1) (hy : ‖y‖ ≤ 1) (s : ℝ) (hs : 0 < s) :

μ {ω : Ω | |(EuclideanSpace.equiv (Fin d) ℝ) x ⬝ᵥ (empiricalCovariance Y ω - 1).mulVec ((EuclideanSpace.equiv (Fin d) ℝ) y)| > s} ≤ ENNReal.ofReal (2 * Real.exp (-(↑n / 2 * min ((s / 16) ^ 2) (s / 16))))

Bernstein concentration for bilinear form x^T(Σ̂-I)y.

For iid 1-sub-Gaussian vectors Y₁,...,Yₙ with identity covariance, and unit vectors x,y: P(|x^T(Σ̂-I)y| > s) ≤ 2·exp(-n/2 · min((s/16)², s/16)).

Proved by bridging types to Thm_4_6.per_pair_concentration.

source

theorem Cor49.per_pair_concentration_L2 {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (Y : Fin n → Ω → EuclideanSpace ℝ (Fin d)) (hcov_id : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), (Y i ω).ofLp j * (Y i ω).ofLp k ∂μ = if j = k then 1 else 0) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (Y i) 1) (hIndep : ProbabilityTheory.iIndepFun Y μ) (x y : EuclideanSpace ℝ (Fin d)) (hx : ‖x‖ ≤ 1) (hy : ‖y‖ ≤ 1) (s : ℝ) (hs : 0 < s) :

Per-pair bilinear form concentration (Polarization + Lemma 1.12 + Theorem 1.13). For the empirical covariance with identity population covariance, the bilinear form xᵀ(Σ̂-I)y concentrates via sub-exponential Bernstein.

Proved from bernstein_bilinear_concentration.

source

def Cor49.applyMatrix {d : ℕ} (M : Matrix (Fin d) (Fin d) ℝ) (x : EuclideanSpace ℝ (Fin d)) :

EuclideanSpace ℝ (Fin d)

Apply a matrix M to an EuclideanSpace vector, returning an EuclideanSpace vector.

Instances For

source

theorem Cor49.applyMatrix_apply {d : ℕ} (M : Matrix (Fin d) (Fin d) ℝ) (x : EuclideanSpace ℝ (Fin d)) (j : Fin d) :

(applyMatrix M x).ofLp j = ∑ i : Fin d, M j i * x.ofLp i

Component-wise expansion of applyMatrix: the j-th component of M·x is ∑ i, M j i * x i.

Sub-axioms for the whitening construction #

The book (Rigollet §4.3) constructs Y = Σ^{-1/2}X using the spectral decomposition Σ = UDUᵀ, giving Σ^{-1/2} = UD^{-1/2}Uᵀ. The whitening reduction requires:

Sub-Gaussian transfer: Linear maps preserve sub-Gaussian properties with operator norm scaling (MX ~ subG_d(‖M‖²·σ²))
Sub-Gaussian integrability: Sub-Gaussian vectors have integrable products
Bochner integration of bilinear forms: ∫(Mf)(Mf)ᵀ = M(∫ffᵀ)Mᵀ
Operator norm sub-multiplicativity: ‖MAMᵀ‖ ≤ ‖M‖²·‖A‖ (proved)
Empirical covariance transforms under linear maps: Σ̂_Y = MΣ̂_XMᵀ (proved)

The overall whitening reduction (matrix_square_root_whitening) is axiomatized because the book references the spectral theorem without providing a proof. Items 1-3 are also axiomatized; items 4-5 are proved from Mathlib.

NOTE: The original whitening_matrix_exists axiom (existence of Σ^{-1/2} with specific norm bounds) was mathematically false — ‖Σ^{-1/2}‖²·‖Σ‖ = κ(Σ) ≥ 1 for PD matrices, violating the stated bound ‖M‖²·‖Σ‖ ≤ 1. It has been replaced by the correctly-stated matrix_square_root_whitening axiom.

source

theorem Cor49.sum_eq_inner {d : ℕ} (u x : EuclideanSpace ℝ (Fin d)) :

∑ j : Fin d, u.ofLp j * x.ofLp j = inner ℝ u x

The component-wise sum ∑ u_j * x_j equals the inner product ⟪u, x⟫ in EuclideanSpace.

source

theorem Cor49.inner_applyMatrix_eq_inner_transpose {d : ℕ} (M : Matrix (Fin d) (Fin d) ℝ) (θ x : EuclideanSpace ℝ (Fin d)) :

inner ℝ θ (applyMatrix M x) = inner ℝ (applyMatrix M.transpose θ) x

The inner product ⟪θ, M x⟩ equals ⟪Mᵀ θ, x⟩ in EuclideanSpace.

source

theorem Cor49.applyMatrix_eq_toEuclideanLin {d : ℕ} (M : Matrix (Fin d) (Fin d) ℝ) (x : EuclideanSpace ℝ (Fin d)) :

applyMatrix M x = (Matrix.toEuclideanLin M) x

applyMatrix M x = (toEuclideanLin M) x, linking our definition to Mathlib.

source

theorem Cor49.matrixOpNorm_eq_norm {d : ℕ} (M : Matrix (Fin d) (Fin d) ℝ) :

matrixOpNorm M = ‖M‖

matrixOpNorm M = ‖M‖ under the L2 operator norm.

source

theorem Cor49.real_transpose_eq_conjTranspose {d : ℕ} (M : Matrix (Fin d) (Fin d) ℝ) :

M.transpose = M.conjTranspose

For real matrices, transpose = conjTranspose.

source

theorem Cor49.norm_applyMatrix_transpose_le {d : ℕ} (M : Matrix (Fin d) (Fin d) ℝ) (θ : EuclideanSpace ℝ (Fin d)) :

‖applyMatrix M.transpose θ‖ ≤ matrixOpNorm M * ‖θ‖

The norm of applyMatrix Mᵀ θ is at most matrixOpNorm M * ‖θ‖.

source

theorem Cor49.sum_applyMatrix_eq {d : ℕ} (M : Matrix (Fin d) (Fin d) ℝ) (u x : EuclideanSpace ℝ (Fin d)) :

∑ j : Fin d, u.ofLp j * (applyMatrix M x).ofLp j = ∑ j : Fin d, (applyMatrix M.transpose u).ofLp j * x.ofLp j

The sum ∑ u_j * (MX)_j equals ∑ (Mᵀu)_j * X_j, proved via the inner product identity.

source

theorem Cor49.subgaussian_linear_transform {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (X : Ω → EuclideanSpace ℝ (Fin d)) (M : Matrix (Fin d) (Fin d) ℝ) (σsq τsq : ℝ) :

IsSubGaussianVec μ X σsq → 0 ≤ σsq → matrixOpNorm M ^ 2 * σsq ≤ τsq → IsSubGaussianVec μ (fun (ω : Ω) => applyMatrix M (X ω)) τsq

Sub-Gaussian preservation under linear maps (reference: standard, §1). If X ~~subG_d(σ²) and M is a d×d matrix, then MX~~ subG_d(τ²) whenever ‖M‖²·σ² ≤ τ². The book uses this with M = Σ^{-1/2} to get Y ~ subG_d(1).

source

theorem Cor49.sq_le_exp_add_exp_neg (x : ℝ) :

x ^ 2 ≤ Real.exp x + Real.exp (-x)

For any real x, x² ≤ exp(x) + exp(-x). This follows from exp(x) + exp(-x) = 2·cosh(x) ≥ 2·(1 + x²/2) = 2 + x².

source

theorem Cor49.subgaussian_component_memLp_two {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (X : Ω → EuclideanSpace ℝ (Fin d)) (σsq : ℝ) :

IsSubGaussianVec μ X σsq → ∀ (a : Fin d), MeasureTheory.AEStronglyMeasurable (fun (ω : Ω) => (X ω).ofLp a) μ → MeasureTheory.Integrable (fun (ω : Ω) => Real.exp ((X ω).ofLp a)) μ → MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (-(X ω).ofLp a)) μ → MeasureTheory.MemLp (fun (ω : Ω) => (X ω).ofLp a) 2 μ

Sub-Gaussian components are in L² (reference: sub-Gaussian moment bounds). For a sub-Gaussian random vector X with variance proxy σ², each component X_a is square-integrable (MemLp 2). This follows from the MGF bound: projecting onto the standard basis vector e_a gives E[exp(s·X_a)] ≤ exp(σ²s²/2), from which E[X_a²] < ∞ follows by standard moment bound arguments (e.g., x² ≤ exp(x) + exp(-x) combined with the MGF bound at s = ±1). The measurability and integrability hypotheses make explicit the implicit assumption that X is a random vector (i.e., measurable) with finite MGF, which the book takes for granted.

source

theorem Cor49.subgaussian_product_integrable {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (X : Ω → EuclideanSpace ℝ (Fin d)) (σsq : ℝ) :

IsSubGaussianVec μ X σsq → ∀ (a b : Fin d), MeasureTheory.AEStronglyMeasurable (fun (ω : Ω) => (X ω).ofLp a) μ → MeasureTheory.Integrable (fun (ω : Ω) => Real.exp ((X ω).ofLp a)) μ → MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (-(X ω).ofLp a)) μ → MeasureTheory.AEStronglyMeasurable (fun (ω : Ω) => (X ω).ofLp b) μ → MeasureTheory.Integrable (fun (ω : Ω) => Real.exp ((X ω).ofLp b)) μ → MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (-(X ω).ofLp b)) μ → MeasureTheory.Integrable (fun (ω : Ω) => (X ω).ofLp a * (X ω).ofLp b) μ

Sub-Gaussian integrability (proved from sub-Gaussian moment bounds). Sub-Gaussian random vectors have all moments finite, so component products are integrable. This follows from the L² membership of each component (via subgaussian_component_memLp_two) and Hölder's inequality (L² × L² → L¹, i.e., MemLp.integrable_mul). The measurability and integrability hypotheses make explicit assumptions that the book takes for granted (X is a random vector with finite MGF).

source

theorem Cor49.bochner_bilinear_transform {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (X : Ω → EuclideanSpace ℝ (Fin d)) (Sig M : Matrix (Fin d) (Fin d) ℝ) (hInt : ∀ (a b : Fin d), MeasureTheory.Integrable (fun (ω : Ω) => (X ω).ofLp a * (X ω).ofLp b) μ) :

(∀ (j k : Fin d), ∫ (ω : Ω), (X ω).ofLp j * (X ω).ofLp k ∂μ = Sig j k) → ∀ (j k : Fin d), ∫ (ω : Ω), (applyMatrix M (X ω)).ofLp j * (applyMatrix M (X ω)).ofLp k ∂μ = (M * Sig * M.transpose) j k

Bochner integral of bilinear forms under linear maps (proved from Bochner integration). If E[X_j·X_k] = Σ_{jk} and the component products are integrable, then for Y = MX, E[Y_j·Y_k] = (MΣMᵀ)_{jk}. Captures the identity ∫(Mf)(Mf)ᵀdμ = M(∫ffᵀdμ)Mᵀ which requires linearity of Bochner integration for bilinear/quadratic forms.

source

theorem Cor49.opnorm_conj_bound {d : ℕ} (M A : Matrix (Fin d) (Fin d) ℝ) :

matrixOpNorm (M * A * M.transpose) ≤ matrixOpNorm M ^ 2 * matrixOpNorm A

Operator norm sub-multiplicativity under conjugation (reference: standard linear algebra). For any matrices M, A: ‖MAMᵀ‖_op ≤ ‖M‖_op² · ‖A‖_op.

source

theorem Cor49.apply_outer_product {d : ℕ} (M : Matrix (Fin d) (Fin d) ℝ) (x : EuclideanSpace ℝ (Fin d)) :

(Matrix.of fun (j k : Fin d) => (applyMatrix M x).ofLp j * (applyMatrix M x).ofLp k) = (M * Matrix.of fun (j k : Fin d) => x.ofLp j * x.ofLp k) * M.transpose

Per-sample outer product transforms under linear maps: (MX)(MX)ᵀ = M(XXᵀ)Mᵀ.

source

theorem Cor49.empirical_covariance_linear_map {d n : ℕ} {Ω : Type u_1} (M : Matrix (Fin d) (Fin d) ℝ) (X : Fin n → Ω → EuclideanSpace ℝ (Fin d)) (ω : Ω) :

empiricalCovariance (fun (i : Fin n) (ω : Ω) => applyMatrix M (X i ω)) ω = M * empiricalCovariance X ω * M.transpose

Empirical covariance transforms under linear maps (reference: algebraic identity). If Y_i = MX_i pointwise, then Σ̂_Y = MΣ̂_XMᵀ. This is the sample-level version of the Bochner identity, holding for each ω.

source

theorem Cor49.subG_integrability_covariance {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {d : ℕ} (hd : 0 < d) (Sig : Matrix (Fin d) (Fin d) ℝ) (hSig : Sig.PosSemidef) (X : Ω → EuclideanSpace ℝ (Fin d)) (hSubG : IsSubGaussianVec μ X (matrixOpNorm Sig)) (hXm : ∀ (i : Fin d), MeasureTheory.AEStronglyMeasurable (fun (ω : Ω) => (X ω).ofLp i) μ) (hExpPos : ∀ (i : Fin d), MeasureTheory.Integrable (fun (ω : Ω) => Real.exp ((X ω).ofLp i)) μ) (hExpNeg : ∀ (i : Fin d), MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (-(X ω).ofLp i)) μ) (a b : Fin d) :

MeasureTheory.Integrable (fun (ω : Ω) => (X ω).ofLp a * (X ω).ofLp b) μ

Sub-Gaussian integrability for covariance estimation (standard mathematical fact). If X is a sub-Gaussian random vector with variance proxy σ², then for any coordinates a, b, the product X_a · X_b is integrable. This follows from the fact that sub-Gaussian random variables have finite moments of all orders: the MGF bound implies E[X_a²] < ∞ for each coordinate (via subgaussian_component_memLp_two), and then Cauchy-Schwarz / Hölder gives E[|X_a · X_b|] < ∞ (via MemLp.integrable_mul).

The measurability and exponential integrability hypotheses make explicit standard assumptions the book takes for granted (random vectors are always assumed measurable with finite MGF). These are needed because the local IsSubGaussianVec definition (MGF bound on Bochner integrals) does not itself imply measurability.

LDL-based whitening matrix helpers #

source

theorem Cor49.piLp_inner_eq_dotProduct {d : ℕ} (u v : Fin d → ℝ) :

inner ℝ (WithLp.toLp 2 u) (WithLp.toLp 2 v) = u ⬝ᵥ v

The inner product on WithLp 2 (Fin d → ℝ) equals dotProduct.

source

theorem Cor49.row_ne_zero_of_invertible {d : ℕ} (M : Matrix (Fin d) (Fin d) ℝ) [Invertible M] (i : Fin d) :

M i ≠ 0

A row of an invertible matrix is nonzero.

source

theorem Cor49.ldl_diagEntries_pos {d : ℕ} (S : Matrix (Fin d) (Fin d) ℝ) (hS : S.PosDef) (i : Fin d) :

0 < LDL.diagEntries hS i

The diagonal entries of the LDL decomposition are positive for PD matrices.

source

theorem Cor49.diagonal_transpose_eq {d : ℕ} (f : Fin d → ℝ) :

(Matrix.diagonal f).transpose = Matrix.diagonal f

Diagonal matrices are symmetric (transpose equals self).

source

theorem Cor49.diagSqrtInv_mul_diag {d : ℕ} (f : Fin d → ℝ) (hf : ∀ (i : Fin d), 0 < f i) :

((Matrix.diagonal fun (i : Fin d) => 1 / √(f i)) * Matrix.diagonal f * Matrix.diagonal fun (i : Fin d) => 1 / √(f i)) = 1

D^{-1/2} * D * D^{-1/2} = I for diagonal D with positive entries.

source

theorem Cor49.spectral_whitening_axiom {d : ℕ} :

0 < d → ∀ (Sig : Matrix (Fin d) (Fin d) ℝ), Sig.PosDef → ∃ (M : Matrix (Fin d) (Fin d) ℝ), M * Sig * M.transpose = 1 ∧ ∀ (A : Matrix (Fin d) (Fin d) ℝ) (t : ℝ), matrixOpNorm A > matrixOpNorm Sig * t → matrixOpNorm (M * A * M.transpose) > t

Spectral theorem: existence of whitening matrix (standard linear algebra background). For any positive definite Σ, there exists M (e.g., Σ^{-1/2}) satisfying: (1) M·Σ·Mᵀ = I (whitening), and (2) Operator norm transfer: if ‖A‖_op > ‖Σ‖_op·t then ‖M·A·Mᵀ‖_op > t.

Property (2) follows from ‖M·A·Mᵀ‖_op ≥ ‖A‖_op / ‖M⁻¹‖²_op = ‖A‖_op / ‖Σ‖_op.

Axiom justification: The textbook does not prove this, instead referencing standard linear algebra. The hypothesis Sig.PosDef ensures the statement is mathematically sound (singular matrices have no whitening matrix).

source

theorem Cor49.subgaussian_whitening_axiom {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (X : Ω → EuclideanSpace ℝ (Fin d)) (Sig M : Matrix (Fin d) (Fin d) ℝ) :

IsSubGaussianVec μ X (matrixOpNorm Sig) → M * Sig * M.transpose = 1 → IsSubGaussianVec μ (fun (ω : Ω) => applyMatrix M (X ω)) 1

Sub-Gaussian whitening: If X is sub-Gaussian with parameter ‖Σ‖_op and M is a whitening matrix (M·Σ·Mᵀ = I), then MX is sub-Gaussian with parameter 1.

This follows from the MGF characterization: for unit u, u^T(MX) = (M^Tu)^TX, and the quadratic form (M^Tu)^T Σ (M^Tu) = u^T(MΣM^T)u = u^Tu = 1.

Axiom justification: The textbook asserts this as part of the whitening step without explicit proof, referencing the spectral theorem.

source

theorem Cor49.subgaussianVec_exp_integrable {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (X : Ω → EuclideanSpace ℝ (Fin d)) (σsq : ℝ) (hSubG : IsSubGaussianVec μ X σsq) (i : Fin d) :

MeasureTheory.Integrable (fun (ω : Ω) => Real.exp ((X ω).ofLp i)) μ ∧ MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (-(X ω).ofLp i)) μ

Sub-Gaussian exponential integrability: A sub-Gaussian random vector has integrable exponential coordinates (both positive and negative). The book always assumes random vectors have finite MGF; this theorem makes that standard assumption explicit. Proof: From the IsSubGaussianVec MGF bound, taking the unit vector u = eᵢ (standard basis) gives an individual MGF bound for each coordinate Xᵢ. The conclusion then follows from Rigollet.Chapter4.Thm_4_6.exp_individual_integrable (which axiomatizes the standard fact that a bounded MGF implies integrability of the exponential).

source

theorem Cor49.subgaussian_vec_regularity {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (X : Ω → EuclideanSpace ℝ (Fin d)) (σsq : ℝ) :

IsSubGaussianVec μ X σsq → (∀ (i : Fin d), MeasureTheory.AEStronglyMeasurable (fun (ω : Ω) => (X ω).ofLp i) μ) ∧ (∀ (i : Fin d), MeasureTheory.Integrable (fun (ω : Ω) => Real.exp ((X ω).ofLp i)) μ) ∧ ∀ (i : Fin d), MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (-(X ω).ofLp i)) μ

Sub-Gaussian regularity: A sub-Gaussian random vector has measurable coordinates and integrable exponentials (both positive and negative). Part (1) — measurability — follows from subGaussianVec_aestronglyMeasurable (the vector is AEStronglyMeasurable, so each coordinate is too, by composition with the continuous projection EuclideanSpace.proj). Parts (2) and (3) — exponential integrability — are background assumptions not proved in the textbook, provided by subgaussianVec_exp_integrable.

source

theorem Cor49.subgaussian_vec_product_integrable {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] {d : ℕ} (X : Ω → EuclideanSpace ℝ (Fin d)) (σsq : ℝ) (hSubG : IsSubGaussianVec μ X σsq) (a b : Fin d) :

MeasureTheory.Integrable (fun (ω : Ω) => (X ω).ofLp a * (X ω).ofLp b) μ

Sub-Gaussian product integrability (proved from sub-Gaussian tail bounds). For a sub-Gaussian random vector X with variance proxy σ², the product X_a · X_b of any two coordinates is integrable. This follows from:

subgaussian_vec_regularity (measurability + exponential integrability)
subgaussian_product_integrable (L² × L² → L¹ via Cauchy-Schwarz) This is the provable part of the whitening matrix properties.

source

theorem Cor49.whitening_matrix_axiom {d : ℕ} :

0 < d → ∀ (Sig : Matrix (Fin d) (Fin d) ℝ), Sig.PosDef → ∃ (M : Matrix (Fin d) (Fin d) ℝ), M * Sig * M.transpose = 1 ∧ (∀ (A : Matrix (Fin d) (Fin d) ℝ) (t : ℝ), matrixOpNorm A > matrixOpNorm Sig * t → matrixOpNorm (M * A * M.transpose) > t) ∧ ∀ {Ω : Type u_1} [inst : MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (X : Ω → EuclideanSpace ℝ (Fin d)), IsSubGaussianVec μ X (matrixOpNorm Sig) → ∀ (a b : Fin d), MeasureTheory.Integrable (fun (ω : Ω) => (X ω).ofLp a * (X ω).ofLp b) μ

Existence of whitening matrix with all properties (combines spectral theorem

sub-Gaussian integrability). The spectral theorem parts (1-2) come from spectral_whitening_axiom (not proved in textbook), while the integrability part (3) is proved from subgaussian_vec_product_integrable using the sub-Gaussian tail bound machinery.

source

theorem Cor49.matrix_square_root_whitening {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) :

0 < d → ∀ (X : Fin n → Ω → EuclideanSpace ℝ (Fin d)) (Sig : Matrix (Fin d) (Fin d) ℝ), Sig.PosDef → (∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), (X i ω).ofLp j * (X i ω).ofLp k ∂μ = Sig j k) → (∀ (i : Fin n), IsSubGaussianVec μ (X i) (matrixOpNorm Sig)) → ProbabilityTheory.iIndepFun X μ → ∃ (Y : Fin n → Ω → EuclideanSpace ℝ (Fin d)), (∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), (Y i ω).ofLp j * (Y i ω).ofLp k ∂μ = if j = k then 1 else 0) ∧ (∀ (i : Fin n), IsSubGaussianVec μ (Y i) 1) ∧ (∀ (ω : Ω) (t : ℝ), matrixOpNorm (empiricalCovariance X ω - Sig) > matrixOpNorm Sig * t → matrixOpNorm (empiricalCovariance Y ω - 1) > t) ∧ ProbabilityTheory.iIndepFun Y μ

Matrix square root whitening (reference: Spectral theorem, Rigollet §4.3). For any covariance matrix Σ with E[XXᵀ] = Σ and X ~ subG_d(‖Σ‖_op), the whitened random vectors Y_i = Σ^{-1/2} X_i satisfy identity covariance, unit sub-Gaussian parameter, and the operator norm transfer property.

Proved from whitening_matrix_axiom (which combines spectral_whitening_axiom for the spectral theorem parts, and subgaussian_vec_product_integrable for the integrability part) together with subgaussian_whitening_axiom, bochner_bilinear_transform, and empirical_covariance_linear_map.

source

theorem Cor49.whitened_vectors_exist_L2 {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hd : 0 < d) (X : Fin n → Ω → EuclideanSpace ℝ (Fin d)) (Sig : Matrix (Fin d) (Fin d) ℝ) (hPD : Sig.PosDef) (hcov : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), (X i ω).ofLp j * (X i ω).ofLp k ∂μ = Sig j k) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (X i) (matrixOpNorm Sig)) (hIndep : ProbabilityTheory.iIndepFun X μ) :

∃ (Y : Fin n → Ω → EuclideanSpace ℝ (Fin d)), (∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), (Y i ω).ofLp j * (Y i ω).ofLp k ∂μ = if j = k then 1 else 0) ∧ (∀ (i : Fin n), IsSubGaussianVec μ (Y i) 1) ∧ (∀ (ω : Ω) (t : ℝ), matrixOpNorm (empiricalCovariance X ω - Sig) > matrixOpNorm Sig * t → matrixOpNorm (empiricalCovariance Y ω - 1) > t) ∧ ProbabilityTheory.iIndepFun Y μ

Whitened vectors existence (Y = Σ^{-1/2}X). For X with E[XXᵀ] = Σ and X ~ subG_d(‖Σ‖_op), there exist whitened vectors Y satisfying identity covariance, unit sub-Gaussian parameter, and the operator norm implication.

Proved from matrix_square_root_whitening, which captures the spectral theorem application referenced in the book's proof of Theorem 4.6.

Helper lemmas for the union bound argument #

source

theorem Cor49.half_div16_eq_div32 (t : ℝ) :

min ((t / 2 / 16) ^ 2) (t / 2 / 16) = min ((t / 32) ^ 2) (t / 32)

source

theorem Cor49.card_sq_le_144d {d : ℕ} {α : Type u_1} {N : Finset α} (hN : ↑N.card ≤ 12 ^ d) :

↑N.card ^ 2 ≤ 144 ^ d

source

theorem Cor49.eq_4_7_identity_L2 {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (hd : 0 < d) (Y : Fin n → Ω → EuclideanSpace ℝ (Fin d)) (hcov_id : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), (Y i ω).ofLp j * (Y i ω).ofLp k ∂μ = if j = k then 1 else 0) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (Y i) 1) (hIndep : ProbabilityTheory.iIndepFun Y μ) (t : ℝ) (ht : 0 < t) :

μ {ω : Ω | matrixOpNorm (empiricalCovariance Y ω - 1) > t} ≤ ENNReal.ofReal (288 ^ d * Real.exp (-(↑n / 2 * min ((t / 32) ^ 2) (t / 32))))

Equation (4.7) (Identity-case concentration for L2/spectral norm).

For i.i.d. Y₁,...,Yₙ with E[YYᵀ] = I and Y ~ subG_d(1), P(‖Σ̂_Y - I‖_op > t) ≤ 288^d · exp[-n/2 · min((t/32)², t/32)]

Proved in the book via:

ε-net N for S^{d-1} with |N| ≤ 12^d (Lemma 1.18)
ε-net reduction (Lemma 4.2): ‖A‖op ≤ 2·max{x,y∈N} |xᵀAy|
Polarization + sub-exponential lifting (Lemma 1.12)
Bernstein's inequality (Theorem 1.13) for each pair
Union bound over |N|² ≤ 144^d pairs, with factor 2 from Bernstein tail

source

theorem Cor49.whitening_reduction_L2 {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hd : 0 < d) (X : Fin n → Ω → EuclideanSpace ℝ (Fin d)) (Sig : Matrix (Fin d) (Fin d) ℝ) (hPD : Sig.PosDef) (hcov : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), (X i ω).ofLp j * (X i ω).ofLp k ∂μ = Sig j k) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (X i) (matrixOpNorm Sig)) (hIndep : ProbabilityTheory.iIndepFun X μ) :

Whitening reduction for L2/spectral norm.

For X with E[XXᵀ] = Σ and X ~ subG_d(‖Σ‖_op), the whitened vectors Y = Σ^{-1/2}X satisfy:

E[YYᵀ] = I (identity covariance)
Y ~ subG_d(1) (unit sub-Gaussian parameter)
‖Σ̂_X − Σ‖_op > ‖Σ‖_op·t ⟹ ‖Σ̂_Y − I‖_op > t (from Σ̂_X − Σ = Σ^{1/2}(Σ̂_Y − I)Σ^{1/2} and sub-multiplicativity)

Proved from the axiomatized whitened_vectors_exist_L2.

Assembly: Whitening + eq (4.7) → general concentration #

source

theorem Cor49.general_concentration_L2 {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (hd : 0 < d) (X : Fin n → Ω → EuclideanSpace ℝ (Fin d)) (Sig : Matrix (Fin d) (Fin d) ℝ) (hPD : Sig.PosDef) (hcov : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), (X i ω).ofLp j * (X i ω).ofLp k ∂μ = Sig j k) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (X i) (matrixOpNorm Sig)) (hIndep : ProbabilityTheory.iIndepFun X μ) (t : ℝ) (ht : 0 < t) :

μ {ω : Ω | matrixOpNorm (empiricalCovariance X ω - Sig) > matrixOpNorm Sig * t} ≤ ENNReal.ofReal (288 ^ d * Real.exp (-(↑n / 2 * min ((t / 32) ^ 2) (t / 32))))

General concentration via whitening + identity-case bound. Combines the whitening reduction with equation (4.7).

Algebraic rate bound #

Setting t = C · covRate d n δ in the exponential tail bound from equation (4.7), we verify that 288^d · exp(-n/2 · min((t/32)², t/32)) ≤ δ for C = 1024.

source

theorem Cor49.exp_16_ge_289 :

289 ≤ Real.exp 16

source

theorem Cor49.h288d_le_exp16d (d : ℕ) :

288 ^ d ≤ Real.exp (16 * ↑d)

source

theorem Cor49.exp_neg_log_bound (δ : ℝ) (hδ : 0 < δ) (hδ1 : δ < 1) :

Real.exp (-(16 * Real.log (1 / δ))) ≤ δ

source

theorem Cor49.rate_algebraic_bound {d : ℕ} (hd : 0 < d) (n : ℕ) (hn : 0 < n) (δ : ℝ) (hδ : 0 < δ) (hδ1 : δ < 1) :

288 ^ d * Real.exp (-(↑n / 2 * min ((1024 * covRate d n δ / 32) ^ 2) (1024 * covRate d n δ / 32))) ≤ δ

source

theorem Cor49.opNorm_concentration {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (hd : 0 < d) (X : Fin n → Ω → EuclideanSpace ℝ (Fin d)) (Sig : Matrix (Fin d) (Fin d) ℝ) (hPD : Sig.PosDef) (hcov : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), (X i ω).ofLp j * (X i ω).ofLp k ∂μ = Sig j k) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (X i) (matrixOpNorm Sig)) (hIndep : ProbabilityTheory.iIndepFun X μ) (delta : ℝ) (hdelta : 0 < delta) (hdelta1 : delta < 1) :

μ {ω : Ω | matrixOpNorm (empiricalCovariance X ω - Sig) > opNorm_concentration_C₁ * matrixOpNorm Sig * covRate d n delta} ≤ ENNReal.ofReal delta

Theorem 4.6 (Operator norm concentration for i.i.d. sub-Gaussian random vectors).

With probability ≥ 1 − δ, ‖Σ̂ − Σ‖_op ≤ C₁ · ‖Σ‖_op · covRate(d,n,δ).

The proof follows the book (Rigollet, §4.3):

Whitening reduction Y = Σ^{-1/2}X reduces to identity covariance
Equation (4.7) gives the exponential tail bound via ε-net + Bernstein
Algebraic simplification with C₁ = 1024 absorbs the constants

The sub-steps (whitening, per-pair Bernstein) are axiomatized because they require Mathlib infrastructure not currently available (matrix square roots, ε-net covering for L2, sub-exponential Bernstein). The assembly is fully proved.

source

theorem Cor49.davisKahan_eigvec_bound {d : ℕ} (Sig : Matrix (Fin d) (Fin d) ℝ) (θ : ℝ) (v : EuclideanSpace ℝ (Fin d)) (hSpike : IsSpikedCovariance Sig θ v) (SigTilde : Matrix (Fin d) (Fin d) ℝ) (hPSD : SigTilde.PosSemidef) (vhat : EuclideanSpace ℝ (Fin d)) (hvhat : IsLargestEigenvector SigTilde vhat) :

∃ (ε : ℝ), (ε = 1 ∨ ε = -1) ∧ ‖ε • vhat - v‖ ≤ 2 * √2 / θ * matrixOpNorm (SigTilde - Sig)

From Theorem 4.8 (Davis-Kahan): eigenvector error bound (pointwise). For the spiked covariance model Σ = θvv⊤ + I_d, if v̂ is a largest eigenvector of any PSD matrix Σ̃, then min_{ε∈{±1}} ‖εv̂ − v‖ ≤ (2√2/θ) · ‖Σ̃ − Σ‖_op.

Derived from Thm48.theorem_4_8 which gives the squared form: min(‖v̂-v‖², ‖v̂+v‖²) ≤ (8/θ²)‖Σ̃ − Σ‖_op².

source

theorem Cor49.empiricalCovariance_posSemidef {d n : ℕ} {Ω : Type u_1} (X : Fin n → Ω → EuclideanSpace ℝ (Fin d)) (ω : Ω) :

(empiricalCovariance X ω).PosSemidef

The empirical covariance Σ̂ = (1/n)ΣᵢXᵢXᵢᵀ is always PSD. Each summand XᵢXᵢᵀ = vecMulVec xᵢ (star xᵢ) is PSD, and a nonneg-scaled sum of PSD matrices is PSD.

Helper lemmas for the spiked operator norm computation #

source

theorem Cor49.inner_eq_dotProduct_ofLp {d : ℕ} (v x : EuclideanSpace ℝ (Fin d)) :

inner ℝ v x = v.ofLp ⬝ᵥ x.ofLp

The inner product on EuclideanSpace equals the dot product of the underlying vectors.

source

theorem Cor49.euclidean_norm_sq_eq_dotProduct {d : ℕ} (v : EuclideanSpace ℝ (Fin d)) :

‖v‖ ^ 2 = v.ofLp ⬝ᵥ v.ofLp

The squared EuclideanSpace norm equals the self-dot-product.

source

theorem Cor49.unit_dotProduct_one {d : ℕ} (v : EuclideanSpace ℝ (Fin d)) (hv : ‖v‖ = 1) :

v.ofLp ⬝ᵥ v.ofLp = 1

For a unit vector in EuclideanSpace, the self-dot-product is 1.

source

theorem Cor49.spiked_mulVec_raw {d : ℕ} (θ : ℝ) (v x : Fin d → ℝ) :

(θ • Matrix.vecMulVec v v + 1).mulVec x = (θ * v ⬝ᵥ x) • v + x

The mulVec computation for the spiked covariance model (on raw vectors).

source

theorem Cor49.spiked_opNorm_eq {d : ℕ} (Sig : Matrix (Fin d) (Fin d) ℝ) (θ : ℝ) (v : EuclideanSpace ℝ (Fin d)) (hSpike : IsSpikedCovariance Sig θ v) :

matrixOpNorm Sig = 1 + θ

Under the spiked covariance model Σ = θvv⊤ + I_d, the operator norm is ‖Σ‖_op = 1 + θ. Proof: Upper bound via triangle inequality + Cauchy-Schwarz; lower bound by evaluating at the spike vector v.

Helper: complement probability bound #

source

theorem Cor49.prob_compl_bound {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (S : Set Ω) (δ : ℝ) (hδ : 0 < δ) (hbad : μ Sᶜ ≤ ENNReal.ofReal δ) :

μ S ≥ ENNReal.ofReal (1 - δ)

For a probability measure, if the complement has measure ≤ ofReal δ, then the set has measure ≥ ofReal (1 - δ). This version does NOT require measurability of S, using measure_union_le instead of measure_add_measure_compl.

Corollary 4.9 (PCA consistency via empirical covariance) #

The corollary combines Theorem 4.6 and Theorem 4.8 as follows:

Theorem 4.6 bounds the operator norm of the estimation error: ‖Σ̂ − Σ‖_op ≤ C₁ · ‖Σ‖_op · rate(d,n,δ) w.p. ≥ 1 − δ.
Theorem 4.8 (Davis-Kahan) converts operator norm error to eigenvector error: min_{ε ∈ {±1}} ‖ε v̂ − v‖₂ ≤ (2√2/θ) · ‖Σ̂ − Σ‖_op.
Under the spiked model Σ = θvv⊤ + I_d, the operator norm is ‖Σ‖op = 1 + θ. Substituting: min{ε ∈ {±1}} ‖ε v̂ − v‖₂ ≤ (2√2·C₁/θ) · (1+θ) · rate(d,n,δ) = C · (1+θ)/θ · rate(d,n,δ)

source

theorem Cor49.corollary_4_9 :

∃ (C : ℝ), 0 < C ∧ ∀ (d : ℕ), 0 < d → ∀ {Ω : Type u_1} [inst : MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ), 0 < n → ∀ (X : Fin n → Ω → EuclideanSpace ℝ (Fin d)) (Sig : Matrix (Fin d) (Fin d) ℝ) (θ : ℝ) (v : EuclideanSpace ℝ (Fin d)), IsSpikedCovariance Sig θ v → (∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), (X i ω).ofLp j * (X i ω).ofLp k ∂μ = Sig j k) → (∀ (i : Fin n), IsSubGaussianVec μ (X i) (matrixOpNorm Sig)) → ProbabilityTheory.iIndepFun X μ → ∀ (delta : ℝ), 0 < delta → delta < 1 → μ {ω : Ω | ∀ (vhat : EuclideanSpace ℝ (Fin d)), IsLargestEigenvector (empiricalCovariance X ω) vhat → ∃ (ε : ℝ), (ε = 1 ∨ ε = -1) ∧ ‖ε • vhat - v‖ ≤ C * (1 + θ) / θ * covRate d n delta} ≥ ENNReal.ofReal (1 - delta)

Corollary 4.9 (PCA consistency via empirical covariance).

Let X₁,...,Xₙ be n i.i.d. copies of a sub-Gaussian random vector X ∈ ℝ^d with E[XX⊤] = Σ and X ~ subG_d(‖Σ‖_op). Assume Σ = θvv⊤ + I_d (spiked covariance model). Then the largest eigenvector v̂ of the empirical covariance matrix Σ̂ satisfies:

min_{ε∈{±1}} |εv̂ − v|₂ ≲ ((1+θ)/θ) · (√((d+log(1/δ))/n) ∨ (d+log(1/δ))/n)

with probability 1 − δ.

The i.i.d. assumption is encoded through:

hcov: identical distribution (same covariance for all i)
hSubG: identical sub-Gaussian parameter for all i
hIndep: mutual independence of the random vectors

Documentation