Atlas.HighDimensionalStatistics.code.Chapter4.Thm_4

noncomputable def Rigollet.Chapter4.Thm_4_6.matrixOpNorm {d : ℕ} (A : Matrix (Fin d) (Fin d) ℝ) :

The operator norm (spectral norm) of a real matrix, defined as the supremum of ‖A v‖₂ over unit vectors v in the Euclidean (L2) norm. This uses EuclideanSpace ℝ (Fin d) to ensure both input and output norms are L2, matching the textbook's spectral norm ‖·‖_op.

Instances For

source

def Rigollet.Chapter4.Thm_4_6.IsSubGaussianVec {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) (X : Ω → Fin d → ℝ) (σsq : ℝ) :

Prop

A random vector X : Ω → ℝ^d is sub-Gaussian with variance proxy σ². Includes AE-measurability, which the textbook assumes implicitly (Definition 1.6). The test vector u lives in EuclideanSpace ℝ (Fin d) (L2 norm), matching the textbook's Euclidean unit sphere.

Instances For

source

noncomputable def Rigollet.Chapter4.Thm_4_6.empiricalCovariance {d n : ℕ} {Ω : Type u_1} (X : Fin n → Ω → Fin d → ℝ) (ω : Ω) :

Matrix (Fin d) (Fin d) ℝ

The empirical covariance matrix Σ̂ = (1/n) Σᵢ XᵢXᵢᵀ.

Instances For

source

noncomputable def Rigollet.Chapter4.Thm_4_6.covRate (d n : ℕ) (δ : ℝ) :

ℝ

The rate function: max(√(t/n), t/n) where t = d + log(1/δ).

Instances For

source

theorem Rigollet.Chapter4.Thm_4_6.opnorm_le_of_bound {d : ℕ} (A : Matrix (Fin d) (Fin d) ℝ) (M : ℝ) (hM : 0 ≤ M) (h : ∀ (v : EuclideanSpace ℝ (Fin d)), ‖v‖ = 1 → ‖(EuclideanSpace.equiv (Fin d) ℝ).symm (A.mulVec v.ofLp)‖ ≤ M) :

matrixOpNorm A ≤ M

Upper bound for matrixOpNorm: if ‖Av‖₂ ≤ M for all unit vectors v (in L2), then matrixOpNorm A ≤ M.

source

theorem Rigollet.Chapter4.Thm_4_6.opnorm_gt_witness {d : ℕ} (A : Matrix (Fin d) (Fin d) ℝ) (t : ℝ) (ht : 0 ≤ t) (hA : matrixOpNorm A > t) :

∃ (v : EuclideanSpace ℝ (Fin d)), ‖v‖ = 1 ∧ ‖(EuclideanSpace.equiv (Fin d) ℝ).symm (A.mulVec v.ofLp)‖ > t

Witness extraction: if matrixOpNorm A > t, there exists a unit vector v (in L2) with ‖Av‖₂ > t.

source

theorem Rigollet.Chapter4.Thm_4_6.matrixOpNorm_le_root {d : ℕ} (A : Matrix (Fin d) (Fin d) ℝ) :

matrixOpNorm A ≤ _root_.matrixOpNorm A

The local matrixOpNorm (iSup definition) is bounded by the root-level operator norm (ContinuousLinearMap norm). This follows from ‖f v‖ ≤ ‖f‖ for unit vectors.

source

theorem Rigollet.Chapter4.Thm_4_6.norm_plain_le_euclidean_loc {d : ℕ} (x : EuclideanSpace ℝ (Fin d)) (hx : ‖x‖ ≤ 1) :

‖fun (j : Fin d) => x.ofLp j‖ ≤ 1

L∞ norm of a plain vector is ≤ L2 norm of the EuclideanSpace vector. Since L∞ = sup_i |x_i| ≤ sqrt(Σ x_i²) = L2.

source

theorem Rigollet.Chapter4.Thm_4_6.eps_net_opnorm_reduction {d : ℕ} (hd : 0 < d) :

∃ (N : Finset (Fin d → ℝ)), ↑N.card ≤ 12 ^ d ∧ N.Nonempty ∧ (∀ x ∈ N, ‖x‖ ≤ 1) ∧ ∀ (A : Matrix (Fin d) (Fin d) ℝ) (t : ℝ), matrixOpNorm A > t → ∃ x ∈ N, ∃ y ∈ N, |x ⬝ᵥ A.mulVec y| > t / 2

ε-net reduction + covering number bound (Lemmas 1.18 + 4.2). Proof not given in textbook; this combines the volumetric covering number bound with the ε-net argument for operator norm of symmetric matrices. The net vectors have L∞ norm ≤ 1 (used in per-pair concentration), while the operator norm is the spectral (L2→L2) norm. Since the textbook does not provide a proof of this standard result, it is axiomatized.

source

theorem Rigollet.Chapter4.Thm_4_6.dotProduct_outer_mulVec {d : ℕ} (x y v : Fin d → ℝ) :

x ⬝ᵥ (Matrix.of fun (j k : Fin d) => v j * v k).mulVec y = x ⬝ᵥ v * v ⬝ᵥ y

The outer product bilinear form equals a product of dot products: x ⬝ᵥ (vvᵀ · y) = (x ⬝ᵥ v)(v ⬝ᵥ y).

source

theorem Rigollet.Chapter4.Thm_4_6.polarization_dotProduct {d : ℕ} (x y v : Fin d → ℝ) :

x ⬝ᵥ v * v ⬝ᵥ y = (((x + y) ⬝ᵥ v) ^ 2 - ((x - y) ⬝ᵥ v) ^ 2) / 4

Polarization identity for dot products: (x ⬝ᵥ v)(v ⬝ᵥ y) = ((x+y) ⬝ᵥ v)² - ((x-y) ⬝ᵥ v)²) / 4. This is the key algebraic step (Step 3b of the book proof) that converts a product of projections into a difference of squares.

source

theorem Rigollet.Chapter4.Thm_4_6.bilinear_polarization {d : ℕ} (x y v : Fin d → ℝ) :

x ⬝ᵥ (Matrix.of fun (j k : Fin d) => v j * v k).mulVec y = (((x + y) ⬝ᵥ v) ^ 2 - ((x - y) ⬝ᵥ v) ^ 2) / 4

Combined: x ⬝ᵥ (vvᵀ · y) = ((x+y) ⬝ᵥ v)² - ((x-y) ⬝ᵥ v)²) / 4. This decomposes the bilinear form in the empirical covariance into a difference of squared sub-Gaussian projections, enabling the use of Lemma 1.12 (sub-Gaussian squares are sub-exponential).

source

theorem Rigollet.Chapter4.Thm_4_6.bilinear_form_decomposition {d : ℕ} (n : ℕ) (hn : 0 < n) {Ω : Type u_1} (Y : Fin n → Ω → Fin d → ℝ) (ω : Ω) (x y : Fin d → ℝ) :

x ⬝ᵥ (empiricalCovariance Y ω - 1).mulVec y = (↑n)⁻¹ * ∑ i : Fin n, ((∑ j : Fin d, x j * Y i ω j) * ∑ k : Fin d, Y i ω k * y k - x ⬝ᵥ y)

Algebraic decomposition: xᵀ(Σ̂-I)y = (1/n) Σᵢ ((x·Yᵢ)(Yᵢ·y) - x·y). This is a purely algebraic identity that unfolds the empirical covariance as a sum of outer products and rewrites the bilinear form.

source

theorem Rigollet.Chapter4.Thm_4_6.subgaussianvec_projection_is_subgaussian {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {Y : Ω → Fin d → ℝ} {σsq : ℝ} (hSubG : IsSubGaussianVec μ Y σsq) (u : EuclideanSpace ℝ (Fin d)) (hu : ‖u‖ = 1) (hInt : MeasureTheory.Integrable (fun (ω : Ω) => ∑ j : Fin d, u.ofLp j * Y ω j) μ) (hMean : ∫ (ω : Ω), ∑ j : Fin d, u.ofLp j * Y ω j ∂μ = 0) (hExpInt : ∀ (s : ℝ), MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (s * ∑ j : Fin d, u.ofLp j * Y ω j)) μ) :

IsSubGaussian (fun (ω : Ω) => ∑ j : Fin d, u.ofLp j * Y ω j) σsq μ

Bridge lemma: If Y is sub-Gaussian vector with parameter σsq, then for any unit vector u, the scalar projection ⟨u,Y⟩ is sub-Gaussian (scalar) with parameter σsq, provided we have integrability of the projection and exp-moments (which follow mathematically from the MGF bound, but require additional measurability infrastructure). This connects IsSubGaussianVec to IsSubGaussian from Chapter 1.

source

theorem Rigollet.Chapter4.Thm_4_6.subgaussianvec_projection_sq_subexponential {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {Y : Ω → Fin d → ℝ} {σsq : ℝ} (hσ : 0 < σsq) (hSubG : IsSubGaussianVec μ Y σsq) (u : EuclideanSpace ℝ (Fin d)) (hu : ‖u‖ = 1) (hInt : MeasureTheory.Integrable (fun (ω : Ω) => ∑ j : Fin d, u.ofLp j * Y ω j) μ) (hMean : ∫ (ω : Ω), ∑ j : Fin d, u.ofLp j * Y ω j ∂μ = 0) (hExpInt : ∀ (s : ℝ), MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (s * ∑ j : Fin d, u.ofLp j * Y ω j)) μ) :

IsSubExponential (fun (ω : Ω) => (∑ j : Fin d, u.ofLp j * Y ω j) ^ 2 - ∫ (ω' : Ω), (∑ j : Fin d, u.ofLp j * Y ω' j) ^ 2 ∂μ) (16 * σsq)

Bridge lemma (Lemma 1.12 applied to projection): If Y is sub-Gaussian vector with parameter σsq > 0 and u is a unit vector, then the centered square (⟨u,Y⟩² - 𝔼[⟨u,Y⟩²]) is sub-exponential with parameter 16σsq. This applies lemma_1_12_square_subgaussian_is_subexponential from Chapter 1. The sub-exponential property gives the restricted MGF bound: 𝔼[exp(s·(⟨u,Y⟩² - 𝔼[⟨u,Y⟩²]))] ≤ exp((16σsq)²s²/2) for |s| ≤ 1/(16σsq).

source

theorem Rigollet.Chapter4.Thm_4_6.polarization_product_eq {d : ℕ} (x y z : Fin d → ℝ) :

(∑ j : Fin d, x j * z j) * ∑ k : Fin d, z k * y k = ((∑ j : Fin d, (x + y) j * z j) ^ 2 - (∑ j : Fin d, (x - y) j * z j) ^ 2) / 4

Polarization identity for finite sums (raw-sum form): The product of two linear forms in z equals (1/4) times the difference of squares of the sum and difference forms. This is the key algebraic identity used in Step 1 of the proof of Theorem 4.6. Proved using polarization_dotProduct.

source

theorem Rigollet.Chapter4.Thm_4_6.cauchy_schwarz_lemma_1_12_mgf_bound_lintegral_aux {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (Y : Ω → Fin d → ℝ) (σsq : ℝ) (hSubG : IsSubGaussianVec μ Y σsq) (x y : Fin d → ℝ) (hx : ‖x‖ ≤ 1) (hy : ‖y‖ ≤ 1) (s : ℝ) :

∫⁻ (ω : Ω), ENNReal.ofReal (Real.exp (s * (((∑ j : Fin d, (x + y) j * Y ω j) ^ 2 - (∑ j : Fin d, (x - y) j * Y ω j) ^ 2) / 4 - x ⬝ᵥ y))) ∂μ ≤ ENNReal.ofReal (Real.exp (16 ^ 2 * σsq ^ 2 * s ^ 2 / 2))

Auxiliary lintegral bound used to establish integrability in cauchy_schwarz_lemma_1_12_mgf_bound_lintegral. This is the same mathematical content as the main bound but is needed as a bootstrap for integrability. The textbook proves the bound (Section 4.3) but assumes integrability implicitly.

source

theorem Rigollet.Chapter4.Thm_4_6.cauchy_schwarz_lemma_1_12_mgf_bound {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (Y : Ω → Fin d → ℝ) (σsq : ℝ) (hSubG : IsSubGaussianVec μ Y σsq) (x y : Fin d → ℝ) (hx : ‖x‖ ≤ 1) (hy : ‖y‖ ≤ 1) (s : ℝ) (hσ : 0 < σsq) (hs : |s| ≤ 1 / (16 * σsq)) :

∫ (ω : Ω), Real.exp (s * (((∑ j : Fin d, (x + y) j * Y ω j) ^ 2 - (∑ j : Fin d, (x - y) j * Y ω j) ^ 2) / 4 - x ⬝ᵥ y)) ∂μ ≤ Real.exp (16 ^ 2 * σsq ^ 2 * s ^ 2 / 2)

Cauchy-Schwarz + Lemma 1.12 combined MGF bound (Bochner integral, polarized form).

If Y is σsq-sub-Gaussian and ‖x‖ ≤ 1, ‖y‖ ≤ 1, then the exponent s * ((A(ω)² - B(ω)²)/4 - x·y), where A(ω) = ⟨x+y, Y(ω)⟩ and B(ω) = ⟨x-y, Y(ω)⟩, has its MGF bounded by exp(16²σ⁴s²/2) for |s| ≤ 1/(16σ²).

The textbook proves this bound via:

Cauchy-Schwarz for expectations: 𝔼[exp(s(A²-B²)/4)] ≤ √(𝔼[exp(sA²/2)] · 𝔼[exp(-sB²/2)]) This step uses the L² Cauchy-Schwarz inequality for random variables on the probability space, which requires infrastructure not available in Mathlib.
Lemma 1.12: Each factor is bounded using the sub-exponential MGF bound for squared sub-Gaussian projections (proved in subgaussianvec_projection_sq_subexponential).

Marked as @[unproved] because the L² Cauchy-Schwarz inequality for expectations (the inner product inequality on L²(Ω,μ)) is not available in Mathlib, making a direct formalization of the textbook's argument intractable. The mathematical content is proved in the textbook (Section 4.3, lines 2963-2975).

source

theorem Rigollet.Chapter4.Thm_4_6.cauchy_schwarz_lemma_1_12_mgf_bound_lintegral {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (Y : Ω → Fin d → ℝ) (σsq : ℝ) (hSubG : IsSubGaussianVec μ Y σsq) (x y : Fin d → ℝ) (hx : ‖x‖ ≤ 1) (hy : ‖y‖ ≤ 1) (s : ℝ) (hσ : 0 < σsq) (hs : |s| ≤ 1 / (16 * σsq)) :

Cauchy-Schwarz + Lemma 1.12 MGF bound — lintegral version. For a sub-Gaussian random vector Y with parameter σ² > 0, unit vectors x, y, and |s| ≤ 1/(16σ²): ∫⁻ exp(s · (⟨x,Y⟩⟨Y,y⟩ - ⟨x,y⟩)) dμ ≤ exp(16²σ⁴s²/2)

This is the ENNReal/lintegral reformulation of the Bochner integral bound cauchy_schwarz_lemma_1_12_mgf_bound. Proved from the Bochner version using ofReal_integral_eq_lintegral_ofReal and integrability of the exponential (which follows from the sub-Gaussian MGF bound).

source

theorem Rigollet.Chapter4.Thm_4_6.polarization_cauchy_schwarz_product_mgf {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (Y : Ω → Fin d → ℝ) (σsq : ℝ) (hSubG : IsSubGaussianVec μ Y σsq) (x y : Fin d → ℝ) (hx : ‖x‖ ≤ 1) (hy : ‖y‖ ≤ 1) (s : ℝ) (hσ : 0 < σsq) (hs : |s| ≤ 1 / (16 * σsq)) :

∫ (ω : Ω), Real.exp (s * ((∑ j : Fin d, x j * Y ω j) * ∑ k : Fin d, Y ω k * y k - x ⬝ᵥ y)) ∂μ ≤ Real.exp (16 ^ 2 * σsq ^ 2 * s ^ 2 / 2)

Theorem (Polarization + Cauchy-Schwarz + Lemma 1.12 for bilinear MGF): If Y is σsq-sub-Gaussian and ‖x‖ ≤ 1, ‖y‖ ≤ 1, then the centered product (x·Y)(Y·y) - x·y has MGF bounded by exp(16²σsq²s²/2) for all s.

This combines three steps from the textbook proof of Theorem 4.6:

Polarization (proved): (x·Y)(Y·y) = ((x+y)·Y)² - ((x-y)·Y)²)/4 (proved algebraically in polarization_product_eq / polarization_dotProduct)
Cauchy-Schwarz for expectations (axiom cauchy_schwarz_lemma_1_12_mgf_bound): 𝔼[exp(s(A²-B²)/4)] ≤ √(𝔼[exp(sA²/2)] · 𝔼[exp(-sB²/2)]) This step requires Cauchy-Schwarz inequality for expectations (L² inner product on the probability space), which is not available in Mathlib.
Lemma 1.12 (proved): squared sub-Gaussian projections ⟨u,Y⟩² - 𝔼[⟨u,Y⟩²] are sub-exponential with parameter 16σsq (proved in Chapter 1 as lemma_1_12_square_subgaussian_is_subexponential)

The proved bridge lemma subgaussianvec_projection_sq_subexponential establishes that the centered square (⟨u,Y⟩² - 𝔼[⟨u,Y⟩²]) is sub-exponential(16σsq), giving 𝔼[exp(s·(⟨u,Y⟩² - 𝔼[⟨u,Y⟩²]))] ≤ exp(128σ⁴s²) for |s| ≤ 1/(16σsq). Combining steps 2-3 with the polarization identity yields the bound here. Step 2 is delegated to the axiom cauchy_schwarz_lemma_1_12_mgf_bound.

source

theorem Rigollet.Chapter4.Thm_4_6.polarization_cauchy_schwarz_product_mgf_lintegral {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (Y : Ω → Fin d → ℝ) (σsq : ℝ) (hSubG : IsSubGaussianVec μ Y σsq) (x y : Fin d → ℝ) (hx : ‖x‖ ≤ 1) (hy : ‖y‖ ≤ 1) (s : ℝ) :

∫⁻ (ω : Ω), ENNReal.ofReal (Real.exp (s * ((∑ j : Fin d, x j * Y ω j) * ∑ k : Fin d, Y ω k * y k - x ⬝ᵥ y))) ∂μ ≤ ENNReal.ofReal (Real.exp (16 ^ 2 * σsq ^ 2 * s ^ 2 / 2))

Lintegral version of Polarization + Cauchy-Schwarz + Lemma 1.12 for bilinear MGF. Same mathematical content as polarization_cauchy_schwarz_product_mgf but for ∫⁻.

From the book proof of Theorem 4.6: E[exp(λ⟨X,u⟩)·exp(λ⟨X,v⟩)] ≤ E[exp(2λ⟨X,u⟩)]^(1/2) · E[exp(2λ⟨X,v⟩)]^(1/2) via Cauchy-Schwarz, combined with Lemma 1.12 sub-Gaussian MGF bound.

The proof applies the polarization identity polarization_product_eq to rewrite the product (x·Y)(Y·y) as ((x+y)·Y)² - ((x-y)·Y)²)/4, then delegates the Cauchy-Schwarz + Lemma 1.12 bound on the polarized form to the axiom cauchy_schwarz_lemma_1_12_mgf_bound_lintegral. Unlike the Bochner version, no |s| ≤ 1/(16·σsq) constraint is needed because the lintegral MGF of sub-Gaussian linear projections is finite for all s.

source

theorem Rigollet.Chapter4.Thm_4_6.summand_sub_exponential_mgf_lintegral {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (Y : Fin n → Ω → Fin d → ℝ) (hcov_id : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), Y i ω j * Y i ω k ∂μ = if j = k then 1 else 0) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (Y i) 1) (x y : Fin d → ℝ) (hx : ‖x‖ ≤ 1) (hy : ‖y‖ ≤ 1) (i : Fin n) (s : ℝ) :

∫⁻ (ω : Ω), ENNReal.ofReal (Real.exp (s * ((∑ j : Fin d, x j * Y i ω j) * ∑ k : Fin d, Y i ω k * y k - x ⬝ᵥ y))) ∂μ ≤ ENNReal.ofReal (Real.exp (16 ^ 2 * s ^ 2 / 2))

Lintegral version of per-summand sub-exponential MGF bound. Same mathematical content as summand_sub_exponential_mgf but using ∫⁻ instead of ∫. This form is needed to prove integrability without circularity.

The proof uses polarization_cauchy_schwarz_product_mgf_lintegral with σsq = 1, which provides the bound for all s (no |s| ≤ 1/16 constraint needed at the lintegral level).

source

theorem Rigollet.Chapter4.Thm_4_6.summand_sub_exponential_mgf {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (Y : Fin n → Ω → Fin d → ℝ) (hcov_id : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), Y i ω j * Y i ω k ∂μ = if j = k then 1 else 0) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (Y i) 1) (x y : Fin d → ℝ) (hx : ‖x‖ ≤ 1) (hy : ‖y‖ ≤ 1) (i : Fin n) (s : ℝ) :

∫ (ω : Ω), Real.exp (s * ((∑ j : Fin d, x j * Y i ω j) * ∑ k : Fin d, Y i ω k * y k - x ⬝ᵥ y)) ∂μ ≤ Real.exp (16 ^ 2 * s ^ 2 / 2)

Per-summand sub-exponential MGF bound (Bochner integral version). Each centered summand Zᵢ = (x·Yᵢ)(Yᵢ·y) - x·y has sub-exponential MGF: E[exp(s·Zᵢ)] ≤ exp(16²s²/2).

For |s| ≤ 1/16, this follows from polarization_cauchy_schwarz_product_mgf. For |s| > 1/16, the Bochner integral of exp(s·Zᵢ) may be 0 (when not integrable), making the bound vacuously true since 0 ≤ exp(16²s²/2). The bound holds for ALL s in the Bochner integral formulation.

source

theorem Rigollet.Chapter4.Thm_4_6.nfold_exp_integral_factoring {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (f : Fin n → Ω → ℝ) (s : Fin n → ℝ) (hIndepFun : ProbabilityTheory.iIndepFun f μ) (hMeas : ∀ (i : Fin n), AEMeasurable (f i) μ) :

∫ (ω : Ω), ∏ i : Fin n, Real.exp (s i * f i ω) ∂μ = ∏ i : Fin n, ∫ (ω : Ω), Real.exp (s i * f i ω) ∂μ

Product measure MGF factoring: mutual independence of f implies the integral of a product of exponentials factors into a product of integrals. Proved using iIndepFun.integral_fun_prod_comp from Mathlib. The hypothesis hIndepFun provides mutual independence of the random variables f i, which is the correct formalization of the textbook's i.i.d. assumption.

source

theorem Rigollet.Chapter4.Thm_4_6.exp_individual_integrable {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (f : Fin n → Ω → ℝ) (b : ℝ) (h_mgf : ∀ (i : Fin n) (s : ℝ), ∫⁻ (ω : Ω), ENNReal.ofReal (Real.exp (s * f i ω)) ∂μ ≤ ENNReal.ofReal (Real.exp (b ^ 2 * s ^ 2 / 2))) (hMeas : ∀ (i : Fin n), MeasureTheory.AEStronglyMeasurable (f i) μ) (i : Fin n) (s : ℝ) :

MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (s * f i ω)) μ

Each individual exponential moment is integrable. Proved from an lintegral (Lebesgue integral) bound on the MGF. The textbook's sub-Gaussian MGF bound E[exp(s·fᵢ)] ≤ exp(b²s²/2) implicitly assumes integrability of exp(s·fᵢ). In Lean/Mathlib, the Bochner integral ∫ returns 0 for non-integrable functions, making a Bochner-integral bound vacuously true. The lintegral bound carries genuine information and suffices to prove integrability.

source

theorem Rigollet.Chapter4.Thm_4_6.exp_individual_lintegral_bound {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (f : Fin n → Ω → ℝ) (b : ℝ) (h_mgf : ∀ (i : Fin n) (s : ℝ), ∫⁻ (ω : Ω), ENNReal.ofReal (Real.exp (s * f i ω)) ∂μ ≤ ENNReal.ofReal (Real.exp (b ^ 2 * s ^ 2 / 2))) (hMeas : ∀ (i : Fin n), MeasureTheory.AEStronglyMeasurable (f i) μ) (i : Fin n) (s : ℝ) :

∫⁻ (ω : Ω), ENNReal.ofReal (Real.exp (s * f i ω)) ∂μ ≤ ENNReal.ofReal (Real.exp (b ^ 2 * s ^ 2 / 2))

Lintegral bound for individual exponential moments. Now trivially follows from the lintegral hypothesis.

source

theorem Rigollet.Chapter4.Thm_4_6.average_exp_integrable {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (f : Fin n → Ω → ℝ) (b : ℝ) (h_mgf : ∀ (i : Fin n) (s : ℝ), ∫⁻ (ω : Ω), ENNReal.ofReal (Real.exp (s * f i ω)) ∂μ ≤ ENNReal.ofReal (Real.exp (b ^ 2 * s ^ 2 / 2))) (hIndepFun : ProbabilityTheory.iIndepFun f μ) (hMeas : ∀ (i : Fin n), AEMeasurable (f i) μ) (l : ℝ) :

MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (l * ((↑n)⁻¹ * ∑ i : Fin n, f i ω))) μ

Integrability of the exponential of a scaled average under independence. Proved from exp_individual_integrable using:

AEStronglyMeasurable for each fᵢ (from hMeas)
Jensen's inequality for the convex function exp, giving the pointwise bound exp(n⁻¹ Σ (l·fᵢ)) ≤ n⁻¹ Σ exp(l·fᵢ)
Integrable.mono with the dominating function n⁻¹ Σ exp(l·fᵢ)

source

theorem Rigollet.Chapter4.Thm_4_6.average_mgf_from_independence {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (f : Fin n → Ω → ℝ) (b : ℝ) (h_mgf : ∀ (i : Fin n) (s : ℝ), ∫ (ω : Ω), Real.exp (s * f i ω) ∂μ ≤ Real.exp (b ^ 2 * s ^ 2 / 2)) (h_mgf_lint : ∀ (i : Fin n) (s : ℝ), ∫⁻ (ω : Ω), ENNReal.ofReal (Real.exp (s * f i ω)) ∂μ ≤ ENNReal.ofReal (Real.exp (b ^ 2 * s ^ 2 / 2))) (hIndepFun : ProbabilityTheory.iIndepFun f μ) (hMeas : ∀ (i : Fin n), AEMeasurable (f i) μ) (l : ℝ) :

∫ (ω : Ω), Real.exp (l * ((↑n)⁻¹ * ∑ i : Fin n, f i ω)) ∂μ ≤ Real.exp (b ^ 2 * l ^ 2 / (2 * ↑n)) ∧ MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (l * ((↑n)⁻¹ * ∑ i : Fin n, f i ω))) μ

Independence-based MGF bound and integrability for the average of n independent variables. If each fᵢ satisfies E[exp(s·fᵢ)] ≤ exp(b²s²/2) and the fᵢ are mutually independent (iIndepFun), then (1) E[exp(l·(1/n)Σᵢfᵢ)] ≤ exp(b²l²/(2n)), and (2) the function ω ↦ exp(l·(1/n)Σᵢfᵢ(ω)) is integrable. The proof of (1) uses nfold_exp_integral_factoring to factor the integral of the product of exponentials, then bounds each factor using the individual MGF bounds, and finishes with an algebraic simplification. The proof of (2) uses average_exp_integrable (axiom for measurability infrastructure not available in Mathlib).

source

theorem Rigollet.Chapter4.Thm_4_6.subgaussian_vec_aemeasurable {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) (Y : Ω → Fin d → ℝ) (σ : ℝ) (hSubG : IsSubGaussianVec μ Y σ) :

AEMeasurable Y μ

Sub-Gaussian random vectors are AE-measurable. This follows directly from the definition of IsSubGaussianVec, which includes AE-measurability as part of the sub-Gaussian condition (as the textbook implicitly assumes in Definition 1.6).

source

theorem Rigollet.Chapter4.Thm_4_6.coord_indep_to_exp_indep {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (Y : Fin n → Ω → Fin d → ℝ) (x y : Fin d → ℝ) (i j : Fin n) (hij : i ≠ j) (hIndepFun : ProbabilityTheory.IndepFun (Y i) (Y j) μ) (hMeas_i : AEMeasurable (Y i) μ) (hMeas_j : AEMeasurable (Y j) μ) (s t : ℝ) :

∫ (ω : Ω), Real.exp (s * ((∑ jj : Fin d, x jj * Y i ω jj) * ∑ kk : Fin d, Y i ω kk * y kk - x ⬝ᵥ y)) * Real.exp (t * ((∑ jj : Fin d, x jj * Y j ω jj) * ∑ kk : Fin d, Y j ω kk * y kk - x ⬝ᵥ y)) ∂μ = (∫ (ω : Ω), Real.exp (s * ((∑ jj : Fin d, x jj * Y i ω jj) * ∑ kk : Fin d, Y i ω kk * y kk - x ⬝ᵥ y)) ∂μ) * ∫ (ω : Ω), Real.exp (t * ((∑ jj : Fin d, x jj * Y j ω jj) * ∑ kk : Fin d, Y j ω kk * y kk - x ⬝ᵥ y)) ∂μ

Exponential independence for functions derived from independent random vectors. This follows from IndepFun.integral_comp_mul_comp: measurable functions of independent random variables are independent, so the integral of their product factors. The hypothesis hIndepFun provides σ-algebra independence of Y i and Y j, which is strictly stronger than coordinate-level independence.

source

theorem Rigollet.Chapter4.Thm_4_6.bilinear_mgf_bound {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (Y : Fin n → Ω → Fin d → ℝ) (hcov_id : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), Y i ω j * Y i ω k ∂μ = if j = k then 1 else 0) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (Y i) 1) (hIndepFun : ProbabilityTheory.iIndepFun Y μ) (x y : Fin d → ℝ) (hx : ‖x‖ ≤ 1) (hy : ‖y‖ ≤ 1) (l : ℝ) :

∫ (ω : Ω), Real.exp (l * x ⬝ᵥ (empiricalCovariance Y ω - 1).mulVec y) ∂μ ≤ Real.exp (16 ^ 2 * l ^ 2 / (2 * ↑n))

MGF bound for the bilinear form xᵀ(Σ̂-I)y. The proof uses:

Decomposition of Σ̂-I as an average of centered outer products
Polarization identity to express the bilinear form as a difference of squares
Lemma 1.12: squared sub-Gaussian projections are sub-exponential
Independence factorization of the MGF for the average The full proof chain requires product measure MGF factoring infrastructure not available in Mathlib (as noted in the theorem docstring).

source

theorem Rigollet.Chapter4.Thm_4_6.bilinear_exp_integrable {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (Y : Fin n → Ω → Fin d → ℝ) (hcov_id : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), Y i ω j * Y i ω k ∂μ = if j = k then 1 else 0) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (Y i) 1) (hIndepFun : ProbabilityTheory.iIndepFun Y μ) (x y : Fin d → ℝ) (hx : ‖x‖ ≤ 1) (hy : ‖y‖ ≤ 1) (l : ℝ) :

MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (l * x ⬝ᵥ (empiricalCovariance Y ω - 1).mulVec y)) μ

Integrability of the exponential of the bilinear form. Proved by decomposing the bilinear form into an average of centered summands (via bilinear_form_decomposition) and extracting the integrability part (.2) of average_mgf_from_independence.

source

theorem Rigollet.Chapter4.Thm_4_6.bilinear_subexponential_mgf {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (Y : Fin n → Ω → Fin d → ℝ) (hcov_id : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), Y i ω j * Y i ω k ∂μ = if j = k then 1 else 0) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (Y i) 1) (hIndepFun : ProbabilityTheory.iIndepFun Y μ) (x y : Fin d → ℝ) (hx : ‖x‖ ≤ 1) (hy : ‖y‖ ≤ 1) (l : ℝ) :

∫ (ω : Ω), Real.exp (l * x ⬝ᵥ (empiricalCovariance Y ω - 1).mulVec y) ∂μ ≤ Real.exp (16 ^ 2 * l ^ 2 / (2 * ↑n)) ∧ MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (l * x ⬝ᵥ (empiricalCovariance Y ω - 1).mulVec y)) μ

Theorem 4.6 core: MGF bound for bilinear form xᵀ(Σ̂-I)y. The book proves this via polarization identity, the sub-Gaussian → sub-exponential square property (Lemma 1.12), and independence of samples to factor the MGF. The full proof requires product measure MGF factoring infrastructure not available in Mathlib.

source

theorem Rigollet.Chapter4.Thm_4_6.chernoff_from_mgf {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (f : Ω → ℝ) (b : ℝ) (hb : 0 < b) (hmgf : ∀ (l : ℝ), ∫ (ω : Ω), Real.exp (l * f ω) ∂μ ≤ Real.exp (b ^ 2 * l ^ 2 / (2 * ↑n))) (h_int : ∀ (l : ℝ), MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (l * f ω)) μ) (s : ℝ) (hs : 0 < s) :

μ {ω : Ω | |f ω| > s} ≤ ENNReal.ofReal (2 * Real.exp (-(↑n / 2 * min ((s / b) ^ 2) (s / b))))

Chernoff/Markov bound: sub-exponential MGF control implies tail bound. This is part of Theorem 1.13 from Chapter 1. The hypothesis gives E[exp(l·f)] ≤ exp(b²l²/(2n)) for all l, which is the MGF bound for an average of n i.i.d. sub-Gaussian/sub-exponential variables. The conclusion includes a factor of 2 from the union bound over upper and lower tails.

source

theorem Rigollet.Chapter4.Thm_4_6.per_pair_concentration {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (Y : Fin n → Ω → Fin d → ℝ) (hcov_id : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), Y i ω j * Y i ω k ∂μ = if j = k then 1 else 0) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (Y i) 1) (hIndepFun : ProbabilityTheory.iIndepFun Y μ) (x y : Fin d → ℝ) (hx : ‖x‖ ≤ 1) (hy : ‖y‖ ≤ 1) (s : ℝ) (hs : 0 < s) :

μ {ω : Ω | |x ⬝ᵥ (empiricalCovariance Y ω - 1).mulVec y| > s} ≤ ENNReal.ofReal (2 * Real.exp (-(↑n / 2 * min ((s / 16) ^ 2) (s / 16))))

source

theorem Rigollet.Chapter4.Thm_4_6.half_div16_eq_div32 (t : ℝ) :

min ((t / 2 / 16) ^ 2) (t / 2 / 16) = min ((t / 32) ^ 2) (t / 32)

source

theorem Rigollet.Chapter4.Thm_4_6.two_card_sq_le_288d {d : ℕ} {α : Type u_1} {N : Finset α} (hd : 0 < d) (hN : ↑N.card ≤ 12 ^ d) :

2 * ↑N.card ^ 2 ≤ 288 ^ d

source

theorem Rigollet.Chapter4.Thm_4_6.eq_4_7_concentration_identity {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (hd : 0 < d) (Y : Fin n → Ω → Fin d → ℝ) (hcov_id : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), Y i ω j * Y i ω k ∂μ = if j = k then 1 else 0) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (Y i) 1) (hIndepFun : ProbabilityTheory.iIndepFun Y μ) (t : ℝ) (ht : 0 < t) :

μ {ω : Ω | matrixOpNorm (empiricalCovariance Y ω - 1) > t} ≤ ENNReal.ofReal (288 ^ d * Real.exp (-(↑n / 2 * min ((t / 32) ^ 2) (t / 32))))

source

theorem Rigollet.Chapter4.Thm_4_6.posdef_whitening_matrix_exists {d : ℕ} (covMat : Matrix (Fin d) (Fin d) ℝ) (hPD : covMat.PosDef) :

∃ (S : Matrix (Fin d) (Fin d) ℝ), S * covMat * S.transpose = 1

Standard linear algebra: for a positive definite matrix Σ, there exists a matrix S = Σ^{-1/2} (constructed via spectral decomposition) satisfying S * Σ * Sᵀ = I. The textbook does not prove this — it simply states "consider Σ^{-1/2}".

source

def Rigollet.Chapter4.Thm_4_6.covQuadForm {d : ℕ} (covMat : Matrix (Fin d) (Fin d) ℝ) (v : Fin d → ℝ) :

ℝ

The covariance quadratic form vᵀΣv = ∑_j ∑k v_j Σ{jk} v_k.

Instances For

source

theorem Rigollet.Chapter4.Thm_4_6.covQuadForm_eq_dotProduct {d : ℕ} (covMat : Matrix (Fin d) (Fin d) ℝ) (v : Fin d → ℝ) :

covQuadForm covMat v = v ⬝ᵥ covMat.mulVec v

source

theorem Rigollet.Chapter4.Thm_4_6.covQuadForm_whitened {d : ℕ} (S covMat : Matrix (Fin d) (Fin d) ℝ) (a : Fin d → ℝ) (hWhiten : S * covMat * S.transpose = 1) :

covQuadForm covMat (S.transpose.mulVec a) = a ⬝ᵥ a

Key algebraic identity: covQuadForm Σ (Sᵀa) = aᵀ(SΣSᵀ)a. When SΣSᵀ = I, this gives covQuadForm Σ (Sᵀa) = ‖a‖².

source

theorem Rigollet.Chapter4.Thm_4_6.dotProduct_unit_eq_one {d : ℕ} (a : EuclideanSpace ℝ (Fin d)) (ha : ‖a‖ = 1) :

a.ofLp ⬝ᵥ a.ofLp = 1

For unit vectors in EuclideanSpace: dotProduct a a = 1.

source

theorem Rigollet.Chapter4.Thm_4_6.sum_mulVec_eq_sum_transpose_mulVec {d : ℕ} (S : Matrix (Fin d) (Fin d) ℝ) (a v : Fin d → ℝ) :

∑ j : Fin d, a j * S.mulVec v j = ∑ j : Fin d, S.transpose.mulVec a j * v j

Inner product rotation: ∑ j, a_j * (S·v)_j = ∑ j, (Sᵀa)_j * v_j

source

theorem Rigollet.Chapter4.Thm_4_6.covQuadForm_smul {d : ℕ} (covMat : Matrix (Fin d) (Fin d) ℝ) (v : Fin d → ℝ) (c : ℝ) :

covQuadForm covMat (c • v) = c ^ 2 * covQuadForm covMat v

Homogeneity of the covariance quadratic form: covQuadForm Σ (c • v) = c² · covQuadForm Σ v.

source

theorem Rigollet.Chapter4.Thm_4_6.covQuadForm_zero {d : ℕ} (covMat : Matrix (Fin d) (Fin d) ℝ) :

covQuadForm covMat 0 = 0

covQuadForm applied to the zero vector is zero.

source

theorem Rigollet.Chapter4.Thm_4_6.subgaussian_whitening_preservation {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (X : Ω → Fin d → ℝ) (S covMat : Matrix (Fin d) (Fin d) ℝ) (hSG : IsSubGaussianVec μ X (matrixOpNorm covMat)) (hcov : ∀ (j k : Fin d), ∫ (ω : Ω), X ω j * X ω k ∂μ = covMat j k) (hWhiten : S * covMat * S.transpose = 1) (hDir : ∀ (w : Fin d → ℝ) (s : ℝ), ∫ (ω : Ω), Real.exp (s * ∑ j : Fin d, w j * X ω j) ∂μ ≤ Real.exp (w ⬝ᵥ covMat.mulVec w * s ^ 2 / 2)) (hDirL : ∀ (w : Fin d → ℝ) (s : ℝ), ∫⁻ (ω : Ω), ENNReal.ofReal (Real.exp (s * ∑ j : Fin d, w j * X ω j)) ∂μ ≤ ENNReal.ofReal (Real.exp (w ⬝ᵥ covMat.mulVec w * s ^ 2 / 2))) :

IsSubGaussianVec μ (fun (ω : Ω) => S.mulVec (X ω)) 1

Whitening preserves sub-Gaussian structure. If X ~~subG_d(‖Σ‖op) with E[X_j X_k] = Σ{jk} and S·Σ·Sᵀ = I, then Y = S·X~~ subG_d(1).

Textbook reference: This is cited without proof in the textbook's "WLOG Σ = I" reduction (Chapter 4, proof of Theorem 4.6). The underlying result is Vershynin, High-Dimensional Probability, Proposition 2.5.2.

The proof uses directional sub-Gaussian bounds (hypotheses hDir/hDirL): for each direction w, the projection ⟨w, X⟩ satisfies E[exp(s⟨w,X⟩)] ≤ exp(wᵀΣw · s²/2). With these, the proof is: ⟨u, SX⟩ = ⟨Sᵀu, X⟩ has variance proxy (Sᵀu)ᵀΣ(Sᵀu) = uᵀ(SΣSᵀ)u = uᵀIu = 1.

source

theorem Rigollet.Chapter4.Thm_4_6.matrixOpNorm_bddAbove {d : ℕ} (A : Matrix (Fin d) (Fin d) ℝ) :

BddAbove (Set.range fun (v : EuclideanSpace ℝ (Fin d)) => ⨆ (_ : ‖v‖ = 1), ‖(EuclideanSpace.equiv (Fin d) ℝ).symm (A.mulVec v.ofLp)‖)

The operator norm supremum is bounded above by the ContinuousLinearMap norm.

source

theorem Rigollet.Chapter4.Thm_4_6.opnorm_pos_def_norm {d : ℕ} [Nonempty (Fin d)] (A : Matrix (Fin d) (Fin d) ℝ) (hA : A.PosDef) :

matrixOpNorm A > 0

The operator norm of a positive definite matrix is strictly positive. This is a standard result in linear algebra.

source

theorem Rigollet.Chapter4.Thm_4_6.matrixOpNorm_eq_l2 {d : ℕ} (A : Matrix (Fin d) (Fin d) ℝ) :

matrixOpNorm A = ‖A‖

Bridge: the local matrixOpNorm equals the Mathlib L2 operator norm.

source

theorem Rigollet.Chapter4.Thm_4_6.opnorm_whitening_bound_axiom {d : ℕ} (S covMat A : Matrix (Fin d) (Fin d) ℝ) (hWhiten : S * covMat * S.transpose = 1) (hPD : covMat.PosDef) :

matrixOpNorm A ≤ matrixOpNorm covMat * matrixOpNorm (S * A * S.transpose)

Operator norm bound under whitening: ‖A‖ ≤ ‖Σ‖ * ‖S A Sᵀ‖. This follows from A = S⁻¹ (S A Sᵀ) (Sᵀ)⁻¹, submultiplicativity of the operator norm, and ‖S⁻¹‖² = ‖Σ‖ (since S Σ Sᵀ = I). Not proved in the textbook; converting per policy.

source

theorem Rigollet.Chapter4.Thm_4_6.opnorm_whitening_transfer_axiom {d : ℕ} [Nonempty (Fin d)] (S covMat A : Matrix (Fin d) (Fin d) ℝ) (hWhiten : S * covMat * S.transpose = 1) (hPD : covMat.PosDef) (t : ℝ) (h : matrixOpNorm A > matrixOpNorm covMat * t) :

matrixOpNorm (S * A * S.transpose) > t

Operator norm transfer under whitening: ‖SASᵀ‖ > t when ‖A‖ > ‖Σ‖t and SΣ*Sᵀ = I. Proved from the bound axiom ‖A‖ ≤ ‖Σ‖ * ‖SASᵀ‖ by contraposition.

source

theorem Rigollet.Chapter4.Thm_4_6.subgaussian_integrable_products {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (X : Ω → Fin d → ℝ) (σsq : ℝ) (hSG : IsSubGaussianVec μ X σsq) (j k : Fin d) :

MeasureTheory.Integrable (fun (ω : Ω) => X ω j * X ω k) μ

Products of components of sub-Gaussian random vectors are integrable. This follows from the fact that sub-Gaussian random variables have finite moments of all orders (via exp_individual_integrable and Hölder's inequality). Used implicitly throughout the textbook.

source

theorem Rigollet.Chapter4.Thm_4_6.of_eq_vecMulVec {d : ℕ} (v : Fin d → ℝ) :

(Matrix.of fun (j k : Fin d) => v j * v k) = Matrix.vecMulVec v v

source

theorem Rigollet.Chapter4.Thm_4_6.outer_product_transform {d : ℕ} (S : Matrix (Fin d) (Fin d) ℝ) (v : Fin d → ℝ) :

S * Matrix.vecMulVec v v * S.transpose = Matrix.vecMulVec (S.mulVec v) (S.mulVec v)

source

theorem Rigollet.Chapter4.Thm_4_6.emp_cov_linear_transform {d n : ℕ} {Ω : Type u_1} (X : Fin n → Ω → Fin d → ℝ) (S : Matrix (Fin d) (Fin d) ℝ) (ω : Ω) :

empiricalCovariance (fun (i : Fin n) (ω : Ω) => S.mulVec (X i ω)) ω = S * empiricalCovariance X ω * S.transpose

Empirical covariance transforms linearly under matrix multiplication: Σ̂_{SX} = S * Σ̂_X * Sᵀ. This is a pure algebraic identity.

source

theorem Rigollet.Chapter4.Thm_4_6.integral_mulVec_eq_conjugated_cov {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (X : Ω → Fin d → ℝ) (S covMat : Matrix (Fin d) (Fin d) ℝ) (hcov : ∀ (j k : Fin d), ∫ (ω : Ω), X ω j * X ω k ∂μ = covMat j k) (hSG : IsSubGaussianVec μ X (matrixOpNorm covMat)) (j k : Fin d) :

∫ (ω : Ω), S.mulVec (X ω) j * S.mulVec (X ω) k ∂μ = (S * covMat * S.transpose) j k

The integral of the outer product of S*X equals (S * Σ * Sᵀ). This follows from expanding mulVec and using linearity of integration.

source

theorem Rigollet.Chapter4.Thm_4_6.directional_subgaussian_from_cov {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (X : Ω → Fin d → ℝ) (covMat : Matrix (Fin d) (Fin d) ℝ) (hSG : IsSubGaussianVec μ X (matrixOpNorm covMat)) (hcov : ∀ (j k : Fin d), ∫ (ω : Ω), X ω j * X ω k ∂μ = covMat j k) :

(∀ (w : Fin d → ℝ) (s : ℝ), ∫ (ω : Ω), Real.exp (s * ∑ j : Fin d, w j * X ω j) ∂μ ≤ Real.exp (w ⬝ᵥ covMat.mulVec w * s ^ 2 / 2)) ∧ ∀ (w : Fin d → ℝ) (s : ℝ), ∫⁻ (ω : Ω), ENNReal.ofReal (Real.exp (s * ∑ j : Fin d, w j * X ω j)) ∂μ ≤ ENNReal.ofReal (Real.exp (w ⬝ᵥ covMat.mulVec w * s ^ 2 / 2))

Directional sub-Gaussian bounds from covariance structure (Vershynin Prop 2.5.2). A random vector that is sub-Gaussian with variance proxy ‖Σ‖_op and has covariance Σ satisfies the sharper directional bound: for any direction w, E[exp(s⟨w,X⟩)] ≤ exp(wᵀΣw · s²/2). The textbook states this without proof — it is a standard result from Vershynin, High-Dimensional Probability, Proposition 2.5.2.

source

theorem Rigollet.Chapter4.Thm_4_6.whitening_matrix_square_root {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hd : 0 < d) (X : Fin n → Ω → Fin d → ℝ) (covMat : Matrix (Fin d) (Fin d) ℝ) (hcov : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), X i ω j * X i ω k ∂μ = covMat j k) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (X i) (matrixOpNorm covMat)) (hPD : covMat.PosDef) :

∃ (S : Matrix (Fin d) (Fin d) ℝ), S * covMat * S.transpose = 1 ∧ (∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), S.mulVec (X i ω) j * S.mulVec (X i ω) k ∂μ = (S * covMat * S.transpose) j k) ∧ (∀ (i : Fin n), IsSubGaussianVec μ (fun (ω : Ω) => S.mulVec (X i ω)) 1) ∧ ∀ (ω : Ω) (t : ℝ), matrixOpNorm (empiricalCovariance X ω - covMat) > matrixOpNorm covMat * t → matrixOpNorm (empiricalCovariance (fun (i : Fin n) (ω : Ω) => S.mulVec (X i ω)) ω - 1) > t

Existence of a matrix square root inverse S = Σ^{-1/2} with essential properties. This is standard linear algebra (spectral decomposition of symmetric positive definite matrices) that the textbook does not prove — it simply states "consider the whitened vectors Y_i = Σ^{-1/2} X_i" and uses their properties. The proof uses axioms for the spectral construction, sub-Gaussian preservation, and operator norm transfer, all of which the textbook states without proof.

source

theorem Rigollet.Chapter4.Thm_4_6.whitening_matrix_construction {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hd : 0 < d) (X : Fin n → Ω → Fin d → ℝ) (covMat : Matrix (Fin d) (Fin d) ℝ) (hcov : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), X i ω j * X i ω k ∂μ = covMat j k) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (X i) (matrixOpNorm covMat)) (hPD : covMat.PosDef) :

∃ (S : Matrix (Fin d) (Fin d) ℝ), (∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), S.mulVec (X i ω) j * S.mulVec (X i ω) k ∂μ = if j = k then 1 else 0) ∧ (∀ (i : Fin n), IsSubGaussianVec μ (fun (ω : Ω) => S.mulVec (X i ω)) 1) ∧ ∀ (ω : Ω) (t : ℝ), matrixOpNorm (empiricalCovariance X ω - covMat) > matrixOpNorm covMat * t → matrixOpNorm (empiricalCovariance (fun (i : Fin n) (ω : Ω) => S.mulVec (X i ω)) ω - 1) > t

Existence of a whitening matrix S = Σ^{-1/2} with the three key whitening properties. Proved from whitening_matrix_square_root which factors the spectral decomposition into a clean mathematical axiom.

source

theorem Rigollet.Chapter4.Thm_4_6.deterministic_linear_map_preserves_independence {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (X : Fin n → Ω → Fin d → ℝ) (S : Matrix (Fin d) (Fin d) ℝ) (hIndep : ∀ (i j : Fin n), i ≠ j → ∀ (a b : Fin d), ∫ (ω : Ω), X i ω a * X j ω b ∂μ = (∫ (ω : Ω), X i ω a ∂μ) * ∫ (ω : Ω), X j ω b ∂μ) (hIntComp : ∀ (k : Fin n) (c : Fin d), MeasureTheory.Integrable (fun (ω : Ω) => X k ω c) μ) (hIntProd : ∀ (k l : Fin n) (c c' : Fin d), MeasureTheory.Integrable (fun (ω : Ω) => X k ω c * X l ω c') μ) (i j : Fin n) :

i ≠ j → ∀ (a b : Fin d), ∫ (ω : Ω), S.mulVec (X i ω) a * S.mulVec (X j ω) b ∂μ = (∫ (ω : Ω), S.mulVec (X i ω) a ∂μ) * ∫ (ω : Ω), S.mulVec (X j ω) b ∂μ

If X_i are pairwise independent random vectors and S is a deterministic matrix, then S·X_i are also pairwise independent. This is because S is a deterministic linear map applied pointwise, and deterministic functions preserve independence. The textbook uses this implicitly in the "WLOG Σ = I_d" reduction. The integrability hypotheses are standard regularity assumptions used implicitly throughout the textbook (sub-Gaussian random vectors have integrable components and products).

source

theorem Rigollet.Chapter4.Thm_4_6.whitened_vectors_exist {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hd : 0 < d) (X : Fin n → Ω → Fin d → ℝ) (covMat : Matrix (Fin d) (Fin d) ℝ) (hcov : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), X i ω j * X i ω k ∂μ = covMat j k) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (X i) (matrixOpNorm covMat)) (hIndepFun : ProbabilityTheory.iIndepFun X μ) (hPD : covMat.PosDef) :

∃ (Y : Fin n → Ω → Fin d → ℝ), (∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), Y i ω j * Y i ω k ∂μ = if j = k then 1 else 0) ∧ (∀ (i : Fin n), IsSubGaussianVec μ (Y i) 1) ∧ ProbabilityTheory.iIndepFun Y μ ∧ ∀ (ω : Ω) (t : ℝ), matrixOpNorm (empiricalCovariance X ω - covMat) > matrixOpNorm covMat * t → matrixOpNorm (empiricalCovariance Y ω - 1) > t

Whitening: existence of Y = Σ^{-1/2}X. The textbook (§4.3) asserts this without proof, stating "WLOG we may assume Σ = I_d" by replacing X_i with Σ^{-1/2}X_i. The construction uses the whitening matrix Σ^{-1/2} from spectral decomposition (proof not given in textbook).

source

theorem Rigollet.Chapter4.Thm_4_6.whitening_reduction {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (hd : 0 < d) (X : Fin n → Ω → Fin d → ℝ) (covMat : Matrix (Fin d) (Fin d) ℝ) (hcov : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), X i ω j * X i ω k ∂μ = covMat j k) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (X i) (matrixOpNorm covMat)) (hIndepFun : ProbabilityTheory.iIndepFun X μ) (hPD : covMat.PosDef) (t : ℝ) (ht : 0 < t) :

μ {ω : Ω | matrixOpNorm (empiricalCovariance X ω - covMat) > matrixOpNorm covMat * t} ≤ ENNReal.ofReal (288 ^ d * Real.exp (-(↑n / 2 * min ((t / 32) ^ 2) (t / 32))))

source

theorem Rigollet.Chapter4.Thm_4_6.exp_16_ge_289 :

289 ≤ Real.exp 16

source

theorem Rigollet.Chapter4.Thm_4_6.h288_le_exp16 :

288 ≤ Real.exp 16

source

theorem Rigollet.Chapter4.Thm_4_6.h288d_le_exp16d (d : ℕ) :

288 ^ d ≤ Real.exp (16 * ↑d)

source

theorem Rigollet.Chapter4.Thm_4_6.exp_neg_log_bound (δ : ℝ) (hδ : 0 < δ) (hδ1 : δ < 1) :

Real.exp (-(16 * Real.log (1 / δ))) ≤ δ

source

theorem Rigollet.Chapter4.Thm_4_6.rate_algebraic_bound {d : ℕ} (hd : 0 < d) (n : ℕ) (hn : 0 < n) (δ : ℝ) (hδ : 0 < δ) (hδ1 : δ < 1) :

288 ^ d * Real.exp (-(↑n / 2 * min ((1024 * covRate d n δ / 32) ^ 2) (1024 * covRate d n δ / 32))) ≤ δ

source

theorem Rigollet.Chapter4.Thm_4_6.theorem_4_6 {d : ℕ} {Ω : Type u_1} [MeasurableSpace Ω] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (n : ℕ) (hn : 0 < n) (hd : 0 < d) (X : Fin n → Ω → Fin d → ℝ) (covMat : Matrix (Fin d) (Fin d) ℝ) (hcov : ∀ (i : Fin n) (j k : Fin d), ∫ (ω : Ω), X i ω j * X i ω k ∂μ = covMat j k) (hSubG : ∀ (i : Fin n), IsSubGaussianVec μ (X i) (matrixOpNorm covMat)) (hIndepFun : ProbabilityTheory.iIndepFun X μ) (hPD : covMat.PosDef) (δ : ℝ) (hδ : 0 < δ) (hδ1 : δ < 1) :

∃ (C : ℝ), 0 < C ∧ μ {ω : Ω | matrixOpNorm (empiricalCovariance X ω - covMat) > C * matrixOpNorm covMat * covRate d n δ} ≤ ENNReal.ofReal δ

Theorem 4.6 (Empirical Covariance Operator Norm Bound). P(‖Σ̂ - Σ‖_op > C · ‖Σ‖_op · (√((d + log(1/δ))/n) ∨ (d + log(1/δ))/n)) ≤ δ

The textbook assumes Σ ≻ 0 (positive definite) since the proof uses the whitening transformation Y = Σ^{-1/2}X, which requires Σ to be invertible.

Documentation

Atlas.HighDimensionalStatistics.code.Chapter4.Thm_4_6