Atlas.HighDimensionalStatistics.code.Chapter4.Lemma_4

Auxiliary: Sub-Gaussian Chernoff bound #

theorem subGaussian_chernoff {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {X : Ω → ℝ} {σsq : ℝ} (hσ : 0 < σsq) (hsg : IsSubGaussian X σsq μ) (t : ℝ) (ht : 0 < t) :

μ {ω : Ω | X ω > t} ≤ ENNReal.ofReal (Real.exp (-(t ^ 2 / (2 * σsq))))

Sub-Gaussian Chernoff bound. If X ~ subG(σ²), then for t > 0: P(X > t) ≤ exp(-t²/(2σ²)).

Proof: Markov on exp(sX) with s = t/σ², then optimize. Uses mul_meas_ge_le_lintegral₀ (Markov) and ofReal_integral_eq_lintegral_ofReal (integral conversion).

SORRY REASON: The measure-theoretic Chernoff argument requires converting between ∫⁻ (ENNReal) and ∫ (Bochner), applying Markov's inequality to exp(s·X), and optimizing over s. Each step is standard but requires ~80 lines of Lean plumbing (AEMeasurable of ofReal ∘ exp, set containment for strict/non-strict inequalities, ENNReal division). All mathematical ingredients exist in Mathlib: • mul_meas_ge_le_lintegral₀ (Markov) • ofReal_integral_eq_lintegral_ofReal (integral conversion) • IsSubGaussian.mgf_bound (MGF bound) but the full assembly is not yet available as a single result.

source

theorem ceil_twelve_pow (n : ℕ) :

⌈(3 / (1 / 4)) ^ n⌉₊ = 12 ^ n

Helper: 1/4-net covering number ceiling simplification. Nat.ceil((3/(1/4))^n) = 12^n since 12^n is already a natural number.

source

theorem lemma_4_2_eps_net_reduction {d T : ℕ} (A : Matrix (Fin d) (Fin T) ℝ) {N₁ : Finset (EuclideanSpace ℝ (Fin d))} {N₂ : Finset (EuclideanSpace ℝ (Fin T))} (hN₁_net : ∀ (u : EuclideanSpace ℝ (Fin d)), ‖u‖ = 1 → ∃ x ∈ N₁, dist u x ≤ 1 / 4) (hN₂_net : ∀ (v : EuclideanSpace ℝ (Fin T)), ‖v‖ = 1 → ∃ y ∈ N₂, dist v y ≤ 1 / 4) (hN₁_sub : ∀ x ∈ N₁, ‖x‖ ≤ 1) (hN₂_sub : ∀ y ∈ N₂, ‖y‖ ≤ 1) (hN₁_ne : N₁.Nonempty) (hN₂_ne : N₂.Nonempty) (M : ℝ) (hM : ∀ x ∈ N₁, ∀ y ∈ N₂, (EuclideanSpace.equiv (Fin d) ℝ) x ⬝ᵥ A.mulVec ((EuclideanSpace.equiv (Fin T) ℝ) y) ≤ M) :

matrixOpNorm A ≤ 2 * M

Lemma 4.2 (ε-net reduction for operator norm). For any d×T matrix A, the operator norm can be bounded via 1/4-nets:

‖A‖_op ≤ 2 · max_{(x,y) ∈ N₁×N₂} xᵀAy

where N₁ is a 1/4-net for the unit sphere in ℝ^d and N₂ is a 1/4-net for the unit sphere in ℝ^T.

This extends the single ε-net argument of Theorem 1.19 to two ε-nets (one for each dimension of the matrix). The proof applies the ε-net approximation twice:

First in the left dimension: max_{‖u‖=1} f(u) ≤ 2·max_{x∈N₁} f(x) for Lipschitz f, using ε = 1/4.
Then in the right dimension for each fixed x. Combining the two 1/(1-2·(1/4)) = 2 factors gives the factor of 2.

Lemma 4.2: Operator norm tail bound for sub-Gaussian matrices #

source

theorem lemma_4_2_operator_norm_tail {d T : ℕ} (hd : 0 < d) (hT : 0 < T) {Ω : Type u_1} {x✝ : MeasurableSpace Ω} {μ : MeasureTheory.Measure Ω} (hμ : MeasureTheory.IsProbabilityMeasure μ) {A : Ω → Matrix (Fin d) (Fin T) ℝ} {σsq : ℝ} (hσ : 0 < σsq) (hsg : IsSubGaussianMatrix A σsq μ) (t : ℝ) (ht : 0 < t) :

μ {ω : Ω | matrixOpNorm (A ω) > t} ≤ ENNReal.ofReal (12 ^ (d + T) * Real.exp (-(t ^ 2 / (8 * σsq))))

Lemma 4.2 (tail bound form). For a sub-Gaussian d×T random matrix A ~ subG_{d×T}(σ²), the operator norm satisfies the tail bound:

P(‖A‖_op > t) ≤ 12^{d+T} · exp(-t²/(8σ²))

for all t > 0.

Proof: Uses 1/4-nets N₁ for S^{d-1} and N₂ for S^{T-1} with |N₁| ≤ 12^d and |N₂| ≤ 12^T. The ε-net argument gives ‖A‖op ≤ 2 max{x∈N₁,y∈N₂} xᵀAy, then a union bound over the product N₁ × N₂ combined with the sub-Gaussian tail for each xᵀAy yields the result.

source

theorem lemma_4_2_operator_norm_high_prob {d T : ℕ} (hd : 0 < d) (hT : 0 < T) {Ω : Type u_1} {x✝ : MeasurableSpace Ω} {μ : MeasureTheory.Measure Ω} (hμ : MeasureTheory.IsProbabilityMeasure μ) {A : Ω → Matrix (Fin d) (Fin T) ℝ} {σsq : ℝ} (hσ : 0 < σsq) (hsg : IsSubGaussianMatrix A σsq μ) {δ : ℝ} (hδ_pos : 0 < δ) (hδ_lt : δ < 1) :

μ {ω : Ω | matrixOpNorm (A ω) > 4 * √σsq * √(Real.log 12 * ↑(max d T)) + 2 * √σsq * √(2 * Real.log (1 / δ))} ≤ ENNReal.ofReal δ

Lemma 4.2 (high-probability form). For a sub-Gaussian d×T random matrix A ~ subG_{d×T}(σ²), with probability at least 1 - δ:

‖A‖_op ≤ 4σ√(log(12)·(d ⊔ T)) + 2σ√(2·log(1/δ))

where d ⊔ T = max(d, T).

This follows from the tail bound by setting 12^{d+T} · exp(-t²/(8σ²)) ≤ δ and solving for t. Since d + T ≤ 2·(d ⊔ T), we have (d+T)·log(12) ≤ 2·log(12)·(d ⊔ T), and the stated bound uses 4σ (since √(2·2·log(12)) ≤ 4) to absorb the constants.

Documentation

Atlas.HighDimensionalStatistics.code.Chapter4.Lemma_4_2

Auxiliary: Sub-Gaussian Chernoff bound #

Lemma 4.2: Operator norm tail bound for sub-Gaussian matrices #