Documentation

Atlas.HighDimensionalStatistics.code.Chapter1.Thm_1_14

theorem setOf_exists_eq_iUnion {α : Type u_1} {ι : Type u_2} {P : ιαProp} :
{ω : α | ∃ (i : ι), P i ω} = ⋃ (i : ι), {ω : α | P i ω}

Helper: {ω | ∃ i, P i ω} = ⋃ i, {ω | P i ω}

theorem integrable_sup' {Ω' : Type u_1} [MeasurableSpace Ω'] {μ' : MeasureTheory.Measure Ω'} {ι : Type u_2} {s : Finset ι} (hs : s.Nonempty) {f : ιΩ'} (hf : is, MeasureTheory.Integrable (f i) μ') :
MeasureTheory.Integrable (fun (ω : Ω') => s.sup' hs fun (i : ι) => f i ω) μ'

Helper: Integrable for sup' over a nonempty finset of integrable functions.

theorem IsSubGaussian.neg {Ω : Type u_1} [MeasurableSpace Ω] {X : Ω} {σsq : } {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (h : IsSubGaussian X σsq μ) :
IsSubGaussian (fun (ω : Ω) => -X ω) σsq μ

If X is sub-Gaussian, then -X is sub-Gaussian with the same variance proxy.

theorem exp_abs_le_sum (s a : ) :
Real.exp (s * |a|) Real.exp (s * a) + Real.exp (s * -a)

exp(s * |a|) ≤ exp(sa) + exp(s(-a)) for any s and a.

Part (2): Upper tail probability bound for the maximum #

theorem subGaussian_max_upper_tail_prob {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {N : } {X : Fin NΩ} {σsq : } (hX : ∀ (i : Fin N), IsSubGaussian (X i) σsq μ) (t : ) (ht : 0 < t) :
μ {ω : Ω | ∃ (i : Fin N), X i ω > t} ENNReal.ofReal (N * Real.exp (-(t ^ 2 / (2 * σsq))))

Theorem 1.14, Part (2): Upper tail probability bound for the maximum. For X₁,...,X_N sub-Gaussian with variance proxy σ² and t > 0: P(max_{1≤i≤N} Xᵢ > t) ≤ N · exp(-t²/(2σ²))

Proof: By union bound, P(∃ i, Xᵢ > t) ≤ Σᵢ P(Xᵢ > t) ≤ N · exp(-t²/(2σ²)) via Lemma 1.3.

theorem theorem_1_14_upper_tail_prob {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {N : } {X : Fin NΩ} {σsq : } (hX : ∀ (i : Fin N), IsSubGaussian (X i) σsq μ) (t : ) (ht : 0 < t) :
μ {ω : Ω | ∃ (i : Fin N), X i ω > t} ENNReal.ofReal (N * Real.exp (-(t ^ 2 / (2 * σsq))))

Alias of subGaussian_max_upper_tail_prob.


Theorem 1.14, Part (2): Upper tail probability bound for the maximum. For X₁,...,X_N sub-Gaussian with variance proxy σ² and t > 0: P(max_{1≤i≤N} Xᵢ > t) ≤ N · exp(-t²/(2σ²))

Proof: By union bound, P(∃ i, Xᵢ > t) ≤ Σᵢ P(Xᵢ > t) ≤ N · exp(-t²/(2σ²)) via Lemma 1.3.

theorem theorem_1_14_maximal_prob {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {N : } {X : Fin NΩ} {σsq : } (hX : ∀ (i : Fin N), IsSubGaussian (X i) σsq μ) (t : ) (ht : 0 < t) :
μ {ω : Ω | ∃ (i : Fin N), X i ω > t} ENNReal.ofReal (N * Real.exp (-(t ^ 2 / (2 * σsq))))

Alias of subGaussian_max_upper_tail_prob.


Theorem 1.14, Part (2): Upper tail probability bound for the maximum. For X₁,...,X_N sub-Gaussian with variance proxy σ² and t > 0: P(max_{1≤i≤N} Xᵢ > t) ≤ N · exp(-t²/(2σ²))

Proof: By union bound, P(∃ i, Xᵢ > t) ≤ Σᵢ P(Xᵢ > t) ≤ N · exp(-t²/(2σ²)) via Lemma 1.3.

Part (4): Absolute value tail probability bound for the maximum #

theorem subGaussian_max_abs_tail_prob {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {N : } {X : Fin NΩ} {σsq : } (hX : ∀ (i : Fin N), IsSubGaussian (X i) σsq μ) (t : ) (ht : 0 < t) :
μ {ω : Ω | ∃ (i : Fin N), |X i ω| > t} ENNReal.ofReal (2 * N * Real.exp (-(t ^ 2 / (2 * σsq))))

Theorem 1.14, Part (4): Absolute value tail probability bound for the maximum. For X₁,...,X_N sub-Gaussian with variance proxy σ² and t > 0: P(max_{1≤i≤N} |Xᵢ| > t) ≤ 2N · exp(-t²/(2σ²))

Proof: {∃ i, |Xᵢ| > t} ⊆ {∃ i, Xᵢ > t} ∪ {∃ i, Xᵢ < -t}. Each part is bounded by N · exp(-t²/(2σ²)) via Lemma 1.3 and the union bound.

theorem theorem_1_14_abs_tail_prob {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {N : } {X : Fin NΩ} {σsq : } (hX : ∀ (i : Fin N), IsSubGaussian (X i) σsq μ) (t : ) (ht : 0 < t) :
μ {ω : Ω | ∃ (i : Fin N), |X i ω| > t} ENNReal.ofReal (2 * N * Real.exp (-(t ^ 2 / (2 * σsq))))

Alias of subGaussian_max_abs_tail_prob.


Theorem 1.14, Part (4): Absolute value tail probability bound for the maximum. For X₁,...,X_N sub-Gaussian with variance proxy σ² and t > 0: P(max_{1≤i≤N} |Xᵢ| > t) ≤ 2N · exp(-t²/(2σ²))

Proof: {∃ i, |Xᵢ| > t} ⊆ {∃ i, Xᵢ > t} ∪ {∃ i, Xᵢ < -t}. Each part is bounded by N · exp(-t²/(2σ²)) via Lemma 1.3 and the union bound.

Part (1): Expectation bound for the maximum (parametric form) #

theorem subGaussian_max_expectation_parametric {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {N : } (hN : 0 < N) {X : Fin NΩ} {σsq : } ( : 0 σsq) (hX : ∀ (i : Fin N), IsSubGaussian (X i) σsq μ) (s : ) (hs : 0 < s) :
have this := ; (ω : Ω), Finset.univ.sup' fun (i : Fin N) => X i ω μ Real.log N / s + σsq * s / 2

Theorem 1.14, Part (1) – parametric form. For all s > 0: E[max Xᵢ] ≤ log(N)/s + σ²·s/2. The closed-form σ√(2 log N) is obtained by choosing s = √(2 log N)/σ.

theorem theorem_1_14_expectation_max_parametric {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {N : } (hN : 0 < N) {X : Fin NΩ} {σsq : } ( : 0 σsq) (hX : ∀ (i : Fin N), IsSubGaussian (X i) σsq μ) (s : ) (hs : 0 < s) :
have this := ; (ω : Ω), Finset.univ.sup' fun (i : Fin N) => X i ω μ Real.log N / s + σsq * s / 2

Alias of subGaussian_max_expectation_parametric.


Theorem 1.14, Part (1) – parametric form. For all s > 0: E[max Xᵢ] ≤ log(N)/s + σ²·s/2. The closed-form σ√(2 log N) is obtained by choosing s = √(2 log N)/σ.

theorem subGaussian_max_expectation {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {N : } (hN : 2 N) {X : Fin NΩ} {σ : } ( : 0 < σ) (hX : ∀ (i : Fin N), IsSubGaussian (X i) (σ ^ 2) μ) :
have this := ; (ω : Ω), Finset.univ.sup' fun (i : Fin N) => X i ω μ σ * (2 * Real.log N)

Theorem 1.14, Part (1) – closed form. E[max_{1≤i≤N} Xᵢ] ≤ σ√(2 log N), obtained from the parametric form by choosing s = √(2 log N) / σ. Requires σ > 0 and N ≥ 2.

Part (3): Expectation bound for the absolute maximum (parametric form) #

theorem subGaussian_max_abs_expectation_parametric {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {N : } (hN : 0 < N) {X : Fin NΩ} {σsq : } ( : 0 σsq) (hX : ∀ (i : Fin N), IsSubGaussian (X i) σsq μ) (s : ) (hs : 0 < s) :
have this := ; (ω : Ω), Finset.univ.sup' fun (i : Fin N) => |X i ω| μ Real.log (2 * N) / s + σsq * s / 2

Theorem 1.14, Part (3) – parametric form. For all s > 0: E[max |Xᵢ|] ≤ log(2N)/s + σ²·s/2.

theorem theorem_1_14_expectation_abs_max_parametric {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {N : } (hN : 0 < N) {X : Fin NΩ} {σsq : } ( : 0 σsq) (hX : ∀ (i : Fin N), IsSubGaussian (X i) σsq μ) (s : ) (hs : 0 < s) :
have this := ; (ω : Ω), Finset.univ.sup' fun (i : Fin N) => |X i ω| μ Real.log (2 * N) / s + σsq * s / 2

Alias of subGaussian_max_abs_expectation_parametric.


Theorem 1.14, Part (3) – parametric form. For all s > 0: E[max |Xᵢ|] ≤ log(2N)/s + σ²·s/2.

theorem subGaussian_max_abs_expectation {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {N : } (hN : 0 < N) {X : Fin NΩ} {σ : } ( : 0 < σ) (hX : ∀ (i : Fin N), IsSubGaussian (X i) (σ ^ 2) μ) :
have this := ; (ω : Ω), Finset.univ.sup' fun (i : Fin N) => |X i ω| μ σ * (2 * Real.log (2 * N))

Theorem 1.14, Part (3) – closed form. E[max_{1≤i≤N} |Xᵢ|] ≤ σ√(2 log(2N)), obtained from the parametric form by choosing s = √(2 log(2N)) / σ. Requires σ > 0 and N ≥ 1.

Combined Theorem 1.14 #

theorem subGaussian_maximal_inequality {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {N : } (hN : 2 N) {X : Fin NΩ} {σ : } ( : 0 < σ) (hX : ∀ (i : Fin N), IsSubGaussian (X i) (σ ^ 2) μ) :
have this := ; (ω : Ω), Finset.univ.sup' fun (i : Fin N) => X i ω μ σ * (2 * Real.log N) (∀ (t : ), 0 < tμ {ω : Ω | ∃ (i : Fin N), X i ω > t} ENNReal.ofReal (N * Real.exp (-(t ^ 2 / (2 * σ ^ 2))))) (ω : Ω), Finset.univ.sup' fun (i : Fin N) => |X i ω| μ σ * (2 * Real.log (2 * N)) ∀ (t : ), 0 < tμ {ω : Ω | ∃ (i : Fin N), |X i ω| > t} ENNReal.ofReal (2 * N * Real.exp (-(t ^ 2 / (2 * σ ^ 2))))

Theorem 1.14 (Sub-Gaussian Maximal Inequalities). Let X₁,...,X_N be random variables with Xᵢ ~ subG(σ²). Then all four bounds hold: (1) E[max Xᵢ] ≤ σ√(2 log N) (2) P(max Xᵢ > t) ≤ N · exp(-t²/(2σ²)) (3) E[max |Xᵢ|] ≤ σ√(2 log(2N)) (4) P(max |Xᵢ| > t) ≤ 2N · exp(-t²/(2σ²))