Theorem 1.13: Bernstein's inequality #
Let X₁, ..., Xₙ be independent with E(Xᵢ) = 0 and Xᵢ ~ subE(λ). Define X̄ = (1/n)Σᵢ Xᵢ. Then for any t > 0:
P(X̄ > t) ∨ P(X̄ < -t) ≤ exp[-n/2 · (t²/λ² ∧ t/λ)]
The proof uses the Chernoff bound with an optimized choice of parameter s₀ = min(1/λ, t/λ²), combined with independence to factor the MGF of the sum.
theorem
bernstein_right_tail
{Ω : Type u_1}
[MeasurableSpace Ω]
{μ : MeasureTheory.Measure Ω}
[MeasureTheory.IsProbabilityMeasure μ]
{n : ℕ}
{X : Fin n → Ω → ℝ}
{t lambda : ℝ}
(hn : 0 < n)
(ht : 0 < t)
(hlam : 0 < lambda)
(h_indep : ProbabilityTheory.iIndepFun X μ)
(h_meas : ∀ (i : Fin n), Measurable (X i))
(h_mgf : ∀ (i : Fin n) (s : ℝ), |s| ≤ 1 / lambda → ProbabilityTheory.mgf (X i) μ s ≤ Real.exp (s ^ 2 * lambda ^ 2 / 2))
(h_int : ∀ (i : Fin n) (s : ℝ), |s| ≤ 1 / lambda → MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (s * X i ω)) μ)
:
Right tail of Bernstein's inequality: P(X̄ ≥ t) ≤ exp(-n/2 · min(t²/λ², t/λ)).
theorem
bernstein_left_tail
{Ω : Type u_1}
[MeasurableSpace Ω]
{μ : MeasureTheory.Measure Ω}
[MeasureTheory.IsProbabilityMeasure μ]
{n : ℕ}
{X : Fin n → Ω → ℝ}
{t lambda : ℝ}
(hn : 0 < n)
(ht : 0 < t)
(hlam : 0 < lambda)
(h_indep : ProbabilityTheory.iIndepFun X μ)
(h_meas : ∀ (i : Fin n), Measurable (X i))
(h_mgf : ∀ (i : Fin n) (s : ℝ), |s| ≤ 1 / lambda → ProbabilityTheory.mgf (X i) μ s ≤ Real.exp (s ^ 2 * lambda ^ 2 / 2))
(h_int : ∀ (i : Fin n) (s : ℝ), |s| ≤ 1 / lambda → MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (s * X i ω)) μ)
:
Left tail of Bernstein's inequality: P(X̄ ≤ -t) ≤ exp(-n/2 · min(t²/λ², t/λ)).
theorem
theorem_1_13_bernstein_inequality
{Ω : Type u_1}
[MeasurableSpace Ω]
{μ : MeasureTheory.Measure Ω}
[MeasureTheory.IsProbabilityMeasure μ]
{n : ℕ}
{X : Fin n → Ω → ℝ}
{t lambda : ℝ}
(hn : 0 < n)
(ht : 0 < t)
(hlam : 0 < lambda)
(h_indep : ProbabilityTheory.iIndepFun X μ)
(h_meas : ∀ (i : Fin n), Measurable (X i))
(h_sub_exp : ∀ (i : Fin n), IsSubExponential (X i) lambda)
(h_int : ∀ (i : Fin n) (s : ℝ), |s| ≤ 1 / lambda → MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (s * X i ω)) μ)
:
Theorem 1.13 (Bernstein's inequality). Let X₁, ..., Xₙ be independent sub-exponential random variables with parameter λ. Then for any t > 0,
max(P(X̄ ≥ t), P(X̄ ≤ -t)) ≤ exp(-n/2 · min(t²/λ², t/λ))
where X̄ = (1/n)Σᵢ Xᵢ is the sample mean.