Documentation

Atlas.HighDimensionalStatistics.code.Chapter1.Problem_1_2

Problem 1.2: Chi-squared concentration #

A random variable X has χ²_n (chi-squared with n degrees of freedom) if it has the same distribution as Z₁² + ... + Zₙ², where Z₁, ..., Zₙ are iid N(0,1).

Parts #

(a) Show MGF of Y = Z² - 1 is φ(s) = e^{-s}/√(1-2s) for s < 1/2 (b) Show φ(s) ≤ exp(s²/(1-2s)) for 0 < s < 1/2 (c) Conclude P(Y > 2t + 2√t) ≤ e^{-t} (d) Show if X ~ χ²_n, then with prob ≥ 1-δ: X ≤ n + 2√(n log(1/δ)) + 2 log(1/δ)

The standard Gaussian measure on ℝ, i.e., N(0,1).

Instances For
    theorem problem_1_2a_mgf_chi_squared (s : ) (hs : s < 1 / 2) :
    ProbabilityTheory.mgf (fun (x : ) => x ^ 2 - 1) stdGaussianMeasure s = Real.exp (-s) / (1 - 2 * s)

    Problem 1.2 (a): The moment generating function of Y = Z² - 1, where Z ~ N(0,1), satisfies φ(s) = E[e^{sY}] = e^{-s}/√(1-2s) for s < 1/2.

    Here stdGaussianMeasure is the standard Gaussian measure on ℝ, and mgf (fun x => x ^ 2 - 1) stdGaussianMeasure s computes E[e^{s(Z²-1)}] where Z is the identity random variable on (ℝ, stdGaussianMeasure).

    Helpers for Problem 1.2 (b) #

    theorem deriv_helper_p12b (x : ) (hx : x < 1 / 2) :
    HasDerivAt (fun (s : ) => s ^ 2 / (1 - 2 * s) + s + 1 / 2 * Real.log (1 - 2 * s)) (2 * x ^ 2 / (1 - 2 * x) ^ 2) x

    Derivative of g(s) = s²/(1-2s) + s + (1/2)log(1-2s).

    theorem problem_1_2b_mgf_bound (s : ) (hs_pos : 0 < s) (hs_lt : s < 1 / 2) :
    Real.exp (-s) / (1 - 2 * s) Real.exp (s ^ 2 / (1 - 2 * s))

    Problem 1.2 (b) (MGF upper bound): For 0 < s < 1/2, the MGF of Y = Z² - 1 satisfies φ(s) = e^{-s}/√(1-2s) ≤ exp(s²/(1-2s)).

    Proof: Define g(s) = s²/(1-2s) + s + (1/2)log(1-2s). Then g(0) = 0 and g'(s) = 2s²/(1-2s)² ≥ 0, so g is increasing on [0, 1/2), giving g(s) ≥ 0. This means s²/(1-2s) ≥ -s - (1/2)log(1-2s), and exponentiating yields the result.

    theorem problem_1_2c_tail_bound (t : ) (ht : 0 < t) :

    Problem 1.2 (c) (Chi-squared tail bound via Chernoff): For Y = Z² - 1 where Z ~ N(0,1), and any t > 0, P(Y > 2t + 2√t) ≤ e^{-t}.

    Key idea (from the book): Apply the Chernoff bound P(Y > u) ≤ e^{-su} φ(s) with the MGF bound from part (b). Optimize over s ∈ (0, 1/2), choosing s = (√(1+t) − 1)/2, then use the convexity inequality √(1+u) ≤ 1 + u/2 (hint given in the book) to simplify the exponent to −t when u = 2t + 2√t.

    We express P(Z² − 1 > ε) as the measure of {x : ℝ | x² − 1 > ε} under the standard Gaussian stdGaussianMeasure.

    theorem problem_1_2d_chi_squared_concentration {n : } {Ω : Type u_1} {x✝ : MeasurableSpace Ω} {μ : MeasureTheory.Measure Ω} (hP : MeasureTheory.IsProbabilityMeasure μ) {Z : Fin nΩ} (hZ_meas : ∀ (i : Fin n), Measurable (Z i)) (hZ_dist : ∀ (i : Fin n), MeasureTheory.Measure.map (Z i) μ = ProbabilityTheory.gaussianReal 0 1) (hZ_indep : ProbabilityTheory.iIndepFun Z μ) (δ : ) (hδ_pos : 0 < δ) (hδ_lt : δ < 1) :
    (μ {ω : Ω | i : Fin n, Z i ω ^ 2 > n + 2 * (n * Real.log (1 / δ)) + 2 * Real.log (1 / δ)}).toReal δ

    Problem 1.2 (d): If X ~ χ²_n (i.e., X = Z₁² + ... + Zₙ² for independent standard Gaussians), then with probability at least 1 - δ, X ≤ n + 2√(n log(1/δ)) + 2 log(1/δ).

    We formalize this as: given n independent standard Gaussian random variables Z_i on a probability space (Ω, μ), the probability that their sum of squares exceeds n + 2√(n log(1/δ)) + 2 log(1/δ) is at most δ.