Problem 1.2: Chi-squared concentration #
A random variable X has χ²_n (chi-squared with n degrees of freedom) if it has the same distribution as Z₁² + ... + Zₙ², where Z₁, ..., Zₙ are iid N(0,1).
Parts #
(a) Show MGF of Y = Z² - 1 is φ(s) = e^{-s}/√(1-2s) for s < 1/2 (b) Show φ(s) ≤ exp(s²/(1-2s)) for 0 < s < 1/2 (c) Conclude P(Y > 2t + 2√t) ≤ e^{-t} (d) Show if X ~ χ²_n, then with prob ≥ 1-δ: X ≤ n + 2√(n log(1/δ)) + 2 log(1/δ)
The standard Gaussian measure on ℝ, i.e., N(0,1).
Instances For
Problem 1.2 (a): The moment generating function of Y = Z² - 1, where Z ~ N(0,1), satisfies φ(s) = E[e^{sY}] = e^{-s}/√(1-2s) for s < 1/2.
Here stdGaussianMeasure is the standard Gaussian measure on ℝ, and
mgf (fun x => x ^ 2 - 1) stdGaussianMeasure s computes E[e^{s(Z²-1)}]
where Z is the identity random variable on (ℝ, stdGaussianMeasure).
Helpers for Problem 1.2 (b) #
Problem 1.2 (b) (MGF upper bound): For 0 < s < 1/2, the MGF of Y = Z² - 1 satisfies φ(s) = e^{-s}/√(1-2s) ≤ exp(s²/(1-2s)).
Proof: Define g(s) = s²/(1-2s) + s + (1/2)log(1-2s). Then g(0) = 0 and g'(s) = 2s²/(1-2s)² ≥ 0, so g is increasing on [0, 1/2), giving g(s) ≥ 0. This means s²/(1-2s) ≥ -s - (1/2)log(1-2s), and exponentiating yields the result.
Problem 1.2 (c) (Chi-squared tail bound via Chernoff): For Y = Z² - 1 where Z ~ N(0,1), and any t > 0, P(Y > 2t + 2√t) ≤ e^{-t}.
Key idea (from the book): Apply the Chernoff bound P(Y > u) ≤ e^{-su} φ(s) with the MGF bound from part (b). Optimize over s ∈ (0, 1/2), choosing s = (√(1+t) − 1)/2, then use the convexity inequality √(1+u) ≤ 1 + u/2 (hint given in the book) to simplify the exponent to −t when u = 2t + 2√t.
We express P(Z² − 1 > ε) as the measure of {x : ℝ | x² − 1 > ε}
under the standard Gaussian stdGaussianMeasure.
Problem 1.2 (d): If X ~ χ²_n (i.e., X = Z₁² + ... + Zₙ² for independent standard Gaussians), then with probability at least 1 - δ, X ≤ n + 2√(n log(1/δ)) + 2 log(1/δ).
We formalize this as: given n independent standard Gaussian random variables Z_i on a probability space (Ω, μ), the probability that their sum of squares exceeds n + 2√(n log(1/δ)) + 2 log(1/δ) is at most δ.