Standard Gaussian N(0,1) helpers #
The standard Gaussian N(0,1) is symmetric: P(Z < -t) = P(Z > t).
Standard N(0,1) probability bounds #
These are the core results, stated for the standard Gaussian measure gaussianReal 0 1.
Proposition 1.1, lower tail (standard case). P(Z < -t) ≤ (1/(√(2π) · t)) · exp(-t²/2) Follows from symmetry: P(Z < -t) = P(Z > t).
Proposition 1.1, absolute value form (standard case). P(|Z| > t) ≤ (2/(√(2π) · t)) · exp(-t²/2) = √(2/π) · exp(-t²/2) / t Follows from the union bound: P(|Z| > t) = P(Z > t) + P(Z < -t) ≤ 2 · P(Z > t).
General Gaussian N(μ, σ²) bounds #
The general case follows from the standard N(0,1) case by translation and rescaling.
For X N(μ, σ²), we have P(X - μ > t) = P(Z > t/σ) where Z N(0,1).
Applying the standard Mills ratio:
P(Z > t/σ) ≤ (1/√(2π)) · (σ/t) · exp(-t²/(2σ²))
Translation invariance: P_{N(m,v)}(X > m + s) = P_{N(0,v)}(X > s).
Translation invariance for lower tail: P_{N(m,v)}(X < m + s) = P_{N(0,v)}(X < s).
Rescaling (direct form): P_{N(0,σ²)}(X > t) = P_{N(0,1)}(Z > t/σ) for σ > 0.
Symmetry of N(0, σ²): P(X < -s) = P(X > s).
Proposition 1.1, upper tail for general N(μ, σ²).
The rescaling proof yields: P(X − m > t) = P(Z > t/σ) ≤ (σ/(t√(2π))) · exp(−t²/(2σ²))
For σ = 1 this matches the book's formula exactly.
Proposition 1.1, lower tail for general N(μ, σ²).
By symmetry: P(X − m < −t) ≤ (σ/(t√(2π))) · exp(−t²/(2σ²)).
Proposition 1.1, absolute value form for general N(μ, σ²).
Union bound: P(|X − m| > t) ≤ 2σ/(t√(2π)) · exp(−t²/(2σ²)).
Bundled Gaussian tail bound (Proposition 1.1) #
The main declaration combining all three parts of the tail bound into a single theorem.
The book (Chapter 1, Section 1.1, Proposition 1.1) states:
P(X − μ > t) ≤ (1/√(2π)) · exp(−t²/(2σ²)) / t P(X − μ < −t) ≤ (1/√(2π)) · exp(−t²/(2σ²)) / t P(|X − μ| > t) ≤ √(2/π) · exp(−t²/(2σ²)) / t
The standard case σ = 1 (which the book primarily proves) matches the formula exactly. For general σ, the book's rescaling argument P(X − μ > t) = P(Z > t/σ) yields the bound with an additional σ factor in the numerator: σ/(t√(2π)) · exp(−t²/(2σ²)). This is an erratum in the book: the stated formula without σ is incorrect for σ > 1 (counterexample: X ~ N(0,4), t = 3 gives P(X > 3) = P(Z > 1.5) ≈ 0.067 but the book's formula gives exp(−9/8)/(3√(2π)) ≈ 0.043).
Proposition 1.1 (Rigollet, "High Dimensional Statistics"), standard case (σ² = 1).
The book states: Let X be a Gaussian random variable with mean μ and variance σ². Then for any t > 0: P(X − μ > t) ≤ (1/√(2π)) · exp(−t²/(2σ²)) / t P(X − μ < −t) ≤ (1/√(2π)) · exp(−t²/(2σ²)) / t P(|X − μ| > t) ≤ √(2/π) · exp(−t²/(2σ²)) / t
For the standard N(0,1) case (σ = 1), the bound simplifies to: (i) P(Z > t) ≤ 1/(t√(2π)) · exp(−t²/2) (ii) P(Z < −t) ≤ 1/(t√(2π)) · exp(−t²/2) (iii) P(|Z| > t) ≤ 2/(t√(2π)) · exp(−t²/2)
This is the case the book proves directly via Mills' inequality, and the book's stated formula matches exactly when σ = 1.
Corollary of Proposition 1.1 for general N(μ, σ²).
For X N(μ, σ²) with σ > 0, the rescaling P(X − μ > t) = P(Z > t/σ) where Z N(0,1)
yields the bound with σ in the numerator:
(i) P(X − μ > t) ≤ σ/(t√(2π)) · exp(−t²/(2σ²))
(ii) P(X − μ < −t) ≤ σ/(t√(2π)) · exp(−t²/(2σ²))
(iii) P(|X − μ| > t) ≤ 2σ/(t√(2π)) · exp(−t²/(2σ²))
Note: The book's stated formula omits the σ factor. The correct bound from the
rescaling argument includes σ (see gaussian_tail_bound_standard for the σ = 1
case that matches the book's formula exactly).