Documentation

Atlas.HighDimensionalStatistics.code.Chapter5.ChernoffAnalyticHelpers

Chernoff Analytic Helpers for Lemma 5.14 #

Key combinatorial and analytic inequalities used in the proof of the Sparse Varshamov-Gilbert bound (Lemma 5.14) from Chapter 5.4 of Rigollet's "High Dimensional Statistics."

Main results #

Combinatorial (binomial coefficient bounds) #

Analytic (power and logarithmic bounds) #

Overview of the proof strategy #

The textbook proof of Lemma 5.14 bounds the size of Hamming balls in the space of k-sparse binary vectors. The key inequality is:

|ball(x)| / |C₀(k)| = P(ρ(ω, x₀) < k/2) ≤ 2^k · (1+d/(2k))^{-k/2}

This file provides the combinatorial underpinning: the ratio C(n,k-t)/C(n,t) (where n = d-k) is bounded below by ((d-2k+1)/k)^{k-2t}. This ratio arises from the consecutive binomial coefficient ratio (n-j)/(j+1) ≥ (n-k+1)/k for j < k ≤ n. Combined with the power inequality (2(R-1))⁴ ≥ 1+R for R ≥ 4, we obtain the exponential bound needed for the Chernoff counting bound.

theorem ChernoffHelpers.binom_ratio_ineq (n j k : ) (hj : j + 1 k) (hk : k n) :
(n - k + 1) * (j + 1) (n - j) * k

Fundamental arithmetic inequality for binomial coefficient ratios. For j < k ≤ n: (n-k+1)·(j+1) ≤ (n-j)·k.

In ℤ, this is equivalent to (n+1)·(k-j-1) ≥ 0, which is always true. This formalizes the fact that each consecutive ratio C(n,j+1)/C(n,j) = (n-j)/(j+1) is at least (n-k+1)/k.

theorem ChernoffHelpers.choose_step_bound (n j k : ) (hj : j + 1 k) (hk : k n) :
(n - k + 1) * n.choose j k * n.choose (j + 1)

For j < k ≤ n: (n-k+1) · C(n, j) ≤ k · C(n, j+1). This formalizes the bound C(n,j+1)/C(n,j) = (n-j)/(j+1) ≥ (n-k+1)/k, using Nat.choose_succ_right_eq which gives C(n,j+1)·(j+1) = C(n,j)·(n-j).

theorem ChernoffHelpers.choose_iterated_bound (n k t m : ) (htm : t + m k) (hk : k n) :
(n - k + 1) ^ m * n.choose t k ^ m * n.choose (t + m)

Iterated binomial coefficient bound: (n-k+1)^m · C(n,t) ≤ k^m · C(n, t+m).

Equivalently: C(n, t+m) / C(n, t) ≥ ((n-k+1)/k)^m, using the product telescope of consecutive ratios. Each step uses choose_step_bound.

theorem ChernoffHelpers.choose_ratio_bound (n k t : ) (ht : 2 * t k) (hk : k n) :
(n - k + 1) ^ (k - 2 * t) * n.choose t k ^ (k - 2 * t) * n.choose (k - t)

Choose ratio bound: (n-k+1)^{k-2t} · C(n,t) ≤ k^{k-2t} · C(n, k-t) for 2t ≤ k ≤ n.

This is the key combinatorial inequality for the Chernoff counting bound. It bounds the ratio C(n,k-t)/C(n,t) ≥ ((n-k+1)/k)^{k-2t}.

In the Lemma 5.14 context with n = d-k and d ≥ 8k: C(d-k, k-t) / C(d-k, t) ≥ ((d-2k+1)/k)^{k-2t} ≥ 6^{k-2t} (since (d-2k+1)/k ≥ (6k+1)/k > 6).

theorem ChernoffHelpers.vandermonde_decomposition (d k : ) (hk : k d) :
d.choose k = ijFinset.antidiagonal k, k.choose ij.1 * (d - k).choose ij.2

Vandermonde convolution specialized to our setting: C(d, k) = Σ_{(i,j) ∈ antidiagonal k} C(k, i) · C(d-k, j).

This decomposition counts k-element subsets of {1,...,d} by partitioning them according to how many elements fall in the first k positions vs the remaining d-k positions. It is the key identity connecting the total count C(d,k) = |C₀(k)| to the Hamming ball size Σ_{t<k/2} C(k,t)·C(d-k,t).

theorem ChernoffHelpers.vandermonde_term_le (d k t : ) (hk : k d) (ht : t k) :
k.choose t * (d - k).choose (k - t) d.choose k

A single term C(k,t)·C(d-k, k-t) in the Vandermonde sum is a lower bound for C(d,k), since it's a non-negative summand.

theorem ChernoffHelpers.power_ineq_for_chernoff (R : ) (hR : R 4) :
(2 * (R - 1)) ^ 4 1 + R

For R ≥ 4: (2(R-1))⁴ ≥ 1 + R.

This is the key algebraic inequality connecting binomial coefficient ratios to the Chernoff bound exponent. When R = d/(2k) with d ≥ 8k, this gives the fourth-power bound needed to convert the log ratio bound (k/2)·log(base) into (k/8)·log(1+R).

theorem ChernoffHelpers.chernoff_power_bound (d k : ) (hk : k > 0) (hd : d 8 * k) :
((d - 2 * k) / k) ^ 4 1 + d / (2 * k)

For d ≥ 8k and k > 0 (in ℝ): ((d-2k)/k)⁴ ≥ 1 + d/(2k). This follows from R = d/(2k) ≥ 4 and (2(R-1))⁴ ≥ 1+R, noting that (d-2k)/k = 2(R-1).

theorem ChernoffHelpers.log_ratio_bound (d k : ) (hk : k 1) (hd : d 8 * k) :
k / 2 * Real.log ((d - 2 * k + 1) / k) k / 8 * Real.log (1 + d / (2 * k))

For d ≥ 8k, k ≥ 1: (k/2) · log((d-2k+1)/k) ≥ (k/8) · log(1 + d/(2k)).

This is the key inequality connecting binomial coefficient ratios to the exponential bound in the Chernoff method of Lemma 5.14.

Proof strategy: Setting R = d/(2k) ≥ 4:

  1. (d-2k+1)/k ≥ 2(R-1) > 0
  2. (2(R-1))⁴ ≥ 1+R (by power_ineq_for_chernoff)
  3. Taking logs: 4·log(2(R-1)) ≥ log(1+R)
  4. Multiplying by k/8: (k/2)·log(2(R-1)) ≥ (k/8)·log(1+R)
  5. Since log((d-2k+1)/k) ≥ log(2(R-1)), the result follows.
theorem ChernoffHelpers.exp_le_ratio_pow (d k : ) (hk : k 1) (hd : d 8 * k) :
Real.exp (k / 8 * Real.log (1 + d / (2 * k))) ((d - 2 * k + 1) / k) ^ (k / 2)

Main exponential bound for the Chernoff counting bound.

For d ≥ 8k, k ≥ 1: exp(k/8 · log(1 + d/(2k))) ≤ ((d-2k+1)/k)^{k/2}

This connects the Chernoff bound exponent to the binomial coefficient ratio. The left side is the packing number lower bound (1+d/(2k))^{k/8}, and the right side bounds the product of consecutive binomial coefficient ratios ∏_{j=0}^{k/2-1} (n-j)/(j+1) ≥ ((n-k+1)/k)^{k/2} where n = d-k.

Combined with choose_iterated_bound, this gives: C(d-k, k) ≥ ((d-2k+1)/k)^k · C(d-k, 0) ≥ (1+d/(2k))^{k/8}.

theorem ChernoffHelpers.choose_mono_left_half (n t : ) (ht : t < n / 2) :
n.choose t n.choose (t + 1)

For t < n/2: C(n, t) ≤ C(n, t+1). This is Nat.choose_le_succ_of_lt_half_left from Mathlib, stated here for reference.

theorem ChernoffHelpers.base_ratio_ge_six (d k : ) (_hk : 1 k) (hd : 8 * k d) :
6 * k d - 2 * k + 1

For d ≥ 8k and k ≥ 1, the base ratio (d-2k+1)/k ≥ 6. This quantifies how much each step in the binomial coefficient ratio telescope contributes: C(d-k, j+1)/C(d-k, j) ≥ 6 for j < k.

theorem ChernoffHelpers.choose_ratio_ge_six (d k t : ) (hk : 1 k) (hd : 8 * k d) (ht : 2 * t < k) :
6 * (d - k).choose t (d - k).choose (k - t)

For d ≥ 8k, k ≥ 1, t < k/2: C(d-k, k-t) ≥ 6 · C(d-k, t). The simplest instance of the ratio bound, using the weakest exponent k-2t ≥ 1.