Atlas.HighDimensionalStatistics.code.Chapter5.Problem_5

theorem problem_5_1a (d : ℕ) (σ : ℝ) (hσ : 0 < σ) (θ θ' : EuclideanSpace ℝ (Fin d)) :

‖θ - θ'‖ ^ 2 / (2 * σ ^ 2) ≥ 0

Problem 5.1(a): KL divergence for Gaussians. For N_d(θ, σ²I) and N_d(θ', σ²I), Example 5.7 gives KL(P_θ, P_{θ'}) = ‖θ - θ'‖² / (2σ²). This expression is always nonnegative. The full closed-form derivation is in Rigollet.Chapter5.Ex_5_7.

source

noncomputable def klBernoulli (p q : ℝ) :

ℝ

KL divergence between Bernoulli distributions Ber(p) and Ber(q): KL(Ber(p) ‖ Ber(q)) = p · log(p/q) + (1-p) · log((1-p)/(1-q)). This is the standard closed-form formula from discrete KL divergence.

Instances For

source

theorem klBernoulli_pos (θ θ' : ℝ) (hθ : 0 < θ) (hθ1 : θ < 1) (hθ' : 0 < θ') (hθ'1 : θ' < 1) (hne : θ ≠ θ') :

0 < klBernoulli θ θ'

KL divergence between distinct Bernoulli distributions is strictly positive. Uses the strict inequality log x < x - 1 for x > 0, x ≠ 1.

source

theorem problem_5_1b (θ θ' : ℝ) (hθ : 0 < θ) (hθ1 : θ < 1) (hθ' : 0 < θ') (hθ'1 : θ' < 1) :

∃ (C : ℝ), 0 < C ∧ C * (θ - θ') ^ 2 ≤ klBernoulli θ θ'

Problem 5.1(b): KL for Bernoulli lower bound. For θ, θ' ∈ (0,1), there exists a positive constant C such that KL(Ber(θ), Ber(θ')) ≥ C · (θ - θ')². The proof uses the strict convexity of KL divergence: the second derivative of KL(Ber(·) ‖ Ber(θ')) at θ' = θ is the Fisher information 1/(θ(1-θ)), which is bounded below on any compact subset of (0,1).

Documentation

Atlas.HighDimensionalStatistics.code.Chapter5.Problem_5_1