Documentation

Atlas.HighDimensionalStatistics.code.Chapter5.Problem_5_1

theorem problem_5_1a (d : ) (σ : ) ( : 0 < σ) (θ θ' : EuclideanSpace (Fin d)) :
θ - θ' ^ 2 / (2 * σ ^ 2) 0

Problem 5.1(a): KL divergence for Gaussians. For N_d(θ, σ²I) and N_d(θ', σ²I), Example 5.7 gives KL(P_θ, P_{θ'}) = ‖θ - θ'‖² / (2σ²). This expression is always nonnegative. The full closed-form derivation is in Rigollet.Chapter5.Ex_5_7.

noncomputable def klBernoulli (p q : ) :

KL divergence between Bernoulli distributions Ber(p) and Ber(q): KL(Ber(p) ‖ Ber(q)) = p · log(p/q) + (1-p) · log((1-p)/(1-q)). This is the standard closed-form formula from discrete KL divergence.

Instances For
    theorem klBernoulli_pos (θ θ' : ) ( : 0 < θ) (hθ1 : θ < 1) (hθ' : 0 < θ') (hθ'1 : θ' < 1) (hne : θ θ') :
    0 < klBernoulli θ θ'

    KL divergence between distinct Bernoulli distributions is strictly positive. Uses the strict inequality log x < x - 1 for x > 0, x ≠ 1.

    theorem problem_5_1b (θ θ' : ) ( : 0 < θ) (hθ1 : θ < 1) (hθ' : 0 < θ') (hθ'1 : θ' < 1) :
    ∃ (C : ), 0 < C C * (θ - θ') ^ 2 klBernoulli θ θ'

    Problem 5.1(b): KL for Bernoulli lower bound. For θ, θ' ∈ (0,1), there exists a positive constant C such that KL(Ber(θ), Ber(θ')) ≥ C · (θ - θ')². The proof uses the strict convexity of KL divergence: the second derivative of KL(Ber(·) ‖ Ber(θ')) at θ' = θ is the Fisher information 1/(θ(1-θ)), which is bounded below on any compact subset of (0,1).