Documentation

Atlas.ProbabilisticMethodsInCombinatorics.code.Chapter10.Entropy

noncomputable def ShannonEntropy.shannonEntropy {S : Type u_1} [Fintype S] (p : PMF S) :

Definition 10.1.1 (Shannon entropy). For a probability mass function $p$ on a finite type $S$, $H(p) = -\sum_s p_s \log p_s = \sum_s \text{negMulLog}(p_s)$.

Instances For
    noncomputable def ShannonEntropy.conditionalEntropy {S : Type u_1} {T : Type u_2} [Fintype S] [Fintype T] (p : PMF (S × T)) :

    Definition 10.1.6 (Conditional entropy). For a joint distribution $p$ on $S \times T$ with marginal $p_T$, $H(X \mid Y) = \sum_t p_T(t) \sum_s \text{negMulLog}(p(s, t) / p_T(t))$.

    Instances For
      theorem ShannonEntropy.pmf_sum_toReal_eq_one {T : Type u_1} [Fintype T] (q : PMF T) :
      t : T, (q t).toReal = 1

      The sum of (q t).toReal over all $t$ in a finite type equals $1$ for any PMF $q$.

      theorem ShannonEntropy.pmf_map_snd_apply {S : Type u_1} {T : Type u_2} [Fintype S] [Fintype T] [DecidableEq T] (p : PMF (S × T)) (t : T) :
      (PMF.map Prod.snd p) t = s : S, p (s, t)

      The second marginal of a joint PMF: $p_T(t) = \sum_s p(s, t)$.

      theorem ShannonEntropy.pmf_map_fst_apply {S : Type u_1} {T : Type u_2} [Fintype S] [Fintype T] [DecidableEq S] (p : PMF (S × T)) (s : S) :
      (PMF.map Prod.fst p) s = t : T, p (s, t)

      The first marginal of a joint PMF: $p_S(s) = \sum_t p(s, t)$.

      theorem ShannonEntropy.pmf_le_map_snd {S : Type u_1} {T : Type u_2} [Fintype S] [Fintype T] [DecidableEq T] (p : PMF (S × T)) (s : S) (t : T) :
      p (s, t) (PMF.map Prod.snd p) t

      Each joint probability is bounded by the corresponding $T$-marginal: $p(s, t) \leq p_T(t)$.

      theorem ShannonEntropy.total_prob_toReal {S : Type u_1} {T : Type u_2} [Fintype S] [Fintype T] [DecidableEq S] [DecidableEq T] (p : PMF (S × T)) (s : S) :
      t : T, ((PMF.map Prod.snd p) t).toReal * ((p (s, t)).toReal / ((PMF.map Prod.snd p) t).toReal) = ((PMF.map Prod.fst p) s).toReal

      Total-probability identity used in the conditional-entropy bounds: $\sum_t p_T(t) \cdot \frac{p(s, t)}{p_T(t)} = p_S(s)$.

      theorem ShannonEntropy.entropy_of_independent {S : Type u_1} {T : Type u_2} [Fintype S] [Fintype T] [DecidableEq S] [DecidableEq T] (p : PMF (S × T)) (hindep : ∀ (s : S) (t : T), p (s, t) = (PMF.map Prod.fst p) s * (PMF.map Prod.snd p) t) :

      Lemma 10.1.5 (entropy of independent random variables): If $X$ and $Y$ are independent, $H(X, Y) = H(X) + H(Y)$.

      theorem ShannonEntropy.sum_toReal_eq_margY_toReal {S : Type u_1} {T : Type u_2} [Fintype S] [Fintype T] [DecidableEq T] (p : PMF (S × T)) (t : T) :
      s : S, (p (s, t)).toReal = ((PMF.map Prod.snd p) t).toReal

      The fiber-sum identity: $\sum_s (p(s, t)).\text{toReal} = p_T(t).\text{toReal}$.

      theorem ShannonEntropy.joint_eq_zero_of_margY_zero {S : Type u_1} {T : Type u_2} [Fintype S] [Fintype T] [DecidableEq T] (p : PMF (S × T)) (t : T) (ht : (PMF.map Prod.snd p) t = 0) (s : S) :
      p (s, t) = 0

      If a marginal $p_T(t) = 0$ then every joint $p(s, t) = 0$.

      Lemma 10.1.7 (chain rule for entropy): $H(X, Y) = H(Y) + H(X \mid Y)$.

      Conditioning never increases entropy: $H(X \mid Y) \leq H(X)$, proved via Jensen's inequality applied to the concave function $-x \log x$.

      theorem ShannonEntropy.margYZ_sum_eq_margZ {S : Type u_1} {T : Type u_2} {U : Type u_3} [Fintype S] [Fintype T] [Fintype U] [DecidableEq S] [DecidableEq T] [DecidableEq U] (p : PMF (S × T × U)) (u : U) :
      t : T, (PMF.map Prod.snd p) (t, u) = (PMF.map Prod.snd (PMF.map (fun (x : S × T × U) => (x.1, x.2.2)) p)) u

      Summing the $(Y, Z)$-marginal over $Y$ recovers the $Z$-marginal of the projected distribution onto $S \times U$.

      theorem ShannonEntropy.pmf_proj_apply {S : Type u_1} {T : Type u_2} {U : Type u_3} [Fintype S] [Fintype T] [Fintype U] [DecidableEq S] [DecidableEq U] (p : PMF (S × T × U)) (s : S) (u : U) :
      (PMF.map (fun (x : S × T × U) => (x.1, x.2.2)) p) (s, u) = t : T, p (s, t, u)

      Pointwise formula for the projection of a joint PMF on $S \times (T \times U)$ to $S \times U$: $q(s, u) = \sum_t p(s, (t, u))$.

      theorem ShannonEntropy.conditionalEntropy_drop_conditioning {S : Type u_1} {T : Type u_2} {U : Type u_3} [Fintype S] [Fintype T] [Fintype U] [DecidableEq S] [DecidableEq T] [DecidableEq U] (p : PMF (S × T × U)) :
      conditionalEntropy p conditionalEntropy (PMF.map (fun (x : S × T × U) => (x.1, x.2.2)) p)

      Lemma 10.1.10 (dropping conditioning): $H(X \mid Y, Z) \leq H(X \mid Z)$, i.e., dropping a conditioning variable cannot decrease conditional entropy.

      Lemma 10.1.8 (subadditivity of entropy): $H(X, Y) \leq H(X) + H(Y)$.

      noncomputable def ShannonEntropy.marginal {n : } {α : Type u_1} [Fintype α] (p : PMF (Fin nα)) (i : Fin n) :
      PMF α

      The $i$-th coordinate marginal of a PMF on $\alpha^n$.

      Instances For
        theorem ShannonEntropy.shannonEntropy_map_equiv {S : Type u_1} {T : Type u_2} [Fintype S] [Fintype T] [DecidableEq S] [DecidableEq T] (p : PMF S) (e : S T) :

        Shannon entropy is invariant under bijective relabeling of the sample space.

        theorem ShannonEntropy.marginal_succFunEquiv_snd {n : } {α : Type u_1} [Fintype α] [DecidableEq α] (p : PMF (Fin (n + 1)α)) :

        Under the equivalence $\alpha^{n+1} \simeq \alpha^n \times \alpha$, the second coordinate marginal corresponds to the marginal at the last index.

        theorem ShannonEntropy.marginal_succFunEquiv_fst {n : } {α : Type u_1} [Fintype α] [DecidableEq α] (p : PMF (Fin (n + 1)α)) (i : Fin n) :

        Under the equivalence $\alpha^{n+1} \simeq \alpha^n \times \alpha$, the $i$-th marginal of the first factor corresponds to the marginal at $\text{castSucc } i$.

        theorem ShannonEntropy.entropy_subadditive_general {n : } {α : Type u_1} [Fintype α] [DecidableEq α] (p : PMF (Fin nα)) :

        General subadditivity of entropy (Lemma 10.1.8 extended): for any joint PMF on $\alpha^n$, $H(X_1, \dots, X_n) \leq \sum_i H(X_i)$.