Documentation

Atlas.HighDimensionalStatistics.code.Chapter2.Prop_2_16

The Rademacher distribution on ℝ: uniform on {-1, +1}. This is the probability measure assigning mass 1/2 to each of +1 and -1.

Instances For
    def IsIIDRademacherMatrix {Ω : Type u_1} [MeasurableSpace Ω] {n d : } (X : ΩMatrix (Fin n) (Fin d) ) (μ : MeasureTheory.Measure Ω) :

    Predicate: the entries of a random matrix X are i.i.d. Rademacher random variables. This means the n×d entries, viewed as random variables indexed by (Fin n × Fin d), are mutually independent and each has the Rademacher distribution (uniform on {-1, +1}).

    Instances For
      def AssumptionINC {n d : } (X : Matrix (Fin n) (Fin d) ) (k : ) :

      Assumption INC(k): The design matrix X has incoherence k if |XᵀX/n - I_d|∞ ≤ 1/(14k). We formalize this entry-wise: for all i j, |(XᵀX/n){ij} - δ_{ij}| ≤ 1/(14k).

      Instances For
        theorem AssumptionINC.diag_bound {n d : } {X : Matrix (Fin n) (Fin d) } {k : } (h : AssumptionINC X k) (j : Fin d) :
        |(X.transpose * X) j j / n - 1| 1 / (14 * k)

        The diagonal entries of XᵀX/n under INC(k) are close to 1.

        theorem AssumptionINC.offdiag_bound {n d : } {X : Matrix (Fin n) (Fin d) } {k : } (h : AssumptionINC X k) {i j : Fin d} (hij : i j) :
        |(X.transpose * X) i j / n| 1 / (14 * k)

        The off-diagonal entries of XᵀX/n under INC(k) are small.

        Probabilistic Statement of Proposition 2.16 #

        The proof of Proposition 2.16 combines three steps:

        1. Per-pair Hoeffding bound (Theorem 1.9): for each off-diagonal pair (j,l), P(|(XᵀX/n)_{jl}| > t) ≤ 2exp(-nt²/2), since the products εᵢⱼεᵢₗ are iid Rademacher random variables.
        2. Union bound: summing over at most (d choose 2) unordered pairs, P(|XᵀX/n - I|∞ > t) ≤ d²exp(-nt²/2) (the factor 2 from Hoeffding is absorbed: (d choose 2)·2 = d(d-1) ≤ d²).
        3. Analytical bound: setting t = 1/(14k) and using the sample size condition n ≥ 392k²(log(1/δ) + 2log(d)) yields d²exp(-n/(392k²)) ≤ δ.

        Step 1 requires connecting the Rademacher iid structure to Hoeffding's inequality through product measure and coordinate independence — probability infrastructure that the book invokes by reference ("Hoeffding: Theorem 1.9"). Steps 1-2 are therefore axiomatized as a single result. Step 3 is proved below.

        Helper lemmas for the Hoeffding + union bound proof #

        theorem rademacher_diag_eq_n {n d : } (X : Matrix (Fin n) (Fin d) ) (hRad : ∀ (i : Fin n) (j : Fin d), X i j = 1 X i j = -1) (j : Fin d) :
        (X.transpose * X) j j = n

        For a Rademacher matrix, the diagonal entries of XᵀX equal n. This is because ε² = 1 for Rademacher entries.

        theorem inc_compl_subset_offDiag_union {Ω : Type u_1} [MeasurableSpace Ω] {n d k : } (hn : 0 < n) (hk : 0 < k) (X : ΩMatrix (Fin n) (Fin d) ) (hRad : ∀ (ω : Ω) (i : Fin n) (j : Fin d), X ω i j = 1 X ω i j = -1) :
        {ω : Ω | AssumptionINC (X ω) k} p{p : Fin d × Fin d | p.1 p.2}, {ω : Ω | 1 / (14 * k) < |((X ω).transpose * X ω) p.1 p.2 / n|}

        The complement of AssumptionINC for Rademacher entries is contained in the union of off-diagonal bad events, since diagonal entries are exactly 1.

        theorem rademacher_product_iIndepFun {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n d : } (X : ΩMatrix (Fin n) (Fin d) ) (hIID : IsIIDRademacherMatrix X μ) (j₁ j₂ : Fin d) (hne : j₁ j₂) :
        ProbabilityTheory.iIndepFun (fun (i : Fin n) (ω : Ω) => X ω i j₁ * X ω i j₂) μ

        Independence of row-wise product RVs (infrastructure lemma).

        For an iid Rademacher matrix X, the products ξᵢ = X_{i,j₁}·X_{i,j₂} across different rows i are independent. This follows from the mutual independence of the matrix entries: ξᵢ depends only on entries (i,j₁) and (i,j₂), and different rows involve disjoint sets of entries.

        The book invokes this implicitly when applying Hoeffding to the products. This is standard measure theory (products of independent RVs with disjoint coordinate dependence are independent) but requires substantial infrastructure to formalize from first principles.

        theorem rademacher_entry_measurable {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n d : } (X : ΩMatrix (Fin n) (Fin d) ) (hIID : IsIIDRademacherMatrix X μ) (i : Fin n) (j : Fin d) :
        Measurable fun (ω : Ω) => X ω i j

        Measurability of entry functions from an iid Rademacher matrix.

        Each entry X_{i,j} is a measurable function. This follows directly from the measurability condition in IsIIDRademacherMatrix.

        The integral of the identity function with respect to the Rademacher measure is zero. This is because E[Rademacher] = (1/2)·1 + (1/2)·(-1) = 0 by symmetry.

        theorem rademacher_entry_integral_zero {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} {f : Ω} (hf : Measurable f) (hmap : MeasureTheory.Measure.map f μ = rademacherMeasure) :
        (ω : Ω), f ω μ = 0

        For a measurable function with Rademacher distribution, its integral (mean) is zero.

        theorem rademacher_product_mean_zero {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n d : } (X : ΩMatrix (Fin n) (Fin d) ) (hIID : IsIIDRademacherMatrix X μ) (i : Fin n) (j₁ j₂ : Fin d) (hne : j₁ j₂) :
        (ω : Ω), X ω i j₁ * X ω i j₂ μ = 0

        Zero mean of Rademacher products (infrastructure lemma).

        For an iid Rademacher matrix X with j₁ ≠ j₂, the product X_{i,j₁}·X_{i,j₂} has mean zero: E[X_{i,j₁}·X_{i,j₂}] = E[X_{i,j₁}]·E[X_{i,j₂}] = 0·0 = 0. The first equality uses independence of entries in the same row at different columns; the second uses E[Rademacher] = 0 by symmetry.

        theorem hoeffding_per_pair_bound {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n d : } (hn : 0 < n) (X : ΩMatrix (Fin n) (Fin d) ) (hRad : ∀ (ω : Ω) (i : Fin n) (j : Fin d), X ω i j = 1 X ω i j = -1) (hIID : IsIIDRademacherMatrix X μ) (j₁ j₂ : Fin d) (hne : j₁ j₂) (t : ) (ht : 0 < t) :
        μ {ω : Ω | t < |((X ω).transpose * X ω) j₁ j₂ / n|} ENNReal.ofReal (2 * Real.exp (-(n * t ^ 2 / 2)))

        Per-pair Hoeffding bound (Theorem 1.9 applied to products of Rademacher entries).

        For fixed j₁ ≠ j₂ and a random Rademacher matrix X, the products ξᵢ = X_{i,j₁}·X_{i,j₂} are independent RVs bounded in [-1,1] with mean 0. By Hoeffding's inequality (Theorem 1.9 with aᵢ=-1, bᵢ=1), the sample mean satisfies: P(|(1/n)∑ξᵢ| > t) ≤ 2·exp(-nt²/2).

        The bound 2·exp(-nt²/2) comes from Hoeffding with ∑(bᵢ-aᵢ)² = 4n: exp(-2(nt)²/(4n)) = exp(-nt²/2), applied to both tails.

        theorem transpose_mul_self_symm {n d : } (X : Matrix (Fin n) (Fin d) ) (i j : Fin d) :
        (X.transpose * X) i j = (X.transpose * X) j i

        XᵀX is symmetric: (XᵀX){i,j} = (XᵀX){j,i}.

        theorem inc_compl_subset_upperTriangle_union {Ω : Type u_1} [MeasurableSpace Ω] {n d k : } (hn : 0 < n) (hk : 0 < k) (X : ΩMatrix (Fin n) (Fin d) ) (hRad : ∀ (ω : Ω) (i : Fin n) (j : Fin d), X ω i j = 1 X ω i j = -1) :
        {ω : Ω | AssumptionINC (X ω) k} p{p : Fin d × Fin d | p.1 < p.2}, {ω : Ω | 1 / (14 * k) < |((X ω).transpose * X ω) p.1 p.2 / n|}

        The complement of AssumptionINC is contained in the union over upper-triangular off-diagonal pairs, using the symmetry of XᵀX.

        theorem card_upperTriangle_le (d : ) :
        2 * {p : Fin d × Fin d | p.1 < p.2}.card d ^ 2

        Card of upper-triangular pairs times 2 ≤ d². The upper triangle has d(d-1)/2 elements, so 2·card = d(d-1) ≤ d².

        theorem hoeffding_union_bound_for_rademacher {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n d k : } (hn : 0 < n) (_hd : 2 d) (hk : 0 < k) (X : ΩMatrix (Fin n) (Fin d) ) (hRad : ∀ (ω : Ω) (i : Fin n) (j : Fin d), X ω i j = 1 X ω i j = -1) (hIID : IsIIDRademacherMatrix X μ) (_hMeas : MeasurableSet {ω : Ω | AssumptionINC (X ω) k}) :
        μ {ω : Ω | AssumptionINC (X ω) k} ENNReal.ofReal (d ^ 2 * Real.exp (-(n * (1 / (14 * k)) ^ 2 / 2)))

        Hoeffding + union bound for iid Rademacher matrices (Theorem 1.9 + union bound).

        For an n×d random matrix X with i.i.d. Rademacher (±1) entries, P(¬ INC(k)) ≤ d²·exp(-n·(1/(14k))²/2) = d²·exp(-n/(392k²)).

        This combines Hoeffding's inequality (Theorem 1.9) on each off-diagonal pair (j,l) — noting that the products εᵢⱼ·εᵢₗ are iid Rademacher, so the mean (1/n)∑εᵢⱼεᵢₗ has two-sided tail ≤ 2exp(-nt²/2) — with the union bound over (d choose 2) ≤ d²/2 unordered pairs, absorbing the factor of 2 to get d²·exp(-nt²/2).

        theorem rademacher_incoherence_tail_bound {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n d k : } (hn : 0 < n) (hd : 2 d) (hk : 0 < k) (X : ΩMatrix (Fin n) (Fin d) ) (hRad : ∀ (ω : Ω) (i : Fin n) (j : Fin d), X ω i j = 1 X ω i j = -1) (hIID : IsIIDRademacherMatrix X μ) (hMeas : MeasurableSet {ω : Ω | AssumptionINC (X ω) k}) :
        μ {ω : Ω | AssumptionINC (X ω) k} ENNReal.ofReal (d ^ 2 * Real.exp (-(n * (1 / (14 * k)) ^ 2 / 2)))

        Tail bound for the incoherence condition (book proof, lines 1715-1718).

        For an n×d random matrix X with i.i.d. Rademacher (±1) entries, P(¬ INC(k)) ≤ d²·exp(-n·(1/(14k))²/2) = d²·exp(-n/(392k²)).

        This combines Hoeffding's inequality (Theorem 1.9) on each off-diagonal pair with the union bound over (d choose 2) ≤ d² pairs, as stated in the book's proof. The diagonal entries equal 1 exactly since Rademacher squares are 1.

        Proved from the hoeffding_union_bound_for_rademacher axiom.

        theorem prob_ge_of_compl_le {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {S : Set Ω} (hS : MeasurableSet S) {δ : } (h : μ S ENNReal.ofReal δ) ( : 0 δ) :
        μ S ENNReal.ofReal (1 - δ)

        Helper: if the complement of a measurable set has small probability, then the set itself has large probability.

        theorem prop216_tail_bound_le_delta (n d k : ) (hk : 0 < k) (hd : 2 d) (δ : ) (hδ₀ : 0 < δ) (hδ₁ : δ < 1) (hn : n 392 * k ^ 2 * (Real.log (1 / δ) + 2 * Real.log d)) :
        d ^ 2 * Real.exp (-(n * (1 / (14 * k)) ^ 2 / 2)) δ

        Analytical bound (book proof, lines 1720-1726). Setting t = 1/(14k), the tail bound becomes d²·exp(-n/(392k²)). This is at most δ when n ≥ 392k²(log(1/δ) + 2log(d)), since exp(-n/(392k²)) ≤ exp(-(log(1/δ) + 2log(d))) = δ/d².

        theorem prop_2_16_rademacher_incoherence {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n d k : } (hk : 0 < k) (hd : 2 d) (hn : 0 < n) (X : ΩMatrix (Fin n) (Fin d) ) (hRad : ∀ (ω : Ω) (i : Fin n) (j : Fin d), X ω i j = 1 X ω i j = -1) (hIID : IsIIDRademacherMatrix X μ) (hMeasINC : MeasurableSet {ω : Ω | AssumptionINC (X ω) k}) (δ : ) (hδ₀ : 0 < δ) (hδ₁ : δ < 1) (hn_large : n 392 * k ^ 2 * (Real.log (1 / δ) + 2 * Real.log d)) :
        μ {ω : Ω | AssumptionINC (X ω) k} ENNReal.ofReal (1 - δ)

        Proposition 2.16

        Part 2: Existence of INC(k) matrices #

        The book states: "It implies that there exist matrices that satisfy Assumption INC(k) for n ≳ k²log(d)."

        This is a direct consequence of the probabilistic statement: since for any δ ∈ (0,1), the probability of INC(k) is positive when n ≥ 392k²(log(1/δ) + 2log(d)), such matrices must exist. In particular, taking δ = 1/2, we need n ≥ 392k²(log 2 + 2log(d)) ≲ k²log(d) (for d large).

        noncomputable def coinMeasure :

        The fair coin measure on Fin 2: assigns mass 1/2 to each of 0 and 1.

        Instances For
          def coinToSign :
          Fin 2

          Maps Fin 2 to {-1, +1}: 0 ↦ -1, 1 ↦ +1.

          Instances For
            theorem exists_iid_rademacher_space (n d : ) :
            ∃ (Ω : Type) (x : MeasurableSpace Ω) (μ : MeasureTheory.Measure Ω) (_ : MeasureTheory.IsProbabilityMeasure μ) (X : ΩMatrix (Fin n) (Fin d) ), (∀ (ω : Ω) (i : Fin n) (j : Fin d), X ω i j = 1 X ω i j = -1) IsIIDRademacherMatrix X μ ∀ (k : ), MeasurableSet {ω : Ω | AssumptionINC (X ω) k}

            Existence of an i.i.d. Rademacher probability space.

            For any dimensions n, d, there exists a probability space (Ω, μ) equipped with an n×d random matrix X whose entries are i.i.d. Rademacher (±1) random variables, and such that the INC(k) event is measurable for all k.

            This is the product measure construction ∏ᵢⱼ Rademacher on {-1,+1}^{n×d}, which the book assumes implicitly when discussing random Rademacher matrices. Constructed using Ω = (Fin n × Fin d) → Fin 2 with the product of fair coin measures, mapping each coordinate to ±1 via coinToSign.

            theorem prop_2_16_existence_INC {n d k : } (hk : 0 < k) (hd : 2 d) (hn : 0 < n) (hn_large : n 392 * k ^ 2 * (Real.log 2 + 2 * Real.log d)) :
            ∃ (X : Matrix (Fin n) (Fin d) ), (∀ (i : Fin n) (j : Fin d), X i j = 1 X i j = -1) AssumptionINC X k

            Proposition 2.16, Part 2 (Existence of INC(k) matrices).

            There exist n×d matrices satisfying Assumption INC(k) whenever n is sufficiently large relative to k²log(d). Specifically, for any n ≥ 392k²(log 2 + 2log(d)), there exists a matrix X ∈ {-1,+1}^{n×d} satisfying INC(k).

            This follows from Proposition 2.16 (Part 1): taking δ = 1/2, the probability that a random Rademacher matrix satisfies INC(k) is at least 1/2 > 0, so such a matrix must exist (probabilistic method).