Documentation

Atlas.HighDimensionalStatistics.code.Chapter3.Thm_3_6

Helper lemmas for product distribution calculations #

theorem Chapter3.prod_dist_factorize {S : Type} [Fintype S] [DecidableEq S] {k : } (h : Fin kS) :
sel : Fin kS, l : Fin k, h l (sel l) = l : Fin k, j : S, h l j

Product distribution factorization: ∑_sel ∏_l h(l, sel l) = ∏_l ∑_j h(l, j).

theorem Chapter3.product_weight_sum {S : Type} [Fintype S] [DecidableEq S] {k : } (w : S) (hw_sum : j : S, w j = 1) :
sel : Fin kS, l : Fin k, w (sel l) = 1

Product weights sum to 1.

theorem Chapter3.product_marginal {S : Type} [Fintype S] [DecidableEq S] {k : } (w : S) (hw_sum : j : S, w j = 1) (f : S) (l₀ : Fin k) :
sel : Fin kS, (∏ l : Fin k, w (sel l)) * f (sel l₀) = j : S, w j * f j

Marginal of the product distribution: E[f(sel l₀)] = ∑ w j * f j.

theorem Chapter3.product_cross_marginal {S : Type} [Fintype S] [DecidableEq S] {k : } (w : S) (hw_sum : j : S, w j = 1) (f g : S) (l₁ l₂ : Fin k) (hne : l₁ l₂) :
sel : Fin kS, (∏ l : Fin k, w (sel l)) * (f (sel l₁) * g (sel l₂)) = (∑ j : S, w j * f j) * j : S, w j * g j

Cross-marginal: E[f(sel l₁) * g(sel l₂)] = E[f] * E[g] for l₁ ≠ l₂.

theorem Chapter3.expected_sum_sq {S : Type} [Fintype S] [DecidableEq S] {k : } (hk : 1 k) (w : S) (hw_sum : j : S, w j = 1) (a : S) :
sel : Fin kS, (∏ l : Fin k, w (sel l)) * (∑ l : Fin k, a (sel l)) ^ 2 = k * j : S, w j * a j ^ 2 + k * (k - 1) * (∑ j : S, w j * a j) ^ 2

E[(∑_l a(sel l))²] = k * E[a²] + k(k-1) * (E[a])².

Weighted averaging existence lemma #

theorem Chapter3.exists_le_of_weighted_sum {ι : Type u_1} [Fintype ι] (w f : ι) (c : ) (hw_nn : ∀ (i : ι), 0 w i) (hw_pos : 0 < i : ι, w i) (havg : i : ι, w i * f i (∑ i : ι, w i) * c) :
∃ (i : ι), f i c

Core probabilistic tool: averaging over product space #

theorem Chapter3.product_averaging_existence {n : } (S : Type) [Fintype S] [Nonempty S] (w : S) (hw_nn : ∀ (j : S), 0 w j) (hw_sum : j : S, w j = 1) (u : SFin n) (g : Fin n) (B : ) (hB : ∀ (j : S), i : Fin n, u j i ^ 2 B) (k : ) (hk : 1 k) :
have μ := fun (i : Fin n) => j : S, w j * u j i; ∃ (sel : Fin kS), i : Fin n, (g i - 1 / k * l : Fin k, u (sel l) i) ^ 2 i : Fin n, (g i - μ i) ^ 2 + B / k

Averaging principle for the product distribution (probabilistic method core). Proved via the bias-variance identity for finite product measures.

Fiberwise sum decomposition for Option-valued selections #

theorem Chapter3.sum_sel_eq_sum_card_mul {S : Type u_1} [Fintype S] [DecidableEq S] {k : } (sel : Fin kS) (f : S) :
l : Fin k, f (sel l) = j : S, {l : Fin k | sel l = j}.card * f j

Decompose a sum over Fin k by grouping according to the values of sel.

Maurey's sparse approximation #

theorem Chapter3.maurey_sparse_approx {n M : } (hn : 0 < n) (hM : 0 < M) (Φ : Matrix (Fin n) (Fin M) ) (f : Fin n) (D : ) (hD : 0 < D) (hNorm : ∀ (j : Fin M), i : Fin n, Φ i j ^ 2 D ^ 2 * n) (k : ) (hk : 1 k) (R : ) (hR : 0 < R) (θ : Fin M) ( : l1norm θ R) :
∃ (θ' : Fin M), support_size θ' 2 * k MSE (Φ.mulVec θ') f MSE (Φ.mulVec θ) f + D ^ 2 * R ^ 2 / k

Maurey's sparse approximation (existence via probabilistic method).

For any θ with ℓ₁ norm ≤ R, ∃ a 2k-sparse θ' with MSE(Φθ') ≤ MSE(Φθ) + D²R²/k.

The proof uses the probabilistic method: construct a discrete distribution on column indices, draw k independent samples, and average the corresponding atoms. The bias-variance identity for the product distribution gives the MSE bound. The existence of a good realization follows from the averaging principle.

theorem Chapter3.theorem_3_6_maurey {n M : } (hn : 0 < n) (hM : 0 < M) (Φ : Matrix (Fin n) (Fin M) ) (f : Fin n) (D : ) (hD : 0 < D) (hNorm : ∀ (j : Fin M), i : Fin n, Φ i j ^ 2 D ^ 2 * n) (k : ) (hk : 1 k) (R : ) (hR : 0 < R) :
⨅ (θ : { θ : Fin M // support_size θ 2 * k }), MSE (Φ.mulVec θ) f (⨅ (θ : { θ : Fin M // l1norm θ R }), MSE (Φ.mulVec θ) f) + D ^ 2 * R ^ 2 / k

Theorem 3.6 (Maurey's empirical method).

Let {φ₁,...,φ_M} be a dictionary with max_{j} |φ_j|₂ ≤ D√n. Then for any k ≥ 1 and R > 0: min_{|θ|₀≤2k} MSE(φ_θ) ≤ min_{|θ|₁≤R} MSE(φ_θ) + D²R²/k