Corollary 3.7: Combined ℓ₀/ℓ₁ Oracle Inequality via Maurey #

Under the assumptions of Theorem 3.4 (BIC sparse oracle inequality) and a normalized dictionary (max_j |φ_j|₂ ≤ √n), Maurey's argument (Theorem 3.6) yields:

MSE(φ_{θ̂}) ≤ inf_θ { 2·MSE(φ_θ) + C · min(σ²|θ|₀ log(eM)/n, σ|θ|₁ √(log(eM)/n)) } + C σ² log(1/δ) / n

with probability at least 1 − δ.

The proof combines:

The ℓ₀ oracle inequality from Thm 3.4 (with α = 1/3).
Maurey's approximation (Thm 3.6) converting ℓ₁ to ℓ₀ bounds. Key step: optimizing min_k(R²/k + Cσ²k log(eM)/n) ≈ σR√(log(eM)/n).

Local definitions #

source

noncomputable def Rigollet.Chapter3.MSE_37 {n : ℕ} (fhat f : Fin n → ℝ) :

ℝ

MSE of a prediction vector f̂ relative to the true signal f

Instances For

source

noncomputable def Rigollet.Chapter3.support_size_37 {M : ℕ} (θ : Fin M → ℝ) :

ℕ

ℓ₀ "norm" (support size) of a vector

Instances For

source

noncomputable def Rigollet.Chapter3.l1norm_37 {M : ℕ} (θ : Fin M → ℝ) :

ℝ

ℓ₁ norm of a vector

Instances For

source

def Rigollet.Chapter3.DictNormalized {n M : ℕ} (Φ : Matrix (Fin n) (Fin M) ℝ) :

Prop

Dictionary normalization condition: max_j |φ_j|₂ ≤ √n. Equivalently, max_j Σᵢ (Φ_{i,j})² ≤ n. This is the normalization assumed in Corollary 3.7 (D = 1 case of Thm 3.6).

Instances For

Algebraic core #

source

theorem Rigollet.Chapter3.le_add_min_of_le_both (x a t₁ t₂ c : ℝ) (h₁ : x ≤ a + t₁ + c) (h₂ : x ≤ a + t₂ + c) :

x ≤ a + min t₁ t₂ + c

If x ≤ a + t₁ + c and x ≤ a + t₂ + c, then x ≤ a + min t₁ t₂ + c.

source

theorem Rigollet.Chapter3.cor_3_7_combined_oracle_det {n M : ℕ} (Φ : Matrix (Fin n) (Fin M) ℝ) (f : Fin n → ℝ) (θhat : Fin M → ℝ) (σ C logM_n tail : ℝ) (h_l0 : ∀ (θ : Fin M → ℝ), MSE_37 (Φ.mulVec θhat) f ≤ 2 * MSE_37 (Φ.mulVec θ) f + C * σ ^ 2 * ↑(support_size_37 θ) * logM_n + tail) (h_l1 : ∀ (θ : Fin M → ℝ), MSE_37 (Φ.mulVec θhat) f ≤ 2 * MSE_37 (Φ.mulVec θ) f + C * σ * l1norm_37 θ * √logM_n + tail) (θ : Fin M → ℝ) :

MSE_37 (Φ.mulVec θhat) f ≤ 2 * MSE_37 (Φ.mulVec θ) f + min (C * σ ^ 2 * ↑(support_size_37 θ) * logM_n) (C * σ * l1norm_37 θ * √logM_n) + tail

Corollary 3.7 (algebraic core): Combined ℓ₀/ℓ₁ oracle inequality.

Given:

An ℓ₀ oracle inequality (Thm 3.4, α = 1/3): MSE_hat ≤ 2 · MSE(φ_θ) + C σ² |θ|₀ log(eM)/n + tail
An ℓ₁ oracle inequality (Thm 3.4 + Thm 3.6 / Maurey): MSE_hat ≤ 2 · MSE(φ_θ) + C σ |θ|₁ √(log(eM)/n) + tail

Conclude the combined bound with min of both penalties.

Bridge lemmas between Chapter3 and Rigollet.Chapter3 definitions #

source

theorem Rigollet.Chapter3.MSE_37_eq_Chapter3_MSE {n : ℕ} (fhat f : Fin n → ℝ) :

MSE_37 fhat f = Chapter3.MSE fhat f

source

theorem Rigollet.Chapter3.support_size_37_eq_Chapter3 {M : ℕ} (θ : Fin M → ℝ) :

support_size_37 θ = Chapter3.support_size θ

source

theorem Rigollet.Chapter3.l1norm_37_eq_Chapter3 {M : ℕ} (θ : Fin M → ℝ) :

l1norm_37 θ = Chapter3.l1norm θ

Maurey's approximation (Theorem 3.6) #

Theorem 3.6 states: for any θ' ∈ ℝ^M and any k ≥ 1, there exists θ with |θ|₀ ≤ 2k such that MSE(φ_θ) ≤ MSE(φ_{θ'}) + |θ'|₁²/k, provided the dictionary is normalized (max_j |φ_j|₂ ≤ √n, i.e., D = 1).

Derived from Chapter3.maurey_sparse_approx (the probabilistic core, axiomatized because the proof uses random rounding not available in Mathlib). Reference: Rigollet, Theorem 3.6 (Maurey's empirical method).

source

theorem Rigollet.Chapter3.maurey_approximation_37 {n M : ℕ} (hn : 0 < n) (hM : 0 < M) (Φ : Matrix (Fin n) (Fin M) ℝ) (f : Fin n → ℝ) (hNorm : DictNormalized Φ) (θ' : Fin M → ℝ) (k : ℕ) (hk : 1 ≤ k) :

∃ (θ : Fin M → ℝ), support_size_37 θ ≤ 2 * k ∧ MSE_37 (Φ.mulVec θ) f ≤ MSE_37 (Φ.mulVec θ') f + l1norm_37 θ' ^ 2 / ↑k

Support size is bounded by M #

source

theorem Rigollet.Chapter3.support_size_37_le_M {M : ℕ} (θ : Fin M → ℝ) :

support_size_37 θ ≤ M

Algebraic optimization lemma for Maurey's argument #

For positive reals a, b and k in [√(a/b)/2, √(a/b)], we have a/k + b·k ≤ 3√(ab). This is the key bound for optimizing the Maurey+ℓ₀ tradeoff.

source

theorem Rigollet.Chapter3.opt_bound_sqrt (a b k : ℝ) (ha : 0 < a) (hb : 0 < b) (hk_pos : 0 < k) (hk_le : k ≤ √(a / b)) (hk_ge : √(a / b) / 2 ≤ k) :

a / k + b * k ≤ 3 * √(a * b)

source

theorem Rigollet.Chapter3.floor_ge_half_of_ge_one (x : ℝ) (hx : 1 ≤ x) :

x / 2 ≤ ↑⌊x⌋₊

⌊x⌋ ≥ x/2 when x ≥ 1

Helper lemmas for the three-case bound #

source

theorem Rigollet.Chapter3.six_sqrt_le_of_ge_36 (C : ℝ) (hC : 0 < C) (hC36 : 36 ≤ C) :

6 * √C ≤ C

6√C ≤ C when C ≥ 36

source

theorem Rigollet.Chapter3.sqrt_ratio_eq (R σ C L : ℝ) (hR : 0 < R) (hσ : 0 < σ) (hCL : 0 < C * L) :

√(R ^ 2 / (C * σ ^ 2 * L)) = R / (σ * √(C * L))

√(R²/(Cσ²L)) = R/(σ√(CL)) for positive R, σ, CL

source

theorem Rigollet.Chapter3.maurey_opt_algebra (R σ C L k : ℝ) (hR : 0 < R) (hσ : 0 < σ) (hC : 0 < C) (hL : 0 < L) (hk : 0 < k) (hk_le : k ≤ R / (σ * √(C * L))) (hk_ge : R / (σ * √(C * L)) / 2 ≤ k) (hC36 : 36 ≤ C) :

2 * R ^ 2 / k + 2 * (C * σ ^ 2 * L) * k ≤ C * σ * R * √L

The key algebraic bound: 2R²/k + 2Cσ²kL ≤ CσR√L when k ∈ [kbar/2, kbar] and C ≥ 36. Here kbar = R/(σ√(CL)).

Maurey + three-case optimization (key step of Cor 3.7 proof) #

The proof of Corollary 3.7 derives the ℓ₁ bound from the ℓ₀ bound via:

Maurey's approximation (Thm 3.6): for any θ' and k, ∃θ with |θ|₀ ≤ 2k and MSE(θ) ≤ MSE(θ') + |θ'|₁²/k.
Substituting this θ into the ℓ₀ oracle inequality and optimizing over k.

The bound 2R²/k + 2Cσ²kL ≤ 3√(4CR²σ²L) = 6√C·σR√L ≤ CσR√L (when C ≥ 36) requires k̄ = R/(σ√(CL)) ≥ 1 so that the integer floor ⌊k̄⌋ ≥ 1.

Hypotheses: hC_large ensures the constant absorption works (6√C ≤ C), and hR_lower ensures k̄ ≥ 1 (the "Case 1" regime of the three-case analysis). The complementary case (k̄ < 1, "Case 2") is handled separately in the caller.

Reference: Rigollet, Corollary 3.7 proof, "three cases for k̄".

source

theorem Rigollet.Chapter3.maurey_three_case_bound {n M : ℕ} (_hn : 0 < n) (_hM : 0 < M) (Φ : Matrix (Fin n) (Fin M) ℝ) (f : Fin n → ℝ) (_hNorm : DictNormalized Φ) (σ : ℝ) (hσ : 0 < σ) (C_const : ℝ) (hC : 0 < C_const) (fhat' : Fin n → ℝ) (θ' : Fin M → ℝ) (tail : ℝ) (hMaurey : ∀ (k : ℕ), 1 ≤ k → ∃ (θ : Fin M → ℝ), support_size_37 θ ≤ 2 * k ∧ MSE_37 (Φ.mulVec θ) f ≤ MSE_37 (Φ.mulVec θ') f + l1norm_37 θ' ^ 2 / ↑k) (h_l0 : ∀ (θ : Fin M → ℝ), MSE_37 fhat' f ≤ 2 * MSE_37 (Φ.mulVec θ) f + C_const * σ ^ 2 * ↑(support_size_37 θ) * Real.log (Real.exp 1 * ↑M) / ↑n + tail) (hC_large : 36 ≤ C_const) (hR_lower : σ * √(C_const * (Real.log (Real.exp 1 * ↑M) / ↑n)) ≤ l1norm_37 θ') :

MSE_37 fhat' f ≤ 2 * MSE_37 (Φ.mulVec θ') f + C_const * σ * l1norm_37 θ' * √(Real.log (Real.exp 1 * ↑M) / ↑n) + tail

Bridge lemmas between MSE_34/MSE_37 and support_size_34/support_size_37 #

source

theorem Rigollet.Chapter3.MSE_34_eq_MSE_37 {n : ℕ} (fhat f : Fin n → ℝ) :

MSE_34 fhat f = MSE_37 fhat f

source

theorem Rigollet.Chapter3.support_size_34_eq_37 {M : ℕ} (θ : Fin M → ℝ) :

support_size_34 θ = support_size_37 θ

Cauchy-Schwarz helper for finite sums #

source

theorem Rigollet.Chapter3.abs_sum_mul_le_sqrt_mul_sqrt {n : ℕ} (f g : Fin n → ℝ) :

|∑ i : Fin n, f i * g i| ≤ √(∑ i : Fin n, f i ^ 2) * √(∑ i : Fin n, g i ^ 2)

Cauchy-Schwarz for finite sums: |∑ fᵢgᵢ| ≤ √(∑fᵢ²) · √(∑gᵢ²)

Corollary 3.7: Probabilistic combined ℓ₀/ℓ₁ oracle inequality #

This is the main result. It wraps the deterministic algebraic core with the probabilistic content from Theorem 3.4, specialized to α = 1/3.

source

theorem Rigollet.Chapter3.cor_3_7_probabilistic {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n M : ℕ} (hn : 0 < n) (hM : 0 < M) (Φ : Matrix (Fin n) (Fin M) ℝ) (f : Fin n → ℝ) (ε : Ω → Fin n → ℝ) (θhat : Ω → Fin M → ℝ) (σ : ℝ) (hσ : 0 < σ) (δ : ℝ) (hδ_pos : 0 < δ) (hδ_le : δ ≤ 1) (hsubG : IsSubGaussianNoiseMGF μ ε (σ ^ 2)) (hBIC : ∀ (ω : Ω) (θ : Fin M → ℝ), (f + ε ω - Φ.mulVec (θhat ω)) ⬝ᵥ (f + ε ω - Φ.mulVec (θhat ω)) + 16 * σ ^ 2 / (1 / 3) * Real.log (6 * Real.exp 1 * ↑M) * ↑(support_size_34 (θhat ω)) ≤ (f + ε ω - Φ.mulVec θ) ⬝ᵥ (f + ε ω - Φ.mulVec θ) + 16 * σ ^ 2 / (1 / 3) * Real.log (6 * Real.exp 1 * ↑M) * ↑(support_size_34 θ)) (hNorm : DictNormalized Φ) :

∃ (C_const : ℝ), 0 < C_const ∧ μ {ω : Ω | ∀ (θ : Fin M → ℝ), MSE_37 (Φ.mulVec (θhat ω)) f ≤ 2 * MSE_37 (Φ.mulVec θ) f + min (C_const * σ ^ 2 * ↑(support_size_37 θ) * (Real.log (Real.exp 1 * ↑M) / ↑n)) (C_const * σ * l1norm_37 θ * √(Real.log (Real.exp 1 * ↑M) / ↑n)) + C_const * σ ^ 2 * Real.log (1 / δ) / ↑n} ≥ ENNReal.ofReal (1 - δ)

Corollary 3.7 (Rigollet). With probability at least 1 − δ, the BIC estimator θ̂^BIC satisfies:

MSE(φ_{θ̂^BIC}) ≤ inf_θ { 2 · MSE(φ_θ) + C · [σ² |θ|₀ log(eM)/n ∧ σ |θ|₁ √(log(eM)/n)] } + C · σ² log(1/δ)/n

The proof specializes Theorem 3.4 to α = 1/3 (giving the factor 2 oracle inequality), then uses Maurey's argument (Theorem 3.6) to derive the ℓ₁ bound, and combines both via the min.

Hypotheses:

(Ω, μ) is a probability space
ε is sub-Gaussian noise with variance proxy σ²
θ̂ is the BIC estimator (minimizes |Y − Φθ|² + nτ²|θ|₀)
The dictionary is normalized: max_j |φ_j|₂ ≤ √n
0 < δ ≤ 1 is the confidence parameter

Documentation

Atlas.HighDimensionalStatistics.code.Chapter3.Cor_3_7

Corollary 3.7: Combined ℓ₀/ℓ₁ Oracle Inequality via Maurey #

Local definitions #

Algebraic core #

Bridge lemmas between Chapter3 and Rigollet.Chapter3 definitions #

Maurey's approximation (Theorem 3.6) #

Support size is bounded by M #

Algebraic optimization lemma for Maurey's argument #

Helper lemmas for the three-case bound #

Maurey + three-case optimization (key step of Cor 3.7 proof) #

Bridge lemmas between MSE_34/MSE_37 and support_size_34/support_size_37 #

Cauchy-Schwarz helper for finite sums #

Corollary 3.7: Probabilistic combined ℓ₀/ℓ₁ oracle inequality #