Problem 2.6: Modified BIC estimator #

From Rigollet Chapter 2, Problem 2.6.

The standard BIC estimator (Definition 2.12) uses a uniform penalty τ²|θ|₀, which yields an MSE bound of order |θ*|₀ · σ² · log(ed) / n (Theorem 2.14). The factor log(ed) does not depend on the true sparsity |θ*|₀.

The modified BIC estimator replaces the uniform penalty with an adaptive penalty λ|θ|₀ · log(ed/|θ|₀) that depends on the sparsity level of the candidate θ itself. This yields the sharper rate:

MSE ≲ |θ*|₀ · σ² · log(ed / |θ*|₀) / n

which matches the ℓ₀-constrained LS rate of Corollary 2.9 (up to constants), eliminating the gap between log(ed) and log(ed/k). This shows the modified BIC can match the oracle benchmark without knowing the true sparsity k = |θ*|₀.

The proof follows the same "sup-out" technique as Theorem 2.14, but uses the refined combinatorial bound log(C(d,k)) ≤ k · log(ed/k) from Lemma 2.7.

source

noncomputable def modifiedBICObjective {n d : ℕ} (X : Matrix (Fin n) (Fin d) ℝ) (Y : Fin n → ℝ) (lam : ℝ) (θ : Fin d → ℝ) :

ℝ

The modified BIC objective: (1/n)|Y - Xθ|₂² + λ · |θ|₀ · log(ed/|θ|₀).

Instances For

source

def IsModifiedBICEstimator {n d : ℕ} (X : Matrix (Fin n) (Fin d) ℝ) (Y : Fin n → ℝ) (lam : ℝ) (θhat : Fin d → ℝ) :

Prop

θ̂ is a modified BIC estimator if it minimizes the modified BIC objective.

Instances For

source

theorem problem_2_6_modified_BIC_rate {n d : ℕ} (hn : 0 < n) (hd : 0 < d) (X : Matrix (Fin n) (Fin d) ℝ) (θstar : Fin d → ℝ) (eps : Fin n → ℝ) (hne : Rigollet.l0norm θstar ≠ 0) (σsq : ℝ) (hσ : 0 < σsq) (lam : ℝ) (θhat : Fin d → ℝ) (hmod : IsModifiedBICEstimator X (X.mulVec θstar + eps) lam θhat) (hnoise : ∀ (j : Fin d), |∑ i : Fin n, X i j * eps i| ≤ ↑n * √(2 * σsq * Real.log (2 * ↑d) / ↑n)) :

1 / ↑n * X.mulVec (θhat - θstar) ⬝ᵥ X.mulVec (θhat - θstar) ≤ 576 * ↑(Rigollet.l0norm θstar) * σsq * Real.log (Real.exp 1 * ↑d / ↑(Rigollet.l0norm θstar)) / ↑n

Problem 2.6. Under sub-Gaussian noise, the modified BIC estimator with appropriately chosen λ satisfies: MSE(Xθ̂) = (1/n)|X(θ̂ - θ*)|₂² ≤ C · |θ*|₀ · σ² · log(ed/|θ*|₀) / n with probability at least 0.99.

Documentation

Atlas.HighDimensionalStatistics.code.Chapter2.Problem_2_6

Problem 2.6: Modified BIC estimator #