Documentation

Atlas.HighDimensionalStatistics.code.Chapter2.Problem_2_6

Problem 2.6: Modified BIC estimator #

From Rigollet Chapter 2, Problem 2.6.

The standard BIC estimator (Definition 2.12) uses a uniform penalty τ²|θ|₀, which yields an MSE bound of order |θ*|₀ · σ² · log(ed) / n (Theorem 2.14). The factor log(ed) does not depend on the true sparsity |θ*|₀.

The modified BIC estimator replaces the uniform penalty with an adaptive penalty λ|θ|₀ · log(ed/|θ|₀) that depends on the sparsity level of the candidate θ itself. This yields the sharper rate:

MSE ≲ |θ*|₀ · σ² · log(ed / |θ*|₀) / n

which matches the ℓ₀-constrained LS rate of Corollary 2.9 (up to constants), eliminating the gap between log(ed) and log(ed/k). This shows the modified BIC can match the oracle benchmark without knowing the true sparsity k = |θ*|₀.

The proof follows the same "sup-out" technique as Theorem 2.14, but uses the refined combinatorial bound log(C(d,k)) ≤ k · log(ed/k) from Lemma 2.7.

noncomputable def modifiedBICObjective {n d : } (X : Matrix (Fin n) (Fin d) ) (Y : Fin n) (lam : ) (θ : Fin d) :

The modified BIC objective: (1/n)|Y - Xθ|₂² + λ · |θ|₀ · log(ed/|θ|₀).

Instances For
    def IsModifiedBICEstimator {n d : } (X : Matrix (Fin n) (Fin d) ) (Y : Fin n) (lam : ) (θhat : Fin d) :

    θ̂ is a modified BIC estimator if it minimizes the modified BIC objective.

    Instances For
      theorem problem_2_6_modified_BIC_rate {n d : } (hn : 0 < n) (hd : 0 < d) (X : Matrix (Fin n) (Fin d) ) (θstar : Fin d) (eps : Fin n) (hne : Rigollet.l0norm θstar 0) (σsq : ) ( : 0 < σsq) (lam : ) (θhat : Fin d) (hmod : IsModifiedBICEstimator X (X.mulVec θstar + eps) lam θhat) (hnoise : ∀ (j : Fin d), |i : Fin n, X i j * eps i| n * (2 * σsq * Real.log (2 * d) / n)) :
      1 / n * X.mulVec (θhat - θstar) ⬝ᵥ X.mulVec (θhat - θstar) 576 * (Rigollet.l0norm θstar) * σsq * Real.log (Real.exp 1 * d / (Rigollet.l0norm θstar)) / n

      Problem 2.6. Under sub-Gaussian noise, the modified BIC estimator with appropriately chosen λ satisfies: MSE(Xθ̂) = (1/n)|X(θ̂ - θ*)|₂² ≤ C · |θ*|₀ · σ² · log(ed/|θ*|₀) / n with probability at least 0.99.