Problem 5.4: σ²d/n is the Minimax Rate Over Various Parameter Sets — Exercise #
From Chapter 5.5 of Rigollet's "High Dimensional Statistics."
Exercise for the reader. This problem is left as an exercise in the
textbook and does not need to be fully proven in the formalization.
The sorry instances below are intentional — each part requires composing
the Varshamov-Gilbert construction with Theorem 5.11, which mirrors the
Corollary 5.13 proof but for different parameter sets.
Problem 5.4. Show that the rate φ = σ²d/n is the minimax rate of estimation in the Gaussian sequence model over: (a) the Euclidean ball of ℝ^d with radius σ²d/n, (b) the unit ℓ∞ ball of ℝ^d (as long as σ² ≤ n), (c) the set of nonnegative vectors of ℝ^d, (d) the discrete hypercube (σ/(16√n)){0,1}^d.
All four parts follow the same strategy as Corollary 5.13:
- Upper bound: The identity estimator θ̂^LS = Y achieves risk exactly σ²d/n at every θ ∈ ℝ^d. Since each of the four sets Θ ⊆ ℝ^d, the sup-risk over Θ is at most σ²d/n.
- Lower bound: The Varshamov-Gilbert construction from Corollary 5.13 produces hypotheses θ₁, …, θ_M that lie in each of these sets (with appropriate parameter choices). Theorem 5.11 then gives the matching lower bound.
The four parameter sets from Problem 5.4 #
Problem 5.4(a): Minimax rate σ²d/n over the Euclidean ball #
The Euclidean ball B₂(r) with r = σ²d/n contains the Varshamov-Gilbert hypotheses θⱼ = ωⱼ · βσ/√n, since ‖θⱼ‖₂² = β²σ²|ωⱼ|₀/n ≤ β²σ²d/n ≤ σ²d/n. The identity estimator achieves rate σ²d/n.
Problem 5.4(a). The minimax rate of estimation over B₂(σ²d/n) is σ²d/n.
Lower bound: The VG hypotheses θⱼ = ωⱼ · βσ/√n satisfy ‖θⱼ‖₂² ≤ σ²d/n (since β ≤ 1), so they lie in B₂(σ²d/n). Theorem 5.11 gives the lower bound.
Upper bound: Since B₂(σ²d/n) ⊆ ℝ^d, the identity estimator has risk E_θ[‖Y − θ‖₂²] = σ²d/n for every θ.
Problem 5.4(b): Minimax rate σ²d/n over the ℓ∞ ball (when σ² ≤ n) #
The ℓ∞ ball B∞(1) contains the VG hypotheses θⱼ = ωⱼ · βσ/√n when βσ/√n ≤ 1, i.e., when σ² ≤ n/β². Since β can be chosen ≤ 1, the condition σ² ≤ n suffices.
Problem 5.4(b). The minimax rate of estimation over B∞(1) is σ²d/n (when σ² ≤ n).
Lower bound: The condition σ² ≤ n ensures each coordinate of the VG hypotheses satisfies |θⱼᵢ| = βσ/√n ≤ 1, so they lie in B∞(1).
Upper bound: Since B∞(1) ⊆ ℝ^d, the identity estimator achieves σ²d/n.
Problem 5.4(c): Minimax rate σ²d/n over the nonnegative orthant #
The nonnegative orthant ℝ₊^d contains the VG hypotheses θⱼ = ωⱼ · βσ/√n trivially, since each coordinate is either 0 or βσ/√n > 0.
Problem 5.4(c). The minimax rate of estimation over ℝ₊^d is σ²d/n.
Lower bound: The VG hypotheses have coordinates in {0, βσ/√n} ⊆ [0, ∞), so they lie in the nonnegative orthant.
Upper bound: Since ℝ₊^d ⊆ ℝ^d, the identity estimator achieves σ²d/n.
Problem 5.4(d): Minimax rate σ²d/n over the scaled discrete hypercube #
The scaled hypercube (σ/(16√n)){0,1}^d is exactly the set of VG hypotheses when β = 1/16. Each θⱼ has coordinates in {0, σ/(16√n)}, so θⱼ lies in the scaled hypercube by definition.
Problem 5.4(d). The minimax rate over (σ/(16√n)){0,1}^d is σ²d/n.
Lower bound: With β = 1/16, the VG hypotheses θⱼ = ωⱼ · σ/(16√n) lie exactly in the scaled hypercube, and Theorem 5.11 gives the lower bound.
Upper bound: Since the scaled hypercube ⊆ ℝ^d, the identity estimator achieves σ²d/n.
Problem 5.4 (combined). The rate φ = σ²d/n is the minimax rate of estimation in the GSM over: (a) B₂(σ²d/n), (b) B∞(1) (when σ² ≤ n), (c) ℝ₊^d, (d) (σ/(16√n)){0,1}^d.
For each set Θ, there exist constants C, C' > 0 such that: C' · σ²d/n ≤ inf_{θ̂} sup_{θ∈Θ} E_θ[‖θ̂−θ‖₂²] ≤ C · σ²d/n.