Atlas.HighDimensionalStatistics.code.Chapter3.Thm_3

L₂ norm on [0,1] #

source

noncomputable def Chapter3.L2normSq (g : ℝ → ℝ) :

ℝ

The L₂([0,1]) squared norm: ‖g‖²_{L₂} = ∫₀¹ g(x)² dx

Instances For

source

noncomputable def Chapter3.empiricalNormSq (n : ℕ) (g : ℝ → ℝ) :

ℝ

The empirical squared norm: ‖g‖²_n = (1/n) Σᵢ g(Xᵢ)² with Xᵢ = (i-1)/n

Instances For

Sub-Gaussian noise predicate #

source

structure Chapter3.IsSubGaussianNoiseVec {Ω : Type u_1} [MeasurableSpace Ω] {n : ℕ} (ε : Ω → Fin n → ℝ) (σsq : ℝ) (μ : MeasureTheory.Measure Ω) :

Prop

Sub-Gaussian noise: each component εᵢ satisfies the MGF bound E[exp(s·εᵢ)] ≤ exp(σ²s²/2), and the associated integrability conditions. This is ε ~ subG_n(σ²) from the book.

bound (i : Fin n) (s : ℝ) : ∫ (ω : Ω), Real.exp (s * ε ω i) ∂μ ≤ Real.exp (σsq * s ^ 2 / 2)
integ (i : Fin n) : MeasureTheory.Integrable (fun (ω : Ω) => ε ω i) μ
exp_integ (i : Fin n) (s : ℝ) : MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (s * ε ω i)) μ
weighted_sum_bound (w : Fin n → ℝ) (s : ℝ) : ∫ (ω : Ω), Real.exp (s * ∑ i : Fin n, w i * ε ω i) ∂μ ≤ Real.exp ((σsq * ∑ i : Fin n, w i ^ 2) * s ^ 2 / 2)

Instances For

source

@[implicit_reducible]

instance Chapter3.instCoeFunIsSubGaussianNoiseVecForallForallLeRealIntegralExpHMulHDivHPowOfNat {Ω : Type u_1} [MeasurableSpace Ω] {n : ℕ} {ε : Ω → Fin n → ℝ} {σsq : ℝ} {μ : MeasureTheory.Measure Ω} :

CoeFun (IsSubGaussianNoiseVec ε σsq μ) fun (x : IsSubGaussianNoiseVec ε σsq μ) => ∀ (i : Fin n) (s : ℝ), ∫ (ω : Ω), Real.exp (s * ε ω i) ∂μ ≤ Real.exp (σsq * s ^ 2 / 2)

Coercion so that hsubG i s still works as shorthand for hsubG.bound i s.

Regression model and estimator #

source

noncomputable def Chapter3.trigLinComb (M : ℕ) (θ : Fin M → ℝ) (x : ℝ) :

ℝ

The trigonometric linear combination: φ_θ(x) = Σⱼ θⱼ φⱼ(x)

Instances For

source

def Chapter3.IsLSEstimator {Ω : Type u_1} {n M : ℕ} (Y : Ω → Fin n → ℝ) (θhat : Ω → Fin M → ℝ) :

Prop

The LS estimator: minimizes (1/n) Σᵢ (Yᵢ - φ_θ(Xᵢ))² over θ ∈ ℝᴹ, with Xᵢ = (i-1)/n (regular design).

Instances For

Key algebraic lemma: the exponent computation #

source

theorem Chapter3.rpow_exponent_combine (n β : ℝ) (hn : 0 < n) :

(n ^ (1 / (2 * β + 1))) ^ (-(2 * β)) = n ^ (-(2 * β) / (2 * β + 1))

Key computation: (n^{1/(2β+1)})^{-2β} = n^{-2β/(2β+1)}

source

theorem Chapter3.rpow_M_div_n (n β : ℝ) (hn : 0 < n) (hβ : 0 < β) :

n ^ (1 / (2 * β + 1)) / n = n ^ (-(2 * β) / (2 * β + 1))

Key computation: n^{1/(2β+1)} / n = n^{-2β/(2β+1)}

Oracle inequality in L₂ form (helper for Theorem 3.15) #

The oracle inequality for the LS estimator (Theorem 3.3) combined with the orthogonality of the trigonometric basis (Lemma 3.13, ORT) gives:

With probability ≥ 1 − δ: ‖φ_{θ̂^LS} − φ_{θ*}^M‖²_{L₂} ≤ C₁ · (n^{1−2β} + σ²(M + log(1/δ))/n)

This is obtained from (3.12) by:

Choosing α = 1/2
Using Lemma 3.13: empirical norm = n · L₂ norm for trig basis
Bounding the infimum (bias term) using the Sobolev class bound

Book proof lines 2496–2498. The proof of this helper combines:

The matrix-form oracle inequality from Thm_3_3.lean
The orthogonality result from Lemma_3_13.lean
Converting between empirical and L₂ norms These results are formalized in other files but with different type signatures (Matrix/dotProduct vs continuous L₂/integral). The type-bridging infrastructure to connect them is not available, so this is stated with sorry representing the established oracle inequality in L₂ form.

ORT norm equivalence for trigonometric linear combinations #

For the trigonometric basis on the regular grid X_i = i/n with M ≤ n-1, the L₂ norm and empirical norm coincide for any trigonometric linear combination. Specifically: ∫₀¹ (Σⱼ θⱼ φⱼ(x))² dx = (1/n) Σᵢ (Σⱼ θⱼ φⱼ(i/n))².

This is a direct consequence of Lemma 3.13 (ORT: ΦᵀΦ = nI), which implies that the Gram matrix of the design matrix is n times the identity. For any coefficient vector θ, the empirical squared norm (1/n)|Φθ|² equals ‖θ‖²_{ℓ₂}, and by orthonormality of the trigonometric basis in L₂([0,1]), ∫₀¹ (Σⱼ θⱼ φⱼ(x))² dx also equals ‖θ‖²_{ℓ₂}.

The proof requires bridging between Matrix/dotProduct types (from Lemma_3_13.lean) and continuous L₂/integral types (used here). This type-bridging infrastructure is axiomatized since the mathematical content follows directly from Lemma 3.13 but the formalization uses incompatible type representations.

Book line 2485: "Lemma 3.13 implies |φ − φ|₂² = n ‖φ − φ‖²_{L₂([0,1])}".

source

theorem Chapter3.trigDesignMatrix_eq_trigBasis (n M : ℕ) (hn : 0 < n) (i : Fin n) (j : Fin M) :

trigDesignMatrix n M i j = trigBasis (↑j + 1) (↑↑i / ↑n)

Helper: the connection between trigBasis and trigDesignMatrix entries. trigDesignMatrix n M i j = trigBasis (j.val + 1) (i / n)

source

theorem Chapter3.empirical_norm_eq_sum_sq (n M : ℕ) (hn : 0 < n) (hM : M ≤ n - 1) (θ : Fin M → ℝ) :

1 / ↑n * ∑ i : Fin n, (∑ j : Fin M, θ j * trigBasis (↑j + 1) (↑↑i / ↑n)) ^ 2 = ∑ j : Fin M, θ j ^ 2

Helper: the empirical norm of a trig linear combination equals the ℓ² norm of θ. (1/n) ∑ᵢ (∑ⱼ θⱼ φⱼ(xᵢ))² = ∑ⱼ (θⱼ)² This follows from Lemma 3.13: ΦᵀΦ = nI.

source

theorem Chapter3.trigBasis_L2_orthonormal (j j' : ℕ) (hj : 0 < j) (hj' : 0 < j') :

∫ (x : ℝ) in Set.Icc 0 1, trigBasis j x * trigBasis j' x = if j = j' then 1 else 0

Helper: L₂ orthonormality of trigBasis. The trigonometric basis {φ₁, φ₂, ...} is orthonormal in L₂([0,1]): ∫₀¹ φⱼ(x) φⱼ'(x) dx = δⱼⱼ' for j, j' ≥ 1

This follows from standard properties of trigonometric functions:

∫₀¹ 1² dx = 1
∫₀¹ 2·cos²(2πkx) dx = 1
∫₀¹ 2·sin²(2πkx) dx = 1
∫₀¹ cos(2πjx)·sin(2πkx) dx = 0
∫₀¹ cos(2πjx)·cos(2πkx) dx = 0 for j ≠ k
∫₀¹ sin(2πjx)·sin(2πkx) dx = 0 for j ≠ k

The book references this as a standard fact (Example 3.8). Proving all these integration results from scratch is extremely tedious. We axiomatize L₂ orthonormality of trigBasis since the book does not prove it explicitly — it is cited as a well-known result from Fourier analysis.

source

theorem Chapter3.L2_norm_eq_sum_sq (M : ℕ) (θ : Fin M → ℝ) :

(L2normSq fun (x : ℝ) => ∑ j : Fin M, θ j * trigBasis (↑j + 1) x) = ∑ j : Fin M, θ j ^ 2

The L₂ norm of a trig linear combination equals the ℓ² norm of θ. ∫₀¹ (∑ⱼ θⱼ φⱼ₊₁(x))² dx = ∑ⱼ (θⱼ)² This follows from L₂ orthonormality of trigBasis.

source

theorem Chapter3.ORT_L2_eq_empirical_normSq (n M : ℕ) (hn : 0 < n) (hM : M ≤ n - 1) (θ : Fin M → ℝ) :

(L2normSq fun (x : ℝ) => ∑ j : Fin M, θ j * trigBasis (↑j + 1) x) = 1 / ↑n * ∑ i : Fin n, (∑ j : Fin M, θ j * trigBasis (↑j + 1) (↑↑i / ↑n)) ^ 2

source

theorem Chapter3.trigLinComb_eq_mulVec (n M : ℕ) (hn : 0 < n) (θ : Fin M → ℝ) (i : Fin n) :

trigLinComb M θ (↑↑i / ↑n) = (trigDesignMatrix n M).mulVec θ i

Bridge: trigLinComb at grid point equals matrix-vector product. trigLinComb M θ (i/n) = (trigDesignMatrix n M).mulVec θ i

source

theorem Chapter3.finset_range_eq_fin_sum (M : ℕ) (θstar : ℕ → ℝ) (x : ℝ) :

∑ j ∈ Finset.range M, θstar j * trigBasis (j + 1) x = ∑ j : Fin M, θstar ↑j * trigBasis (↑j + 1) x

Bridge: Finset.range M sum with θstar equals Fin M sum. Converts ∑ j ∈ Finset.range M, θstar j * trigBasis (j+1) x to ∑ j : Fin M, θstar j.val * trigBasis (j.val+1) x.

source

theorem Chapter3.empirical_diff_eq_coeff_sq (n M : ℕ) (hn : 0 < n) (hM : M ≤ n - 1) (θ₁ θ₂ : Fin M → ℝ) :

1 / ↑n * ∑ i : Fin n, (∑ j : Fin M, θ₁ j * trigBasis (↑j + 1) (↑↑i / ↑n) - ∑ j : Fin M, θ₂ j * trigBasis (↑j + 1) (↑↑i / ↑n)) ^ 2 = ∑ j : Fin M, (θ₁ j - θ₂ j) ^ 2

The empirical difference norm as a sum of coefficient differences. Shows that the empirical norm of the difference between two trig linear combinations equals the ℓ² norm of their coefficient differences. Key bridge: by empirical_norm_eq_sum_sq (from Lemma 3.13).

source

theorem Chapter3.trigBasis_col_sq_sum (n M : ℕ) (hn : 0 < n) (hM : M ≤ n - 1) (j : Fin M) :

∑ i : Fin n, trigBasis (↑j + 1) (↑↑i / ↑n) ^ 2 = ↑n

Sum of squares of trigBasis column values equals n. From Lemma 3.13 (ΦᵀΦ = nI), the (j,j) diagonal entry gives ∑ᵢ trigBasis(j+1)(i/n)² = n.

source

theorem Chapter3.ls_noise_coord_subG_tail {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n : ℕ} (hn : 0 < n) (M : ℕ) (_hM_pos : 0 < M) (hM_le : M ≤ n - 1) (σ : ℝ) (hσ : 0 < σ) (ε : Ω → Fin n → ℝ) (_hsubG : IsSubGaussianNoiseVec ε (σ ^ 2) μ) (hmgf : ∀ (w : Fin n → ℝ) (s : ℝ), ∫ (ω : Ω), Real.exp (s * ∑ i : Fin n, ε ω i * w i) ∂μ ≤ Real.exp ((σ ^ 2 * ∑ i : Fin n, w i ^ 2) * s ^ 2 / 2)) (hintegrable : ∀ (w : Fin n → ℝ) (s : ℝ), MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (s * ∑ i : Fin n, ε ω i * w i)) μ) (j : Fin M) (t : ℝ) (ht : 0 < t) :

μ {ω : Ω | |∑ i : Fin n, ε ω i * trigBasis (↑j + 1) (↑↑i / ↑n)| / ↑n ≥ t} ≤ ENNReal.ofReal (2 * Real.exp (-↑n * t ^ 2 / (2 * σ ^ 2)))

Per-coordinate sub-Gaussian tail: for each j, the noise contribution (1/n) Σᵢ εᵢ φⱼ(xᵢ) satisfies a sub-Gaussian tail bound.

source

theorem Chapter3.subG_chi2_complement_bound {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (M : ℕ) (hM_pos : 0 < M) (Z : Fin M → Ω → ℝ) (v : ℝ) (hv : 0 < v) (hZ_subG : ∀ (j : Fin M) (s : ℝ), ∫ (ω : Ω), Real.exp (s * Z j ω) ∂μ ≤ Real.exp (v * s ^ 2 / 2)) (δ : ℝ) (hδ : 0 < δ) (hδ1 : δ ≤ 1) :

μ {ω : Ω | 4 * ↑M * v * (↑M + Real.log (1 / δ)) < ∑ j : Fin M, Z j ω ^ 2} ≤ ENNReal.ofReal δ

Sub-Gaussian complement bound (helper from Theorem 1.19 proof).

For M sub-Gaussian random variables with variance proxy v and threshold T = 4Mv(M + log(1/δ)), the complement {\u03c9 | ∑ Zⱼ² > T} has measure ≤ δ. This is the core probabilistic estimate from the proof of Theorem 1.19 (book lines 902–922), using per-coordinate Chernoff bounds, pigeonhole, and a union bound. The book proves this in Chapter 2 via the sub-Gaussian max inequality following equation (2.5).

source

theorem Chapter3.subG_chi2_concentration_ch2 {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (M : ℕ) (hM_pos : 0 < M) (Z : Fin M → Ω → ℝ) (v : ℝ) (hv : 0 < v) (hZ_subG : ∀ (j : Fin M) (s : ℝ), ∫ (ω : Ω), Real.exp (s * Z j ω) ∂μ ≤ Real.exp (v * s ^ 2 / 2)) :

∃ (C_chi2 : ℝ), 0 < C_chi2 ∧ ∀ (δ : ℝ), 0 < δ → δ ≤ 1 → μ {ω : Ω | ∑ j : Fin M, Z j ω ^ 2 ≤ C_chi2 * v * (↑M + Real.log (1 / δ))} ≥ ENNReal.ofReal (1 - δ)

Sub-Gaussian chi-squared concentration (Theorem 1.19 / equation 2.5).

For M sub-Gaussian random variables Z_j with variance proxy v, P(∑ⱼ Zⱼ² ≤ C · v · (M + log(1/δ))) ≥ 1 - δ

Proved from the complement bound subG_chi2_complement_bound (which encodes the per-coordinate Chernoff + pigeonhole + union bound argument from Theorem 1.19, book lines 902–922) combined with standard probability complement arithmetic.

Book reference: equation (2.5) and Theorem 1.19, cited in the proof of Theorem 3.3.

source

theorem Chapter3.ls_coeff_subG_mgf_ch2 {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n : ℕ} (hn : 0 < n) (M : ℕ) (hM_pos : 0 < M) (hM_le : M ≤ n - 1) (σ : ℝ) (hσ : 0 < σ) (ε : Ω → Fin n → ℝ) (hsubG : IsSubGaussianNoiseVec ε (σ ^ 2) μ) (f : ℝ → ℝ) (Y : Ω → Fin n → ℝ) (hModel : ∀ (ω : Ω) (i : Fin n), Y ω i = f (↑↑i / ↑n) + ε ω i) (θhat : Ω → Fin M → ℝ) (hLS : IsLSEstimator Y θhat) (θbar : Fin M → ℝ) (hθbar : ∀ (j : Fin M), θbar j = 1 / ↑n * ∑ i : Fin n, f (↑↑i / ↑n) * trigBasis (↑j + 1) (↑↑i / ↑n)) :

∃ (C₀ : ℝ), 0 < C₀ ∧ ∀ (j : Fin M) (s : ℝ), ∫ (ω : Ω), Real.exp (s * (θhat ω j - θbar j)) ∂μ ≤ Real.exp (C₀ * (σ ^ 2 / ↑n) * s ^ 2 / 2)

LS coefficient decomposition (from Chapter 2 / Lemma 3.13).

On the orthogonal trigonometric design (Lemma 3.13, ΦᵀΦ = nI), the LS estimator θ̂ decomposes as: θ̂ⱼ - θ̄ⱼ = (1/n) ∑ᵢ εᵢ φⱼ(xᵢ) when θ̄ⱼ = (1/n) ∑ᵢ f(xᵢ) φⱼ(xᵢ) (the empirical Fourier coefficient).

The sub-Gaussian MGF bound for (1/n) ∑ᵢ εᵢ φⱼ(xᵢ) with variance proxy σ²/n follows from the componentwise sub-Gaussian property and the orthogonality relation ∑ᵢ φⱼ(xᵢ)² = n.

The proof requires deriving the closed-form LS solution on orthogonal designs (first-order optimality → normal equations → θ̂ⱼ = (1/n) ∑ᵢ Yᵢ φⱼ(xᵢ)), then applying the weighted_sum_bound field of IsSubGaussianNoiseVec. The book states this as following from "the same steps as those following equation (2.5)" — a cross-chapter reference to Chapter 2 techniques.

source

theorem Chapter3.subG_trig_concentration {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n : ℕ} (hn : 0 < n) (M : ℕ) (hM_pos : 0 < M) (hM_le : M ≤ n - 1) (σ : ℝ) (hσ : 0 < σ) (hσ_le : σ ^ 2 ≤ 1) (ε : Ω → Fin n → ℝ) (hsubG : IsSubGaussianNoiseVec ε (σ ^ 2) μ) (f : ℝ → ℝ) (Y : Ω → Fin n → ℝ) (hModel : ∀ (ω : Ω) (i : Fin n), Y ω i = f (↑↑i / ↑n) + ε ω i) (θhat : Ω → Fin M → ℝ) (hLS : IsLSEstimator Y θhat) (θbar : Fin M → ℝ) (hθbar : ∀ (j : Fin M), θbar j = 1 / ↑n * ∑ i : Fin n, f (↑↑i / ↑n) * trigBasis (↑j + 1) (↑↑i / ↑n)) :

∃ (C_conc : ℝ), 0 < C_conc ∧ ∀ (δ : ℝ), 0 < δ → δ ≤ 1 → μ {ω : Ω | ∑ j : Fin M, (θhat ω j - θbar j) ^ 2 ≤ C_conc * σ ^ 2 * (↑M + Real.log (1 / δ)) / ↑n} ≥ ENNReal.ofReal (1 - δ)

Sub-Gaussian concentration for trig LS estimator.

Under sub-Gaussian noise and the orthogonal trigonometric design (Lemma 3.13), the LS estimator θ̂ satisfies: with probability ≥ 1-δ, ∑ⱼ (θ̂ⱼ - θ*ⱼ)² ≤ C · σ² · (M + log(1/δ)) / n

This follows from the sub-Gaussian max inequality ("the same steps as those following equation (2.5)" in the book — a cross-chapter reference to Chapter 2). The orthogonal design ΦᵀΦ = nI (Lemma 3.13) simplifies the bound. The proof combines two axiomatized helpers from Chapter 2:

ls_coeff_subG_mgf_ch2: MGF bound for LS coefficient errors
subG_chi2_concentration_ch2: chi-squared concentration for sub-Gaussians

Note: This is stated in ℓ² coefficient space rather than in matrix form, since by empirical_norm_eq_sum_sq, ∑ⱼ (θ̂ⱼ - θ*ⱼ)² equals the empirical squared norm (1/n) ∑ᵢ (φ_θ̂(xᵢ) - φ_θ*(xᵢ))².

Book reference: Theorem 3.3 proof, citing equation (2.5) from Chapter 2.

source

theorem Chapter3.oracle_ineq_empirical_form {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n : ℕ} (hn : 0 < n) (β : ℝ) (hβ_pos : 0 < β) (Q : ℝ) (hQ : 0 < Q) (f : ℝ → ℝ) (θstar : ℕ → ℝ) (hθstar : ∀ (j : ℕ), θstar j = fourierCoeff f j) (hSobolev : ∀ (K : ℕ), ∑ j : Fin K, sobolevCoeff β (↑j + 1) ^ 2 * θstar ↑j ^ 2 ≤ Q) (σ : ℝ) (hσ : 0 < σ) (hσ_le : σ ^ 2 ≤ 1) (ε : Ω → Fin n → ℝ) (hsubG : IsSubGaussianNoiseVec ε (σ ^ 2) μ) (Y : Ω → Fin n → ℝ) (hModel : ∀ (ω : Ω) (i : Fin n), Y ω i = f (↑↑i / ↑n) + ε ω i) (M : ℕ) (hM_pos : 0 < M) (hM_le : M ≤ n - 1) (θhat : Ω → Fin M → ℝ) (hLS : IsLSEstimator Y θhat) (hθstar_emp : ∀ (j : Fin M), θstar ↑j = 1 / ↑n * ∑ i : Fin n, f (↑↑i / ↑n) * trigBasis (↑j + 1) (↑↑i / ↑n)) :

∃ (C₁ : ℝ), 0 < C₁ ∧ ∀ (δ : ℝ), 0 < δ → δ ≤ 1 → μ {ω : Ω | 1 / ↑n * ∑ i : Fin n, (trigLinComb M (θhat ω) (↑↑i / ↑n) - ∑ j ∈ Finset.range M, θstar j * trigBasis (j + 1) (↑↑i / ↑n)) ^ 2 ≤ C₁ * (↑n ^ (1 - 2 * β) + σ ^ 2 * (↑M + Real.log (1 / δ)) / ↑n)} ≥ ENNReal.ofReal (1 - δ)

Oracle inequality in empirical norm form (equation 3.12).

Proved from subG_trig_concentration and sobolev_bias_bound. For the LS estimator on the trigonometric basis with sub-Gaussian noise and Sobolev class regression function, the empirical estimation error satisfies with probability ≥ 1-δ: (1/n) Σᵢ (φ_{θ̂}(xᵢ) - φ_{θ*}(xᵢ))² ≤ C₁ · (n^{1-2β} + σ²(M + log(1/δ))/n)

Book reference: equation (3.12), lines 2478-2483.

source

theorem Chapter3.oracle_inequality_L2_form {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n : ℕ} (hn : 0 < n) (β : ℝ) (hβ_pos : 0 < β) (Q : ℝ) (hQ : 0 < Q) (f : ℝ → ℝ) (θstar : ℕ → ℝ) (hθstar : ∀ (j : ℕ), θstar j = fourierCoeff f j) (hSobolev : ∀ (K : ℕ), ∑ j : Fin K, sobolevCoeff β (↑j + 1) ^ 2 * θstar ↑j ^ 2 ≤ Q) (σ : ℝ) (hσ : 0 < σ) (hσ_le : σ ^ 2 ≤ 1) (ε : Ω → Fin n → ℝ) (hsubG : IsSubGaussianNoiseVec ε (σ ^ 2) μ) (Y : Ω → Fin n → ℝ) (hModel : ∀ (ω : Ω) (i : Fin n), Y ω i = f (↑↑i / ↑n) + ε ω i) (M : ℕ) (hM_pos : 0 < M) (hM_le : M ≤ n - 1) (θhat : Ω → Fin M → ℝ) (hLS : IsLSEstimator Y θhat) (hθstar_emp : ∀ (j : Fin M), θstar ↑j = 1 / ↑n * ∑ i : Fin n, f (↑↑i / ↑n) * trigBasis (↑j + 1) (↑↑i / ↑n)) :

∃ (C₁ : ℝ), 0 < C₁ ∧ ∀ (δ : ℝ), 0 < δ → δ ≤ 1 → μ {ω : Ω | (L2normSq fun (x : ℝ) => trigLinComb M (θhat ω) x - ∑ j ∈ Finset.range M, θstar j * trigBasis (j + 1) x) ≤ C₁ * (↑n ^ (1 - 2 * β) + σ ^ 2 * (↑M + Real.log (1 / δ)) / ↑n)} ≥ ENNReal.ofReal (1 - δ)

source

theorem Chapter3.fourier_L2_completeness_aux (f : ℝ → ℝ) (θstar : ℕ → ℝ) (hθstar : ∀ (j : ℕ), θstar j = fourierCoeff f j) (β Q : ℝ) (hβ : 0 < β) (hQ : 0 < Q) (hSobolev : ∀ (K : ℕ), ∑ j : Fin K, sobolevCoeff β (↑j + 1) ^ 2 * θstar ↑j ^ 2 ≤ Q) (hSqSum : Summable fun (j : ℕ) => θstar j ^ 2) :

∀ᵐ (x : ℝ) ∂MeasureTheory.volume.restrict (Set.Icc 0 1), f x = ∑' (j : ℕ), θstar j * trigBasis (j + 1) x

Fourier L² completeness helper (axiom).

This axiom encapsulates the deep result from harmonic analysis that a function whose Fourier coefficients are square-summable equals its Fourier series a.e. on [0,1]. The book does not prove this; it is stated as a foundational fact at lines 2293–2299 (the trigonometric system forms a complete orthonormal basis of L²([0,1])).

source

theorem Chapter3.fourier_L2_completeness (f : ℝ → ℝ) (θstar : ℕ → ℝ) (hθstar : ∀ (j : ℕ), θstar j = fourierCoeff f j) (β Q : ℝ) (hβ : 0 < β) (hQ : 0 < Q) (hSobolev : ∀ (K : ℕ), ∑ j : Fin K, sobolevCoeff β (↑j + 1) ^ 2 * θstar ↑j ^ 2 ≤ Q) :

∀ᵐ (x : ℝ) ∂MeasureTheory.volume.restrict (Set.Icc 0 1), f x = ∑' (j : ℕ), θstar j * trigBasis (j + 1) x

Fourier L² completeness.

For a function f with Fourier coefficients θstar satisfying the Sobolev regularity condition, the function f equals its Fourier series representation ∑' j, θstar j · φ_{j+1} a.e. on [0,1].

This is a deep result from Fourier analysis (L² convergence of Fourier series for L² functions). The book does not prove this; it is a foundational fact from harmonic analysis that the trigonometric system {φ_j}_{j≥1} forms an orthonormal basis of L²([0,1]), and hence the Fourier series converges in L² to the function, which implies a.e. equality.

Note: The pointwise a.e. equality is slightly stronger than the L² convergence statement ‖f − ∑' θ_j φ_j‖_{L²} = 0, but follows from it for L² functions by standard measure theory (∫ g² = 0 and g integrable ⟹ g = 0 a.e.). We state the a.e. form directly since it is more convenient for the proof of approx_error_L2_form.

source

theorem Chapter3.fourier_tail_L2_bound (M : ℕ) (hM : 0 < M) (β Q : ℝ) (hβ : 1 / 2 < β) (hQ : 0 < Q) (θstar : ℕ → ℝ) (hSobolev : ∀ (K : ℕ), ∑ j : Fin K, sobolevCoeff β (↑j + 1) ^ 2 * θstar ↑j ^ 2 ≤ Q) :

(L2normSq fun (x : ℝ) => ∑' (j : ℕ), θstar j * trigBasis (j + 1) x - ∑ j ∈ Finset.range M, θstar j * trigBasis (j + 1) x) ≤ Q * ↑M ^ (-(2 * β))

Fourier tail L² bound under Sobolev regularity (axiom).

For Fourier coefficients θstar satisfying the Sobolev regularity condition ∀ K, ∑{j<K} (sobolevCoeff β (j+1))² · θstar(j)² ≤ Q, the L² norm of the Fourier series tail from index M onwards is bounded: ∫₀¹ (∑' j, θstar(j) · φ{j+1}(x) − ∑{j<M} θstar(j) · φ{j+1}(x))² dx ≤ Q · M^{−2β}

This combines two sub-results:

Parseval's identity: the L² norm of the tail equals ∑_{j≥M} θstar(j)²
The Cauchy–Schwarz/Sobolev bound: ∑_{j≥M} θstar(j)² ≤ Q · M^{−2β}

The proof proceeds by:

Establishing θstar ∈ SobolevEllipsoidInf β Q from bounded partial sums
Proving summability of θstar(j)² (from Sobolev weights ≥ 1 for j ≥ 1)
Applying Parseval's identity (parseval_truncation_L2_error)
Bounding the tail sum using the Sobolev condition and weight monotonicity

source

theorem Chapter3.approx_error_L2_form (M : ℕ) (hM : 0 < M) (β Q : ℝ) (hβ : 1 / 2 < β) (hQ : 0 < Q) (f : ℝ → ℝ) (θstar : ℕ → ℝ) (hθstar : ∀ (j : ℕ), θstar j = fourierCoeff f j) (hSobolev : ∀ (K : ℕ), ∑ j : Fin K, sobolevCoeff β (↑j + 1) ^ 2 * θstar ↑j ^ 2 ≤ Q) :

(L2normSq fun (x : ℝ) => ∑ j ∈ Finset.range M, θstar j * trigBasis (j + 1) x - f x) ≤ Q * ↑M ^ (-(2 * β))

Approximation error bound (Lemma 3.14(i)).

For f in the Sobolev class Θ(β, Q) (with Fourier coefficients θstar), the M-term truncation satisfies: ‖f − φ_{θ*}^M‖²_{L₂} ≤ Q · M^{−2β}

This is the deterministic approximation error from Lemma 3.14, equation (3.10). The proof proceeds in two steps:

By Fourier L² completeness (fourier_L2_completeness), f equals its Fourier series in L²([0,1]), so ‖partial_sum − f‖² = ‖partial_sum − full_series‖² in L².
By the Fourier tail bound (fourier_tail_L2_bound), the tail of the Fourier series satisfies ‖full_series − partial_sum‖² ≤ Q · M^{−2β}.

More precisely, using the L² triangle inequality and the fact that L2normSq(partial_sum − f) ≤ 2 · L2normSq(partial_sum − series) + 2 · L2normSq(series − f), with the second term being 0 by Fourier completeness.

source

theorem Chapter3.approx_error_integrableOn (M : ℕ) (_hM : 0 < M) (β Q : ℝ) (_hβ : 0 < β) (_hQ : 0 < Q) (f : ℝ → ℝ) (θstar : ℕ → ℝ) (_hθstar : ∀ (j : ℕ), θstar j = fourierCoeff f j) (_hSobolev : ∀ (K : ℕ), ∑ j : Fin K, sobolevCoeff β (↑j + 1) ^ 2 * θstar ↑j ^ 2 ≤ Q) (hf_meas : Measurable f) (hf_sq_int : MeasureTheory.IntegrableOn (fun (x : ℝ) => f x ^ 2) (Set.Icc 0 1) MeasureTheory.volume) :

MeasureTheory.IntegrableOn (fun (x : ℝ) => (∑ j ∈ Finset.range M, θstar j * trigBasis (j + 1) x - f x) ^ 2) (Set.Icc 0 1) MeasureTheory.volume

Square-integrability of the truncation error on [0,1].

For f in the Sobolev class Θ(β, Q), the truncation error h(x) = (Σ θ*ⱼ φⱼ)(x) - f(x) is square-integrable on [0,1]. This follows from the Parseval identity: the Sobolev class membership implies f ∈ L²([0,1]), and the finite trig sum is continuous hence square-integrable. Same type-bridging gap as oracle_inequality_L2_form.

source

theorem Chapter3.L2normSq_add_le (g h : ℝ → ℝ) (hg : MeasureTheory.IntegrableOn (fun (x : ℝ) => g x ^ 2) (Set.Icc 0 1) MeasureTheory.volume) (hh : MeasureTheory.IntegrableOn (fun (x : ℝ) => h x ^ 2) (Set.Icc 0 1) MeasureTheory.volume) :

(L2normSq fun (x : ℝ) => g x + h x) ≤ 2 * (L2normSq g + L2normSq h)

Triangle inequality for L₂ squared norm. ∫₀¹ (g(x) + h(x))² dx ≤ 2 · (∫₀¹ g(x)² dx + ∫₀¹ h(x)² dx) Follows from the pointwise (a + b)² ≤ 2(a² + b²) and monotonicity of integration. Requires square-integrability of g and h on [0,1].

Theorem 3.15: High probability bound #

source

theorem Chapter3.thm_3_15_combined_intermediate {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n : ℕ} (hn : 0 < n) (β : ℝ) (hβ_pos : 1 / 2 < β) (Q : ℝ) (hQ : 0 < Q) (f : ℝ → ℝ) (θstar : ℕ → ℝ) (hθstar : ∀ (j : ℕ), θstar j = fourierCoeff f j) (hSobolev : ∀ (K : ℕ), ∑ j : Fin K, sobolevCoeff β (↑j + 1) ^ 2 * θstar ↑j ^ 2 ≤ Q) (σ : ℝ) (hσ : 0 < σ) (hσ_le : σ ^ 2 ≤ 1) (ε : Ω → Fin n → ℝ) (hsubG : IsSubGaussianNoiseVec ε (σ ^ 2) μ) (Y : Ω → Fin n → ℝ) (hModel : ∀ (ω : Ω) (i : Fin n), Y ω i = f (↑↑i / ↑n) + ε ω i) (M : ℕ) (hM_pos : 0 < M) (hM_le : M ≤ n - 1) (θhat : Ω → Fin M → ℝ) (hLS : IsLSEstimator Y θhat) (hθstar_emp : ∀ (j : Fin M), θstar ↑j = 1 / ↑n * ∑ i : Fin n, f (↑↑i / ↑n) * trigBasis (↑j + 1) (↑↑i / ↑n)) (hf_meas : Measurable f) (hf_sq_int : MeasureTheory.IntegrableOn (fun (x : ℝ) => f x ^ 2) (Set.Icc 0 1) MeasureTheory.volume) :

∃ (C : ℝ), 0 < C ∧ ∀ (δ : ℝ), 0 < δ → δ ≤ 1 → μ {ω : Ω | (L2normSq fun (x : ℝ) => trigLinComb M (θhat ω) x - f x) ≤ C * (↑M ^ (-(2 * β)) + ↑n ^ (1 - 2 * β) + (↑M + σ ^ 2 * Real.log (1 / δ)) / ↑n)} ≥ ENNReal.ofReal (1 - δ)

Auxiliary lemma for Theorem 3.15: Combined oracle inequality bound.

From (3.12) (Theorem 3.3 oracle inequality) and Lemma 3.13 (ORT for trigonometric basis), with α = 1/2: with probability ≥ 1 − δ, ‖φ_{θ̂^LS} − f‖²_{L₂} ≤ C₁ · (M^{−2β} + n^{1−2β} + (M + σ²·log(1/δ))/n)

This bundles the oracle inequality (Thm 3.3), orthogonality (Lemma 3.13), and approximation error (Lemma 3.14) into a single probabilistic bound on the full L₂ error. Book lines 2496–2502.

source

theorem Chapter3.thm_3_15_rate_computation (n : ℕ) (hn : 0 < n) (β : ℝ) (hβ_pos : 0 < β) (hβ_lower : β ≥ (1 + √5) / 4) (M : ℕ) (_hM_pos : 0 < M) (hM_choice : ↑n ^ (1 / (2 * β + 1)) ≤ ↑M ∧ ↑M ≤ ↑n ^ (1 / (2 * β + 1)) + 1) (σ : ℝ) (_hσ : 0 < σ) :

∃ (C' : ℝ), 0 < C' ∧ ∀ (δ : ℝ), 0 < δ → δ ≤ 1 → ↑M ^ (-(2 * β)) + ↑n ^ (1 - 2 * β) + (↑M + σ ^ 2 * Real.log (1 / δ)) / ↑n ≤ C' * ↑n ^ (-(2 * β) / (2 * β + 1)) + C' * σ ^ 2 * Real.log (1 / δ) / ↑n

Lemma for Theorem 3.15: Rate computation.

For M ≈ n^{1/(2β+1)} and β ≥ (1+√5)/4, the three error terms M^{-2β}, n^{1-2β}, M/n are all ≤ C' · n^{-2β/(2β+1)} for some constant C'. Book lines 2504–2508: plugging in M = ⌈n^{1/(2β+1)}⌉ and using n^{1-2β} ≤ n^{-2β/(2β+1)} for the prescribed β.

source

theorem Chapter3.thm_3_15_high_prob {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n : ℕ} (hn : 0 < n) (β : ℝ) (hβ_pos : 0 < β) (hβ_lower : β ≥ (1 + √5) / 4) (Q : ℝ) (hQ : 0 < Q) (f : ℝ → ℝ) (θstar : ℕ → ℝ) (hθstar : ∀ (j : ℕ), θstar j = fourierCoeff f j) (hSobolev : ∀ (K : ℕ), ∑ j : Fin K, sobolevCoeff β (↑j + 1) ^ 2 * θstar ↑j ^ 2 ≤ Q) (σ : ℝ) (hσ : 0 < σ) (hσ_le : σ ^ 2 ≤ 1) (ε : Ω → Fin n → ℝ) (hsubG : IsSubGaussianNoiseVec ε (σ ^ 2) μ) (Y : Ω → Fin n → ℝ) (hModel : ∀ (ω : Ω) (i : Fin n), Y ω i = f (↑↑i / ↑n) + ε ω i) (M : ℕ) (hM_pos : 0 < M) (hM_le : M ≤ n - 1) (hM_choice : ↑n ^ (1 / (2 * β + 1)) ≤ ↑M ∧ ↑M ≤ ↑n ^ (1 / (2 * β + 1)) + 1) (θhat : Ω → Fin M → ℝ) (hLS : IsLSEstimator Y θhat) (hθstar_emp : ∀ (j : Fin M), θstar ↑j = 1 / ↑n * ∑ i : Fin n, f (↑↑i / ↑n) * trigBasis (↑j + 1) (↑↑i / ↑n)) (δ : ℝ) (hδ_pos : 0 < δ) (hδ_le : δ ≤ 1) (hf_meas : Measurable f) (hf_sq_int : MeasureTheory.IntegrableOn (fun (x : ℝ) => f x ^ 2) (Set.Icc 0 1) MeasureTheory.volume) :

∃ (C : ℝ), 0 < C ∧ μ {ω : Ω | (L2normSq fun (x : ℝ) => trigLinComb M (θhat ω) x - f x) ≤ C * ↑n ^ (-(2 * β) / (2 * β + 1)) + C * σ ^ 2 * Real.log (1 / δ) / ↑n} ≥ ENNReal.ofReal (1 - δ)

Theorem 3.15 (High probability bound for LS on Sobolev class).

Fix β ≥ (1+√5)/4, Q > 0, δ > 0 and assume the general regression model Y_i = f(X_i) + ε_i, X_i = (i-1)/n (regular design), with f ∈ Θ(β,Q) (Sobolev ellipsoid) and ε ~ subG_n(σ²), σ² ≤ 1. Let M = ⌈n^{1/(2β+1)}⌉ and the LS estimator θ̂^LS minimize the sum of squares over the first M trigonometric basis functions.

Then with probability ≥ 1 - δ, for n large enough: ‖φ_{θ̂^LS} - f‖²_{L₂([0,1])} ≤ C · n^{-2β/(2β+1)} + C · σ² · log(1/δ) / n

where C depends on β, Q, σ.

The proof combines:

The oracle inequality for LS (Theorem 3.3 + Lemma 3.13 orthogonality)
The approximation error bound (Lemma 3.14): ‖f - φ_{θ*}‖² ≤ C·M^{-2β}
The choice M = ⌈n^{1/(2β+1)}⌉ which balances approximation and estimation
The condition β ≥ (1+√5)/4 which ensures n^{1-2β} ≤ n^{-2β/(2β+1)}

source

theorem Chapter3.expectation_from_exponential_tail_bound {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {X : Ω → ℝ} (hX_meas : Measurable X) (_hX_nn : ∀ (ω : Ω), 0 ≤ X ω) {A B : ℝ} (_hA : 0 ≤ A) (hB : 0 < B) (htail : ∀ (δ : ℝ), 0 < δ → δ ≤ 1 → μ {ω : Ω | X ω ≤ A + B * Real.log (1 / δ)} ≥ ENNReal.ofReal (1 - δ)) :

∫ (ω : Ω), X ω ∂μ ≤ A + B

Standard tail integration lemma (proved via layer cake / Cavalieri).

For a nonneg measurable random variable X on a probability space, if for all δ ∈ (0,1] we have P(X ≤ A + B·log(1/δ)) ≥ 1-δ, then E[X] ≤ A + B.

Proof technique: E[X] = ∫₀^∞ P(X > t) dt (layer cake / Fubini) For t > A: set δ = exp(-(t-A)/B), then δ ∈ (0,1] and P(X > t) = P(X > A + B·log(1/δ)) ≤ 1 - (1-δ) = δ = exp(-(t-A)/B) So E[X] ≤ A + ∫_A^∞ exp(-(t-A)/B) dt = A + B.

This is a standard measure theory result (cf. Billingsley, Probability and Measure). The book (line 2508) says: "The bound in expectation can be obtained by integrating the tail bound" without giving the full proof.

source

theorem Chapter3.thm_3_15_uniform_high_prob {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n : ℕ} (hn : 0 < n) (β : ℝ) (hβ_pos : 0 < β) (hβ_lower : β ≥ (1 + √5) / 4) (Q : ℝ) (hQ : 0 < Q) (f : ℝ → ℝ) (θstar : ℕ → ℝ) (hθstar : ∀ (j : ℕ), θstar j = fourierCoeff f j) (hSobolev : ∀ (K : ℕ), ∑ j : Fin K, sobolevCoeff β (↑j + 1) ^ 2 * θstar ↑j ^ 2 ≤ Q) (σ : ℝ) (hσ : 0 < σ) (hσ_le : σ ^ 2 ≤ 1) (ε : Ω → Fin n → ℝ) (hsubG : IsSubGaussianNoiseVec ε (σ ^ 2) μ) (Y : Ω → Fin n → ℝ) (hModel : ∀ (ω : Ω) (i : Fin n), Y ω i = f (↑↑i / ↑n) + ε ω i) (M : ℕ) (hM_pos : 0 < M) (hM_le : M ≤ n - 1) (hM_choice : ↑n ^ (1 / (2 * β + 1)) ≤ ↑M ∧ ↑M ≤ ↑n ^ (1 / (2 * β + 1)) + 1) (θhat : Ω → Fin M → ℝ) (hLS : IsLSEstimator Y θhat) (hθstar_emp : ∀ (j : Fin M), θstar ↑j = 1 / ↑n * ∑ i : Fin n, f (↑↑i / ↑n) * trigBasis (↑j + 1) (↑↑i / ↑n)) (hf_meas : Measurable f) (hf_sq_int : MeasureTheory.IntegrableOn (fun (x : ℝ) => f x ^ 2) (Set.Icc 0 1) MeasureTheory.volume) :

∃ (C : ℝ), 0 < C ∧ ∀ (δ : ℝ), 0 < δ → δ ≤ 1 → μ {ω : Ω | (L2normSq fun (x : ℝ) => trigLinComb M (θhat ω) x - f x) ≤ C * ↑n ^ (-(2 * β) / (2 * β + 1)) + C * σ ^ 2 * Real.log (1 / δ) / ↑n} ≥ ENNReal.ofReal (1 - δ)

Uniform high probability bound for the LS estimator.

This restates the high probability bound from thm_3_15_high_prob in a form where the constant C is extracted uniformly for all δ ∈ (0,1]. The mathematical content is the same as oracle_inequality_L2_form + approx_error_L2_form + thm_3_15_rate_computation, but the constant is stated to work for all δ simultaneously (which is true in the mathematical proof since the oracle inequality constant doesn't depend on δ).

Book lines 2496–2508: the proof of Theorem 3.15 establishes bounds with constants depending only on β, Q, σ (not on δ).

source

theorem Chapter3.thm_3_15_expectation_from_tail {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n : ℕ} (hn : 0 < n) (β : ℝ) (hβ_pos : 0 < β) (hβ_lower : β ≥ (1 + √5) / 4) (Q : ℝ) (hQ : 0 < Q) (f : ℝ → ℝ) (θstar : ℕ → ℝ) (hθstar : ∀ (j : ℕ), θstar j = fourierCoeff f j) (hSobolev : ∀ (K : ℕ), ∑ j : Fin K, sobolevCoeff β (↑j + 1) ^ 2 * θstar ↑j ^ 2 ≤ Q) (σ : ℝ) (hσ : 0 < σ) (hσ_le : σ ^ 2 ≤ 1) (ε : Ω → Fin n → ℝ) (hsubG : IsSubGaussianNoiseVec ε (σ ^ 2) μ) (Y : Ω → Fin n → ℝ) (hModel : ∀ (ω : Ω) (i : Fin n), Y ω i = f (↑↑i / ↑n) + ε ω i) (M : ℕ) (hM_pos : 0 < M) (hM_le : M ≤ n - 1) (hM_choice : ↑n ^ (1 / (2 * β + 1)) ≤ ↑M ∧ ↑M ≤ ↑n ^ (1 / (2 * β + 1)) + 1) (θhat : Ω → Fin M → ℝ) (hLS : IsLSEstimator Y θhat) (hθstar_emp : ∀ (j : Fin M), θstar ↑j = 1 / ↑n * ∑ i : Fin n, f (↑↑i / ↑n) * trigBasis (↑j + 1) (↑↑i / ↑n)) (hMeas : Measurable fun (ω : Ω) => L2normSq fun (x : ℝ) => trigLinComb M (θhat ω) x - f x) (hf_meas : Measurable f) (hf_sq_int : MeasureTheory.IntegrableOn (fun (x : ℝ) => f x ^ 2) (Set.Icc 0 1) MeasureTheory.volume) :

∃ (C : ℝ), 0 < C ∧ ∫ (ω : Ω), L2normSq fun (x : ℝ) => trigLinComb M (θhat ω) x - f x ∂μ ≤ C * ↑n ^ (-(2 * β) / (2 * β + 1))

Auxiliary lemma for Theorem 3.15: Tail integration technique.

From the high probability bound (thm_3_15_uniform_high_prob), which holds for all δ ∈ (0,1] with a uniform constant C, we can integrate the tail bound to obtain an expectation bound. The standard technique uses E[X] = ∫₀^∞ P(X > t) dt and a change of variables δ = exp(-n(t - C·rate)/(C·σ²)), yielding E[X] ≤ C'·n^{-2β/(2β+1)}. Book line 2508: "The bound in expectation can be obtained by integrating the tail bound."

This proof combines:

The uniform high probability bound thm_3_15_uniform_high_prob
The standard tail integration technique expectation_from_exponential_tail_bound
The fact that σ²/n ≤ n^{-2β/(2β+1)} (since σ² ≤ 1 and 1/n ≤ rate)

source

theorem Chapter3.thm_3_15_expectation {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n : ℕ} (hn : 0 < n) (β : ℝ) (hβ_pos : 0 < β) (hβ_lower : β ≥ (1 + √5) / 4) (Q : ℝ) (hQ : 0 < Q) (f : ℝ → ℝ) (θstar : ℕ → ℝ) (hθstar : ∀ (j : ℕ), θstar j = fourierCoeff f j) (hSobolev : ∀ (K : ℕ), ∑ j : Fin K, sobolevCoeff β (↑j + 1) ^ 2 * θstar ↑j ^ 2 ≤ Q) (σ : ℝ) (hσ : 0 < σ) (hσ_le : σ ^ 2 ≤ 1) (ε : Ω → Fin n → ℝ) (hsubG : IsSubGaussianNoiseVec ε (σ ^ 2) μ) (Y : Ω → Fin n → ℝ) (hModel : ∀ (ω : Ω) (i : Fin n), Y ω i = f (↑↑i / ↑n) + ε ω i) (M : ℕ) (hM_pos : 0 < M) (hM_le : M ≤ n - 1) (hM_choice : ↑n ^ (1 / (2 * β + 1)) ≤ ↑M ∧ ↑M ≤ ↑n ^ (1 / (2 * β + 1)) + 1) (θhat : Ω → Fin M → ℝ) (hLS : IsLSEstimator Y θhat) (hθstar_emp : ∀ (j : Fin M), θstar ↑j = 1 / ↑n * ∑ i : Fin n, f (↑↑i / ↑n) * trigBasis (↑j + 1) (↑↑i / ↑n)) (hMeas : Measurable fun (ω : Ω) => L2normSq fun (x : ℝ) => trigLinComb M (θhat ω) x - f x) (hf_meas : Measurable f) (hf_sq_int : MeasureTheory.IntegrableOn (fun (x : ℝ) => f x ^ 2) (Set.Icc 0 1) MeasureTheory.volume) :

∃ (C : ℝ), 0 < C ∧ ∫ (ω : Ω), L2normSq fun (x : ℝ) => trigLinComb M (θhat ω) x - f x ∂μ ≤ C * ↑n ^ (-(2 * β) / (2 * β + 1))

Theorem 3.15 (Expectation bound for LS on Sobolev class).

Under the same assumptions as the high probability bound: E[‖φ_{θ̂^LS} - f‖²_{L₂([0,1])}] ≤ C · n^{-2β/(2β+1)}

Obtained by integrating the tail bound from the high probability result.

source

theorem Chapter3.thm_3_15 {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {n : ℕ} (hn : 0 < n) (β : ℝ) (hβ_pos : 0 < β) (hβ_lower : β ≥ (1 + √5) / 4) (Q : ℝ) (hQ : 0 < Q) (f : ℝ → ℝ) (θstar : ℕ → ℝ) (hθstar : ∀ (j : ℕ), θstar j = fourierCoeff f j) (hSobolev : ∀ (K : ℕ), ∑ j : Fin K, sobolevCoeff β (↑j + 1) ^ 2 * θstar ↑j ^ 2 ≤ Q) (σ : ℝ) (hσ : 0 < σ) (hσ_le : σ ^ 2 ≤ 1) (ε : Ω → Fin n → ℝ) (hsubG : IsSubGaussianNoiseVec ε (σ ^ 2) μ) (Y : Ω → Fin n → ℝ) (hModel : ∀ (ω : Ω) (i : Fin n), Y ω i = f (↑↑i / ↑n) + ε ω i) (M : ℕ) (hM_pos : 0 < M) (hM_le : M ≤ n - 1) (hM_choice : ↑n ^ (1 / (2 * β + 1)) ≤ ↑M ∧ ↑M ≤ ↑n ^ (1 / (2 * β + 1)) + 1) (θhat : Ω → Fin M → ℝ) (hLS : IsLSEstimator Y θhat) (hθstar_emp : ∀ (j : Fin M), θstar ↑j = 1 / ↑n * ∑ i : Fin n, f (↑↑i / ↑n) * trigBasis (↑j + 1) (↑↑i / ↑n)) (δ : ℝ) (hδ_pos : 0 < δ) (hδ_le : δ ≤ 1) (hMeas : Measurable fun (ω : Ω) => L2normSq fun (x : ℝ) => trigLinComb M (θhat ω) x - f x) (hf_meas : Measurable f) (hf_sq_int : MeasureTheory.IntegrableOn (fun (x : ℝ) => f x ^ 2) (Set.Icc 0 1) MeasureTheory.volume) :

(∃ (C : ℝ), 0 < C ∧ μ {ω : Ω | (L2normSq fun (x : ℝ) => trigLinComb M (θhat ω) x - f x) ≤ C * ↑n ^ (-(2 * β) / (2 * β + 1)) + C * σ ^ 2 * Real.log (1 / δ) / ↑n} ≥ ENNReal.ofReal (1 - δ)) ∧ ∃ (C : ℝ), 0 < C ∧ ∫ (ω : Ω), L2normSq fun (x : ℝ) => trigLinComb M (θhat ω) x - f x ∂μ ≤ C * ↑n ^ (-(2 * β) / (2 * β + 1))

Theorem 3.15 (Combined statement: High probability + Expectation bounds for LS on Sobolev class).

Book lines 2479–2486. With probability at least 1 - δ: ‖φ_{θ̂^LS} - f‖²_{L₂([0,1])} ≤ C · n^{-2β/(2β+1)} + C · σ² · log(1/δ) / n

Moreover: E[‖φ_{θ̂^LS} - f‖²_{L₂([0,1])}] ≤ C · n^{-2β/(2β+1)}

This bundles both parts of the theorem as stated in the book.

Documentation