Documentation

Atlas.HighDimensionalStatistics.code.Chapter4.Thm_4_8

Local definitions (inlined to avoid cross-file import chains) #

def Thm48.IsSpikedCovariance {d : } (S : Matrix (Fin d) (Fin d) ) (θ : ) (v : EuclideanSpace (Fin d)) :

Definition 4.7 (Spiked covariance model). A matrix S satisfies the spiked covariance model with parameter θ > 0 and spike v (a unit vector) if S = θ • (v · vᵀ) + I_d.

Instances For
    noncomputable def Thm48.principalAngle {d : } (u v : EuclideanSpace (Fin d)) :

    The principal angle between two vectors u and v is arccos |⟨u, v⟩|.

    Instances For
      noncomputable def Thm48.matrixOpNorm {d : } (A : Matrix (Fin d) (Fin d) ) :

      The operator norm of a matrix: ‖A‖_op = sup_{‖x‖=1} ‖Ax‖, defined via the continuous linear map norm on Euclidean spaces.

      Instances For

        IsLargestEigenvector M w means w is a unit eigenvector of M corresponding to the largest eigenvalue: it has unit norm and maximizes the Rayleigh quotient wᵀ M w over all unit vectors.

        Instances For

          A largest eigenvector has unit norm.

          Theorem 4.8 (Davis-Kahan sin(θ) theorem) #

          The theorem states that for the spiked covariance model Σ = θ v vᵀ + I_d and any PSD estimator Σ̃ with largest eigenvector :

          min_{ε ∈ {±1}} ‖ε·ṽ − v‖₂² ≤ 2 sin²(∠(ṽ, v)) ≤ (8/θ²) ‖Σ̃ − Σ‖_op²

          theorem Thm48.theorem_4_8_left {d : } (vtilde v : EuclideanSpace (Fin d)) (hvt : vtilde = 1) (hv : v = 1) :
          min (vtilde - v ^ 2) (vtilde + v ^ 2) 2 * Real.sin (principalAngle vtilde v) ^ 2

          Theorem 4.8, left inequality. For any two unit vectors, the minimum sign-adjusted squared distance is at most 2 sin² of the principal angle: min(‖ṽ − v‖², ‖ṽ + v‖²) ≤ 2 sin²(∠(ṽ, v)).

          Helper lemmas for operator norm and Rayleigh quotients #

          theorem Thm48.smul_mulVec_distrib {d : } (θ : ) (M : Matrix (Fin d) (Fin d) ) (u : Fin d) :
          (θ M).mulVec u = θ M.mulVec u

          Scalar-matrix product applied to mulVec distributes.

          theorem Thm48.vecMulVec_self_mulVec {d : } (v x : Fin d) :

          vecMulVec v v applied to x gives (v ⬝ᵥ x) • v.

          theorem Thm48.dotProduct_vecMulVec_self {d : } (v x : Fin d) :

          Quadratic form of vecMulVec v v equals the square of the dot product.

          theorem Thm48.rayleigh_spiked_model {d : } (θ : ) (v u : Fin d) :
          u ⬝ᵥ (θ Matrix.vecMulVec v v + 1).mulVec u = θ * (v ⬝ᵥ u) ^ 2 + u ⬝ᵥ u

          Rayleigh quotient of the spiked covariance θ • vvᵀ + I is θ(v·u)² + ‖u‖².

          For unit vectors in EuclideanSpace ℝ, the self dot product equals 1.

          Inner product on EuclideanSpace ℝ equals dot product of coordinates.

          Connection: toEuclideanLin A x applied to ofLp gives A.mulVec (x.ofLp).

          Inner product ⟨toEuclideanLin A x, x⟩ equals the quadratic form x·(Ax).

          The absolute value of a quadratic form is bounded by the operator norm for unit vectors.

          theorem Thm48.symmetric_dotProduct_mulVec {d : } (M : Matrix (Fin d) (Fin d) ) (hM : M.transpose = M) (u w : Fin d) :

          For symmetric M (Mᵀ = M), the bilinear form is symmetric: u ⬝ᵥ (M v) = v ⬝ᵥ (M u).

          theorem Thm48.bilinear_identity {d : } (M : Matrix (Fin d) (Fin d) ) (hM : M.transpose = M) (u w : Fin d) :
          u ⬝ᵥ M.mulVec u - w ⬝ᵥ M.mulVec w = (u - w) ⬝ᵥ M.mulVec (u + w)

          Bilinear identity for symmetric M: u·(Mu) - w·(Mw) = (u-w)·(M(u+w)).

          Inner product ⟨u, toEuclideanLin A w⟩ equals the bilinear form u·(Aw).

          The Cauchy-Schwarz bound for bilinear forms with operator norm: |u ⬝ᵥ (A v)| ≤ ‖u‖ * ‖A‖_op * ‖v‖.

          The spiked covariance matrix is symmetric.

          theorem Thm48.davis_kahan_sin_bound {d : } (Sig : Matrix (Fin d) (Fin d) ) (θ : ) (v : EuclideanSpace (Fin d)) (hSpike : IsSpikedCovariance Sig θ v) (SigTilde : Matrix (Fin d) (Fin d) ) (hPSD : SigTilde.PosSemidef) (vtilde : EuclideanSpace (Fin d)) (hvt : vtilde = 1) (hEig : IsLargestEigenvector SigTilde vtilde) :
          Real.sin (principalAngle vtilde v) 2 / θ * matrixOpNorm (SigTilde - Sig)

          Key Davis-Kahan inequality: sin(angle) le (2/theta) opnorm(SigTilde - Sig).

          theorem Thm48.theorem_4_8_right {d : } (Sig : Matrix (Fin d) (Fin d) ) (θ : ) (v : EuclideanSpace (Fin d)) (hSpike : IsSpikedCovariance Sig θ v) (SigTilde : Matrix (Fin d) (Fin d) ) (hPSD : SigTilde.PosSemidef) (vtilde : EuclideanSpace (Fin d)) (hvt : vtilde = 1) (hEig : IsLargestEigenvector SigTilde vtilde) :
          2 * Real.sin (principalAngle vtilde v) ^ 2 8 / θ ^ 2 * matrixOpNorm (SigTilde - Sig) ^ 2
          theorem Thm48.theorem_4_8 {d : } (Sig : Matrix (Fin d) (Fin d) ) (θ : ) (v : EuclideanSpace (Fin d)) (hSpike : IsSpikedCovariance Sig θ v) (SigTilde : Matrix (Fin d) (Fin d) ) (hPSD : SigTilde.PosSemidef) (vtilde : EuclideanSpace (Fin d)) (hvt : vtilde = 1) (hEig : IsLargestEigenvector SigTilde vtilde) :
          min (vtilde - v ^ 2) (vtilde + v ^ 2) 8 / θ ^ 2 * matrixOpNorm (SigTilde - Sig) ^ 2

          Theorem 4.8 (Davis-Kahan sin(θ) theorem), full chain. Combines both inequalities into the end-to-end bound: min_{ε ∈ {±1}} ‖ε·ṽ − v‖₂² ≤ (8/θ²) ‖Σ̃ − Σ‖_op².