TorchLean API

NN.MLTheory.Optimization.GDLinearConvergence

Gradient Descent: Linear Convergence from Strong Monotonicity + Lipschitz Gradient

This file is a small, “real” optimization-theory module:

It proves a linear convergence bound for the iteration

x_{k+1} = x_k - η g(x_k)

under two assumptions on g over a real inner product space:

  1. g is μ-strongly monotone: μ‖x-y‖^2 ≤ ⟪x-y, g x - g y⟫.
  2. g is L-Lipschitz: ‖g x - g y‖ ≤ L‖x-y‖.

For gradients of μ-strongly convex and L-smooth functions, these are the standard operator-level properties that imply linear convergence of gradient descent with a suitable step size.
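For a concrete one-dimensional instance (an illustration, not part of the module): on E = ℝ, where ⟪u, v⟫_ℝ = u * v and ‖t‖^2 = t^2, the map g x = a * x (the gradient of f x = a * x ^ 2 / 2, for a ≥ 0) satisfies both properties with μ = L = a. The underlying arithmetic is a short Lean check, assuming only Mathlib:

import Mathlib

-- Strong monotonicity with μ = a holds with equality for g x = a * x.
example (a x y : ℝ) : a * (x - y) ^ 2 ≤ (x - y) * (a * x - a * y) :=
  le_of_eq (by ring)

-- The a-Lipschitz bound for g x = a * x, given 0 ≤ a.
example (a x y : ℝ) (ha : 0 ≤ a) : |a * x - a * y| ≤ a * |x - y| := by
  rw [← mul_sub, abs_mul, abs_of_nonneg ha]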

This module deliberately avoids any heavy Fréchet-derivative setup: it is stated directly in terms of g so it can later be instantiated either by “g = ∇f” theorems or by verified gradients of concrete TorchLean models.

def Optim.StrongMonotone {E : Type} [NormedAddCommGroup E] [InnerProductSpace ℝ E] (μ : ℝ) (g : E → E) : Prop

Strong monotonicity of an operator in a real inner product space.

Instances For
def Optim.GD.step {E : Type} [NormedAddCommGroup E] [InnerProductSpace ℝ E] (η : ℝ) (g : E → E) (x : E) : E

One gradient-descent-like step for an operator g.

Instances For
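A usage sketch: downstream code iterates step to produce the gradient-descent trajectory. The helper below is hypothetical (the name gdTrajectory and its statement are not part of this module):

-- Assumes `import Mathlib` plus this module.
-- Hypothetical helper: the trajectory x₀, step η g x₀, step η g (step η g x₀), ...
def gdTrajectory {E : Type} [NormedAddCommGroup E] [InnerProductSpace ℝ E]
    (η : ℝ) (g : E → E) (x₀ : E) : Nat → E
  | 0 => x₀
  | n + 1 => Optim.GD.step η g (gdTrajectory η g x₀ n)

By the header equation, each step unfolds to x - η • g x.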
theorem Optim.GD.step_sub_step {E : Type} [NormedAddCommGroup E] [InnerProductSpace ℝ E] (η : ℝ) (g : E → E) (x y : E) :
    step η g x - step η g y = x - y - η • (g x - g y)

Expand the difference of two step applications.
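The proof content is plain module algebra: abbreviating gx = g x and gy = g y, the identity is the following self-contained Lean fact (a sketch assuming only Mathlib):

import Mathlib

-- Scalar multiplication distributes over the subtraction, and the rest
-- is rearrangement in the additive group.
example {E : Type} [AddCommGroup E] [Module ℝ E]
    (η : ℝ) (gx gy x y : E) :
    (x - η • gx) - (y - η • gy) = x - y - η • (gx - gy) := by
  rw [smul_sub]; abel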

theorem Optim.GD.step_norm_sq_le {E : Type} [NormedAddCommGroup E] [InnerProductSpace ℝ E] (η μ : ℝ) (hη : 0 ≤ η) {L : NNReal} (g : E → E) (hmono : StrongMonotone μ g) (hlip : LipschitzWith L g) (x y : E) :
    ‖step η g x - step η g y‖ ^ 2 ≤ (1 - 2 * η * μ + η ^ 2 * ↑L ^ 2) * ‖x - y‖ ^ 2

Key contraction-in-squared-norm inequality.

If g is μ-strongly monotone and L-Lipschitz, then ‖step η g x - step η g y‖² ≤ q * ‖x - y‖² with q = 1 - 2ημ + η² L².
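The proof expands the square, ‖x - y - η • (g x - g y)‖² = ‖x - y‖² - 2η⟪x - y, g x - g y⟫ + η²‖g x - g y‖², then bounds the inner-product term by strong monotonicity (this uses 0 ≤ η) and the last term by Lipschitzness. Over η, the factor q(η) = 1 - 2ημ + η²L² is minimized at η = μ/L², where q = 1 - μ²/L² ∈ [0, 1) for 0 < μ ≤ L; iterating the bound against a point x⋆ with g x⋆ = 0 (a fixed point of step η g) gives the linear rate ‖x_k - x⋆‖² ≤ qᵏ ‖x₀ - x⋆‖². The step-size arithmetic, as a Lean sketch (assuming only Mathlib; not a statement of this module):

import Mathlib

-- Substituting the minimizing step size η = μ / L ^ 2 into
-- q η = 1 - 2 * η * μ + η ^ 2 * L ^ 2 collapses it to 1 - μ ^ 2 / L ^ 2.
example (μ L : ℝ) (hL : 0 < L) :
    1 - 2 * (μ / L ^ 2) * μ + (μ / L ^ 2) ^ 2 * L ^ 2 = 1 - μ ^ 2 / L ^ 2 := by
  have h : L ^ 2 ≠ 0 := by positivity
  field_simp
  ring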