TorchLean API

NN.MLTheory.Optimization.SmoothStrongConvexBridge

Smooth + Strongly Convex ⇒ Strongly Monotone Gradient (Bridge Lemma)

TorchLean's GD convergence theorems are stated at the operator level: their hypotheses constrain an abstract operator g (most importantly, μ-strong monotonicity), not an objective function.

To apply them to gradient descent on an objective f, we need to instantiate g = ∇f.
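For orientation, here is a minimal Lean sketch of what "operator level" means; `gdStep` and `StrongMonotoneOp` are hypothetical stand-ins for illustration, not TorchLean's actual declarations. Instantiating g = ∇f then amounts to proving the monotonicity hypothesis for `gradient f`, which is what the bridge lemmas below supply.

```lean
import Mathlib

open RealInnerProductSpace

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E]

/-- One gradient-descent step on an abstract operator `g` with step size `η`
(hypothetical stand-in for TorchLean's `GD` step). -/
def gdStep (η : ℝ) (g : E → E) (x : E) : E :=
  x - η • g x

/-- μ-strong monotonicity of an operator `g` (hypothetical stand-in for the
monotonicity hypothesis used by `GDLinearConvergence`). -/
def StrongMonotoneOp (μ : ℝ) (g : E → E) : Prop :=
  ∀ x y : E, μ * ‖x - y‖ ^ 2 ≤ ⟪g x - g y, x - y⟫_ℝ
```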

Mathlib's global definition Gradient.gradient f x is total (it returns 0 if the derivative does not exist), so for optimization theory we typically assume differentiability and reason using the standard first-order characterization of strong convexity.
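To see the totality concretely: at a point of non-differentiability `fderiv ℝ f x = 0`, and the gradient is `fderiv` transported along the Riesz isomorphism, so it defaults to the junk value `0`. A sketch (the `simp` call is a plausible proof, untested against a specific Mathlib version):

```lean
import Mathlib

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E] [CompleteSpace E]

-- Where `f` is not differentiable, `gradient f x` is the junk value `0`,
-- because `fderiv ℝ f x = 0` there.
example (f : E → ℝ) (x : E) (h : ¬ DifferentiableAt ℝ f x) :
    gradient f x = 0 := by
  simp [gradient, fderiv_zero_of_not_differentiableAt h]
```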

This file provides the key local bridge lemma:

If f satisfies the first-order strong convexity inequality (stated using the gradient), then ∇f is μ-strongly monotone in the sense needed by GDLinearConvergence.

What this file now provides is a concrete bridge from Mathlib's StrongConvexOn definition to the first-order inequality, under a DifferentiableAt assumption at the base point x.

In other words, the chain we can use is:

StrongConvexOn univ μ f + DifferentiableAt ℝ f x ⇒ FirstOrderStrongConvex μ f (at x) ⇒ StrongMonotone μ (∇ f).
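Spelled out as one composed statement, the chain might look like the following sketch (illustrative form only; the file's actual lemmas are stated separately, as documented below):

```lean
import Mathlib

open RealInnerProductSpace

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E] [CompleteSpace E]

-- Sketch of the composed chain: strong convexity everywhere plus global
-- differentiability gives a μ-strongly monotone gradient. The proof is the
-- composition of the two bridge lemmas documented below.
example (μ : ℝ) (f : E → ℝ)
    (hconv : StrongConvexOn Set.univ μ f)
    (hdiff : ∀ x, DifferentiableAt ℝ f x) :
    ∀ x y : E, μ * ‖x - y‖ ^ 2 ≤ ⟪gradient f x - gradient f y, x - y⟫_ℝ := by
  sorry
```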

The remaining (separate) “smoothness” bridge for the Lipschitz-gradient assumption can be done later via bounds on fderiv (mean value theorem / operator norm bounds) or by importing an appropriate L-smoothness development.
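For orientation only, the predicate such a smoothness bridge would target might be spelled as follows (a hypothetical sketch; not provided by this file):

```lean
import Mathlib

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E] [CompleteSpace E]

/-- `L`-Lipschitz gradient, the usual "L-smoothness" hypothesis
(hypothetical sketch of the target of the remaining bridge). -/
def LipschitzGradient (L : ℝ) (f : E → ℝ) : Prop :=
  ∀ x y : E, ‖gradient f x - gradient f y‖ ≤ L * ‖x - y‖
```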

First-order strong convexity with parameter μ, stated using the gradient.

This is the familiar inequality:

f y ≥ f x + ⟪∇f x, y - x⟫ + (μ/2)‖y-x‖².

It is a standard characterization of μ-strong convexity for differentiable functions on Euclidean/Hilbert spaces.
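In Lean the predicate plausibly takes the following shape (a sketch; the file's actual spelling may differ):

```lean
import Mathlib

open RealInnerProductSpace

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E] [CompleteSpace E]

/-- Sketch of `FirstOrderStrongConvex`: the inequality above, quantified
over both the base point `x` and the test point `y`. -/
def FirstOrderStrongConvex (μ : ℝ) (f : E → ℝ) : Prop :=
  ∀ x y : E, f x + ⟪gradient f x, y - x⟫_ℝ + μ / 2 * ‖y - x‖ ^ 2 ≤ f y
```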


First-order strong convexity at a fixed base point x.

This is the same inequality as FirstOrderStrongConvex, but quantified only over y. It is the natural statement you get from convex analysis under a DifferentiableAt hypothesis at x.
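A sketch of the pointwise variant, under the same caveats:

```lean
import Mathlib

open RealInnerProductSpace

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E] [CompleteSpace E]

/-- Sketch of the base-point version: the same inequality with `x` fixed,
quantified only over `y`. -/
def FirstOrderStrongConvexAt (μ : ℝ) (f : E → ℝ) (x : E) : Prop :=
  ∀ y : E, f x + ⟪gradient f x, y - x⟫_ℝ + μ / 2 * ‖y - x‖ ^ 2 ≤ f y
```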


StrongConvexOn univ μ f plus differentiability at x implies the first-order strong convexity inequality at x.

This is the standard convex-analysis “supporting hyperplane” argument applied to g(z) = f(z) - (μ/2)‖z‖², which is convex when f is μ-strongly convex.
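A hedged sketch of the statement and the proof idea (the name and exact form are guesses at the file's declaration):

```lean
import Mathlib

open RealInnerProductSpace

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E] [CompleteSpace E]

theorem firstOrderStrongConvexAt_of_strongConvexOn
    (μ : ℝ) (f : E → ℝ) (x : E)
    (hconv : StrongConvexOn Set.univ μ f)
    (hdiff : DifferentiableAt ℝ f x) :
    ∀ y : E, f x + ⟪gradient f x, y - x⟫_ℝ + μ / 2 * ‖y - x‖ ^ 2 ≤ f y := by
  -- Proof idea: `z ↦ f z - μ/2 * ‖z‖²` is convex, so the supporting-hyperplane
  -- inequality at `x` holds for it; expanding `‖z‖²` and the inner products
  -- and rearranging recovers the inequality for `f`.
  sorry
```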

FirstOrderStrongConvex implies the gradient is μ-strongly monotone.

This is the exact operator-side fact needed to use GD.step_norm_sq_le with g = ∇f.
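The proof is two applications of the first-order inequality, one at each base point, added together. A sketch, with the illustrative predicate from above unfolded into the hypothesis:

```lean
import Mathlib

open RealInnerProductSpace

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E] [CompleteSpace E]

theorem strongMonotone_gradient_of_firstOrder (μ : ℝ) (f : E → ℝ)
    (h : ∀ x y : E, f x + ⟪gradient f x, y - x⟫_ℝ + μ / 2 * ‖y - x‖ ^ 2 ≤ f y) :
    ∀ x y : E, μ * ‖x - y‖ ^ 2 ≤ ⟪gradient f x - gradient f y, x - y⟫_ℝ := by
  intro x y
  have h₁ := h x y  -- f x + ⟪∇f x, y - x⟫ + μ/2 ‖y - x‖² ≤ f y
  have h₂ := h y x  -- f y + ⟪∇f y, x - y⟫ + μ/2 ‖x - y‖² ≤ f x
  -- Adding h₁ and h₂ cancels `f x + f y`; since ⟪∇f x, y - x⟫ = -⟪∇f x, x - y⟫,
  -- the sum rearranges to μ ‖x - y‖² ≤ ⟪∇f x - ∇f y, x - y⟫.
  sorry
```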