Smooth + Strongly Convex ⇒ Strongly Monotone Gradient (Bridge Lemma)
TorchLean's GD convergence theorems are stated at the operator level:
g is μ-strongly monotone, and g is L-Lipschitz.
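For concreteness, the two hypotheses have the following shape. This is a minimal sketch under assumed names (StrongMonotoneOp and LipschitzOp are hypothetical; the actual definitions live in GDLinearConvergence and may be bundled differently):

```lean
import Mathlib

open scoped RealInnerProductSpace

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E]

/-- Hypothetical shape of `μ`-strong monotonicity for an operator `g`. -/
def StrongMonotoneOp (μ : ℝ) (g : E → E) : Prop :=
  ∀ x y : E, μ * ‖x - y‖ ^ 2 ≤ ⟪g x - g y, x - y⟫_ℝ

/-- Hypothetical shape of `L`-Lipschitz continuity of `g`, stated with a real
constant rather than Mathlib's `LipschitzWith` (which uses `ℝ≥0`/`edist`). -/
def LipschitzOp (L : ℝ) (g : E → E) : Prop :=
  ∀ x y : E, ‖g x - g y‖ ≤ L * ‖x - y‖
```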
To apply them to gradient descent on an objective f, we need to instantiate g = ∇f.
Mathlib's global definition Gradient.gradient f x is total (it returns 0 if the derivative does
not exist), so for optimization theory we typically assume differentiability and reason using the
standard first-order characterization of strong convexity.
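As a concrete illustration of that totality (a sketch with the proof elided; it assumes the root-level Mathlib name gradient, which the text above refers to as Gradient.gradient):

```lean
import Mathlib

-- `|·|` is not differentiable at 0, so `fderiv ℝ (|·|) 0 = 0`, and the total
-- gradient there is 0 as well. Proof elided in this sketch.
example : gradient (fun x : ℝ => |x|) 0 = 0 := by
  sorry
```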
This file provides the key local bridge lemma:
If f satisfies the first-order strong convexity inequality (stated using the gradient), then ∇f is
μ-strongly monotone in the sense needed by GDLinearConvergence.
This file now also provides a concrete bridge from Mathlib's StrongConvexOn definition
to this first-order inequality, under a DifferentiableAt assumption at the base point x.
In other words, the chain we can use is (sketched in Lean below):
StrongConvexOn univ μ f + DifferentiableAt ℝ f x
⇒ FirstOrderStrongConvex μ f (at x)
⇒ StrongMonotone μ (∇ f).
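A minimal Lean sketch of the intended composition, with sorry standing in for this file's lemmas (the statement shape is illustrative, not the file's actual declaration):

```lean
import Mathlib

open scoped RealInnerProductSpace

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E] [CompleteSpace E]

-- End-to-end shape of the bridge: strong convexity plus differentiability
-- everywhere yields strong monotonicity of the gradient map.
example (μ : ℝ) (f : E → ℝ)
    (hconv : StrongConvexOn Set.univ μ f)
    (hdiff : ∀ x : E, DifferentiableAt ℝ f x) :
    ∀ x y : E, μ * ‖x - y‖ ^ 2 ≤ ⟪gradient f x - gradient f y, x - y⟫_ℝ := by
  sorry
```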
The remaining (separate) “smoothness” bridge for the Lipschitz-gradient assumption can be done
later via bounds on fderiv (mean value theorem / operator norm bounds) or by importing an
appropriate L-smoothness development.
First-order strong convexity with parameter μ, stated using the gradient.
This is the familiar inequality:
f y ≥ f x + ⟪∇f x, y - x⟫ + (μ/2)‖y - x‖².
It is a standard characterization of μ-strong convexity for differentiable functions on
Euclidean/Hilbert spaces.
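A plausible Lean rendering of this definition (a sketch; the file's actual declaration may differ in argument order or bundling):

```lean
import Mathlib

open scoped RealInnerProductSpace

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E] [CompleteSpace E]

/-- First-order `μ`-strong convexity, quantified over all base points. -/
def FirstOrderStrongConvex (μ : ℝ) (f : E → ℝ) : Prop :=
  ∀ x y : E, f x + ⟪gradient f x, y - x⟫_ℝ + μ / 2 * ‖y - x‖ ^ 2 ≤ f y
```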
First-order strong convexity at a fixed base point x.
This is the same inequality as FirstOrderStrongConvex, but quantified only over y.
It is the natural statement you get from convex analysis under a DifferentiableAt hypothesis at
x.
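The pointwise variant fixes the base point and quantifies only over y (same sketch conventions as above):

```lean
import Mathlib

open scoped RealInnerProductSpace

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E] [CompleteSpace E]

/-- First-order `μ`-strong convexity at a fixed base point `x`. -/
def FirstOrderStrongConvexAt (μ : ℝ) (f : E → ℝ) (x : E) : Prop :=
  ∀ y : E, f x + ⟪gradient f x, y - x⟫_ℝ + μ / 2 * ‖y - x‖ ^ 2 ≤ f y
```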
StrongConvexOn univ μ f plus differentiability at x implies the first-order strong convexity
inequality at x.
This is the standard convex-analysis “supporting hyperplane” argument applied to
g(z) = f(z) - (μ/2)‖z‖², which is convex when f is μ-strongly convex.
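A statement-level sketch of this lemma (the name is hypothetical and the proof is elided; it would pass through the convex auxiliary function and its supporting hyperplane at x):

```lean
import Mathlib

open scoped RealInnerProductSpace

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E] [CompleteSpace E]

/-- Sketch: strong convexity on `univ` plus differentiability at `x` implies
the first-order inequality at `x`. -/
theorem firstOrderAt_of_strongConvexOn {μ : ℝ} {f : E → ℝ} {x : E}
    (hf : StrongConvexOn Set.univ μ f) (hd : DifferentiableAt ℝ f x) :
    ∀ y : E, f x + ⟪gradient f x, y - x⟫_ℝ + μ / 2 * ‖y - x‖ ^ 2 ≤ f y := by
  -- Supporting-hyperplane argument for `z ↦ f z - μ / 2 * ‖z‖²`, which is
  -- convex when `f` is `μ`-strongly convex; details elided.
  sorry
```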
FirstOrderStrongConvex implies the gradient is μ-strongly monotone.
This is the exact operator-side fact needed to use GD.step_norm_sq_le with g = ∇f.
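The underlying two-point argument: instantiate the first-order inequality at x (against y) and at y (against x), add the two, and the function values cancel, leaving the monotonicity inequality. A statement-level sketch with a hypothetical name:

```lean
import Mathlib

open scoped RealInnerProductSpace

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E] [CompleteSpace E]

/-- Sketch: the first-order inequality at every base point makes the
gradient `μ`-strongly monotone. -/
theorem strongMonotone_of_firstOrder {μ : ℝ} {f : E → ℝ}
    (h : ∀ x y : E, f x + ⟪gradient f x, y - x⟫_ℝ + μ / 2 * ‖y - x‖ ^ 2 ≤ f y) :
    ∀ x y : E, μ * ‖x - y‖ ^ 2 ≤ ⟪gradient f x - gradient f y, x - y⟫_ℝ := by
  intro x y
  have h₁ := h x y  -- f x + ⟪∇f x, y - x⟫ + (μ/2)‖y - x‖² ≤ f y
  have h₂ := h y x  -- f y + ⟪∇f y, x - y⟫ + (μ/2)‖x - y‖² ≤ f x
  -- Adding h₁ and h₂ cancels f x + f y; since ‖y - x‖ = ‖x - y‖ the quadratic
  -- terms sum to μ‖x - y‖², and bilinearity combines the inner products into
  -- -⟪∇f x - ∇f y, x - y⟫. Rearranging gives the claim; details elided.
  sorry
```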