Context α: scalar interface for models + proofs #
TorchLean is designed to be scalar-polymorphic: the same model/layer definitions can be instantiated over many numeric backends:
- `Float` (fast execution; trusted runtime semantics),
- `TorchLean.Floats.IEEE754.IEEE32Exec` (executable bit-level IEEE-754 binary32),
- interval enclosures for verification (see `NN/Floats/Interval/*`),
- ℝ (proof-level mathematics).
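As a minimal sketch of the pattern (everything here is a hypothetical, trimmed-down stand-in: `TinyScalar`, `reluSpec`, and the `Float` dictionary are illustrative names, not TorchLean definitions):

```lean
-- Trimmed-down sketch: one scalar interface, one generic layer, several backends.
class TinyScalar (α : Type) where
  zero : α
  max  : α → α → α

/-- ReLU written once, against the scalar interface only. -/
def reluSpec {α : Type} [TinyScalar α] (x : α) : α :=
  TinyScalar.max TinyScalar.zero x

-- Runtime backend: Lean's builtin `Float`.
instance : TinyScalar Float :=
  ⟨0.0, fun a b => if a ≤ b then b else a⟩

#eval reluSpec (-3.0 : Float)  -- 0.000000
```

The same `reluSpec` could then be read at an interval or ℝ backend by supplying the corresponding dictionary; only the instance changes, never the layer definition.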
Why we designed it this way:
- We did not want separate "Float model code", "proof model code", and "verification model code" that slowly diverge and become inconsistent.
- In practice, we iterate across phases: execute a small model, state the proof-level contract, then run verification bounds. Rewriting each model for each phase is error-prone.
- A single scalar-polymorphic spec gives one source of truth for layers/models, while letting us swap numeric meaning by changing the scalar instance.
- This keeps cross-checking honest: if behavior changes between backends, the difference is visible at the scalar semantics layer, not hidden inside duplicated model definitions.
- The tradeoff is a slightly larger scalar interface (`Context α`), but we accept that complexity to keep architecture-level duplication low and proof reuse high.
Some relevant literature we drew on for this design:
- Bezanson et al., "Julia: A Fresh Approach to Numerical Computing" (generic numeric code across many scalar types; performance via specialization): https://arxiv.org/abs/1411.1607
- Spitters and van der Weegen, "Type classes for mathematics in type theory" (typeclass-based algebraic interfaces for reusable formalization and instances): https://doi.org/10.1017/S0960129511000119
- Elliott, "The Simple Essence of Automatic Differentiation" (one abstract formulation specialized to multiple concrete semantics/representations): https://arxiv.org/abs/1804.00746
- Mirman et al., "The Fundamental Limits of Interval Arithmetic for Neural Networks" (why interval backends are useful and where they become conservative): https://arxiv.org/abs/2112.05235
Our Context α is the same engineering pattern in a Lean setting: one model/layer definition, many
scalar interpretations, and explicit tradeoffs about semantics.
To make this practical, we collect the numeric operations required by neural networks into a single typeclass: `Context α`.
This is intentionally broader than a standard algebraic structure: it bundles arithmetic, ordering, and common transcendental functions (exp/tanh/log/sqrt) used by activations and losses.
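For example, bundling `exp` with the arithmetic lets an activation be written once against the interface. The class and names below are a hypothetical trimmed-down sketch, not the library's `Context`:

```lean
-- Trimmed-down sketch: arithmetic plus `exp` in one interface is enough to state the
-- logistic sigmoid generically over the scalar.
class TinyActScalar (α : Type) where
  one : α
  add : α → α → α
  div : α → α → α
  neg : α → α
  exp : α → α

def sigmoidSpec {α : Type} [TinyActScalar α] (x : α) : α :=
  let e := TinyActScalar.exp (TinyActScalar.neg x)
  TinyActScalar.div TinyActScalar.one (TinyActScalar.add TinyActScalar.one e)

-- Runtime backend assembled from Lean's builtin Float primitives.
instance : TinyActScalar Float :=
  ⟨1.0, Float.add, Float.div, Float.neg, Float.exp⟩

#eval sigmoidSpec (0.0 : Float)  -- 0.500000
```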
Notes #
- Many spec definitions assume `[Context α]` so they can be re-used at multiple dtypes.
- For "paper theorems", the spec layer fixes `Spec.SpecScalar := ℝ` (see `NN/Spec/Core/Scalar.lean`).
- `Context.decidable_gt` is included so executable code can decide comparisons (e.g. ReLU / argmax).
- For executable examples, `Context.gtBool` converts `x > y` into a printable `Bool` (see the sketch after this list).
- For interval arithmetic, we override some order/comparison behavior (see `namespace Interval` below).
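A standalone sketch of the decidable-comparison point, specialised to `Float` (the names `gtBoolF` and `stepF` are ours; the library's generic version is `Context.gtBool` built from `decidable_gt`):

```lean
-- Turn a decidable `>` into a printable Bool, specialised to Float.
def gtBoolF (x y : Float) : Bool :=
  decide (x > y)

#eval gtBoolF 2.0 3.0   -- false
#eval gtBoolF 3.0 2.0   -- true

-- A decidable `>` is what lets executable code branch, e.g. a hard threshold:
def stepF (x : Float) : Float :=
  if gtBoolF x 0.0 then 1.0 else 0.0

#eval stepF (-0.5)  -- 0.000000
```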
Scalar transcendental functions used in activations and losses.
- exp : α → α
- tanh : α → α
- cosh : α → α
- sqrt : α → α
- abs : α → α
- log : α → α
- pi : α
- cos : α → α
- sin : α → α
- sinh : α → α
Instances
Friendly aliases: the original names (`MathFunctions`, `Numbers`) are kept, while these provide more descriptive names for scalar-facing code.
The full scalar interface required by spec‑level tensors and models.
- default : α
- one : α
- zero : α
- add : α → α → α
- sub : α → α → α
- mul : α → α → α
- div : α → α → α
- neg : α → α
- pow : α → α → α
- max : α → α → α
- min : α → α → α
- exp : α → α
- tanh : α → α
- cosh : α → α
- sqrt : α → α
- abs : α → α
- log : α → α
- pi : α
- cos : α → α
- sin : α → α
- sinh : α → α
- neg_point_five : α
- neg_one : α
- pointone : α
- pointfive : α
- two : α
- three : α
- four : α
- five : α
- ten : α
- log10 : α
- log10000 : α
- epsilon : α
- neg_thousand : α
- decidable_gt : DecidableRel fun (x1 x2 : α) => x1 > x2
Instances
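As an illustration of why named constants such as `epsilon` are part of the interface, here is a hypothetical trimmed-down stand-in (not the library's `Context`) with a guarded log of the kind used in losses; the epsilon value is ours, purely for illustration:

```lean
-- Trimmed-down sketch: named constants let spec code state numeric guards generically.
class TinyLossScalar (α : Type) where
  add     : α → α → α
  log     : α → α
  epsilon : α

/-- A guarded log: evaluate `log (x + epsilon)` through the interface only. -/
def safeLog {α : Type} [TinyLossScalar α] (x : α) : α :=
  TinyLossScalar.log (TinyLossScalar.add x TinyLossScalar.epsilon)

-- Illustrative Float dictionary; the epsilon value is ours, not the library's.
instance : TinyLossScalar Float := ⟨Float.add, Float.log, 1e-7⟩

#eval safeLog (0.0 : Float)  -- ≈ -16.118, finite rather than -inf
```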
Decide x > y as a Bool using the Context's decidable_gt.
Instances For
A Context includes a decidable > relation; expose it as a standard typeclass.
MathFunctions instance for Lean's Float (runtime-oriented backend).
MathFunctions instance for ℝ (proof backend, noncomputable).
Numbers literals for ℝ (proof backend, noncomputable).
Coerce naturals into Float using Float.ofNat.
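A hedged sketch of how such dictionaries are assembled, using a trimmed-down stand-in class (`TinyMath` is not the library's `MathFunctions`, but the Float/ℝ function choices mirror the descriptions above):

```lean
import Mathlib

-- Trimmed-down stand-in for a transcendental-function dictionary.
class TinyMath (α : Type) where
  exp  : α → α
  sqrt : α → α
  log  : α → α

-- Runtime-oriented backend: Lean's builtin Float primitives.
instance : TinyMath Float := ⟨Float.exp, Float.sqrt, Float.log⟩

-- Proof backend: Mathlib's real functions; noncomputable, intended for theorems only.
noncomputable instance : TinyMath ℝ := ⟨Real.exp, Real.sqrt, Real.log⟩
```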
A lightweight ℚ backend #
Context includes transcendental functions and real-valued exponentiation (Pow α α) because many
models (softmax, tanh, etc.) need them when instantiated over Float / ℝ / interval scalars.
For ℚ, most transcendental functions do not map rationals to rationals, so there is no canonical
exact interpretation. TorchLean therefore does not install the rational Context globally.
Purely algebraic tests can opt in explicitly with:
`open scoped NN.Spec.RationalAlgebraic`
Current policy (captured by the declarations below):
Pow ℚ ℚ instance used for the lightweight rational backend.
Policy: support `x ^ y` only when `y` is an integer (i.e. `y.den = 1`); otherwise return 0.
The instance is scoped so it is unavailable unless the caller explicitly opens
NN.Spec.RationalAlgebraic.
Instances For
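A hedged standalone mirror of the stated policy (assuming Mathlib's ℚ; `ratPow` is an illustrative name, not the library's scoped instance):

```lean
import Mathlib

-- Integer exponents only; any non-integer exponent falls back to 0, as described above.
def ratPow (x y : ℚ) : ℚ :=
  if y.den = 1 then x ^ y.num else 0

#eval ratPow (3/2 : ℚ) 2        -- 9/4
#eval ratPow (3/2 : ℚ) (1/2)    -- 0 (non-integer exponent is unsupported)
```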
MathFunctions ℚ dictionary for the lightweight rational backend.
Only `abs` is meaningful; the other transcendental functions are defined as 0 in this scoped backend and should not be used for semantic claims. Because the dictionary is scoped, models that rely on transcendental functions over ℚ fail at elaboration unless a file deliberately opts into the algebraic-test backend.
Instances For
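A hedged standalone mirror of this convention (assuming Mathlib's ℚ; the names are illustrative, not the library's dictionary):

```lean
import Mathlib

-- Over ℚ only `abs` keeps an exact meaning; the other slots are stubbed to 0.
def ratAbs (q : ℚ) : ℚ := |q|
def ratExpStub (_ : ℚ) : ℚ := 0  -- placeholder only; carries no semantic claim

#eval ratAbs (-3/4 : ℚ)  -- 3/4
```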
Full opt-in Context dictionary for exact rational algebraic fragments.
Instances For
Full Context instance for ℝ (proof backend, noncomputable).