NeuralFloat core (Flocq-style rounded arithmetic) #
TorchLean frequently reasons about floating-point behavior using a classical "rounded arithmetic on ℝ" approach rather than a bit-level IEEE-754 model:
- represent a value as an integer mantissa m : ℤ and an exponent e : ℤ,
- interpret it as a real number m * β^e,
- describe the format via an exponent-selection function fexp : ℤ → ℤ and a rounding operator.
This decomposition is the same one used by the Coq library Flocq. It makes many theorems reusable across formats (fixed-point, unbounded floats, bounded IEEE-like floats) and aligns well with ULP-style error bounds from numerical analysis.
TorchLean also cares about mixed precision training/inference (FP16/bfloat16/TF32/FP32/FP64). In this folder, "precision" is not a promise about bit-level encoding; it's a parameter that selects mantissa/exponent sizes and is used by the format and error-bound layers.
For executable, bit-level IEEE-754 semantics (NaN/Inf/signed zero), see NN/Floats/IEEEExec/.
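To make the decomposition concrete, here is a minimal Lean sketch of the idea. The names Flt and toReal are illustrative only; the actual structure (with its metadata fields) is documented further down this page.

```lean
import Mathlib

/-- Illustrative mantissa/exponent pair; the actual `NeuralFloat` carries extra metadata. -/
structure Flt where
  mantissa : ℤ
  exponent : ℤ

/-- Interpret a pair as the real number m * β^e, for a natural radix β. -/
noncomputable def Flt.toReal (β : ℕ) (f : Flt) : ℝ :=
  (f.mantissa : ℝ) * (β : ℝ) ^ f.exponent

-- At a fixed exponent, the interpretation is additive in the mantissa.
example (β : ℕ) (m₁ m₂ e : ℤ) :
    Flt.toReal β ⟨m₁ + m₂, e⟩ = Flt.toReal β ⟨m₁, e⟩ + Flt.toReal β ⟨m₂, e⟩ := by
  simp [Flt.toReal, add_mul]
```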
References #
- Flocq project (documentation + sources): https://flocq.gitlabpages.inria.fr/flocq/
- S. Boldo, G. Melquiond, “Flocq: a unified Coq library for proving floating-point algorithms correct” (ARITH 2011), DOI: 10.1109/ARITH.2011.40
- IEEE Standard for Floating-Point Arithmetic (IEEE 754-2019)
- N. J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd ed., SIAM, 2002
Radix (base) for "floating-point-like" representations.
In practice we almost always use base 2 (binary_radix) because that's what hardware implements,
but keeping the base explicit helps make the model match the literature (and it keeps some proofs
parametric in the radix).
- base : ℕ
Radix base (e.g. 2 for binary). Validity condition: the base is at least 2.
Instances For
Decimal radix (β = 10, useful for small examples and exact decimal inputs).
Instances For
Coerce the radix base to ℝ (used by bpow and logarithms).
Instances For
The radix base is positive when viewed as a real number.
The radix base is nonzero when viewed as a real number.
The radix base is strictly greater than 1 (for a valid radix, 2 ≤ base).
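A self-contained sketch of what such a radix type and the three facts above can look like in Lean. The names (Radix, binaryRadix, decimalRadix, toReal) are assumptions of the sketch, not the TorchLean declarations.

```lean
import Mathlib

/-- A radix: a natural-number base together with the validity condition 2 ≤ base. -/
structure Radix where
  base : ℕ
  two_le_base : 2 ≤ base

/-- Binary radix (β = 2). -/
def binaryRadix : Radix := ⟨2, le_refl 2⟩

/-- Decimal radix (β = 10). -/
def decimalRadix : Radix := ⟨10, by omega⟩

/-- Coerce the base to ℝ. -/
noncomputable def Radix.toReal (β : Radix) : ℝ := (β.base : ℝ)

/-- The base is positive as a real number. -/
theorem Radix.toReal_pos (β : Radix) : 0 < β.toReal := by
  have h : 0 < β.base := by have := β.two_le_base; omega
  unfold Radix.toReal
  exact_mod_cast h

/-- The base is nonzero as a real number. -/
theorem Radix.toReal_ne_zero (β : Radix) : β.toReal ≠ 0 :=
  β.toReal_pos.ne'

/-- The base is strictly greater than 1 as a real number. -/
theorem Radix.one_lt_toReal (β : Radix) : 1 < β.toReal := by
  have h : 1 < β.base := by have := β.two_le_base; omega
  unfold Radix.toReal
  exact_mod_cast h
```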
An abstract floating-point value (mantissa/exponent) plus analysis metadata.
The core mathematical payload is mantissa and exponent. The other fields are there to support
mixed precision and simple error-tracking experiments used in some TorchLean demos:
- precision: a named format (FP16/FP32/…); see NeuralPrecision.
- error_bound: a conservative absolute error bound attached by a conversion/rounding pass.
- phase: forward/backward/update/inference (see TrainingPhase.requires_high_precision).
- layer_id / batch_id: lightweight tags used by “track where an error came from” utilities.
- mantissa : ℤ
Integer mantissa m.
- exponent : ℤ
Integer exponent e.
- precision : NeuralPrecision
Named precision metadata (e.g. FP16/FP32); does not affect to_real.
- error_bound : ℝ
Conservative absolute error bound metadata attached by conversions/rounding passes.
- phase : TrainingPhase
Training phase metadata (forward/backward/update/inference).
- layer_id : ℕ
Optional layer id tag (used in demos/experiments).
- batch_id : ℕ
Optional batch id tag (used in demos/experiments).
Instances For
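A sketch of the structure shape described above. NeuralPrecision and TrainingPhase are stand-in enumerations here, since their actual definitions live elsewhere in TorchLean.

```lean
import Mathlib

/-- Stand-in for the named precisions; the real NeuralPrecision is defined elsewhere. -/
inductive NeuralPrecision where
  | fp16
  | bf16
  | tf32
  | fp32
  | fp64

/-- Stand-in for the training phases; the real TrainingPhase is defined elsewhere. -/
inductive TrainingPhase where
  | forward
  | backward
  | update
  | inference

/-- The shape described above: a mantissa/exponent core plus analysis metadata. -/
structure NeuralFloat where
  mantissa : ℤ
  exponent : ℤ
  precision : NeuralPrecision
  error_bound : ℝ
  phase : TrainingPhase
  layer_id : ℕ
  batch_id : ℕ

/-- Example value: 3 · 2⁻¹ tagged as FP32 forward-pass data from layer 0, batch 0. -/
noncomputable def sampleValue : NeuralFloat :=
  { mantissa := 3, exponent := -1, precision := .fp32,
    error_bound := 0, phase := .forward, layer_id := 0, batch_id := 0 }
```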
Structural zero test (mantissa is exactly 0).
Instances For
Sign of the mantissa (matches the sign of to_real since β^e > 0).
Instances For
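A minimal sketch of these two helpers on the mantissa/exponent core (illustrative names):

```lean
import Mathlib

structure NF where
  mantissa : ℤ
  exponent : ℤ

/-- Structural zero test: the mantissa is exactly 0. -/
def NF.isZero (f : NF) : Bool := f.mantissa == 0

/-- Sign of the mantissa (-1, 0, or 1); it matches the sign of m * β^e because β^e > 0. -/
def NF.sign (f : NF) : ℤ := Int.sign f.mantissa

#eval (NF.mk (-3) 2).sign   -- -1
#eval (NF.mk 0 7).isZero    -- true
```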
Base power: β^e as a real number.
This is Flocq's bpow concept: the scaling factor used to interpret mantissa/exponent pairs.
Instances For
Base powers are positive: β^e > 0 for any exponent e.
Base powers are nonnegative: β^e ≥ 0 for any exponent e.
Base powers are never zero.
Exponent addition law: β^(e1+e2) = β^e1 * β^e2.
Negating the exponent inverts the base power: β^(-e) = (β^e)⁻¹.
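A sketch of bpow and the three laws above, specialized to β = 2 so the proofs stay elementary (bpow2 is an illustrative name, not the TorchLean declaration):

```lean
import Mathlib

/-- β^e as a real number, specialized here to the binary radix (β = 2). -/
noncomputable def bpow2 (e : ℤ) : ℝ := (2 : ℝ) ^ e

-- Base powers are positive (hence also nonnegative and nonzero).
example (e : ℤ) : 0 < bpow2 e := by
  unfold bpow2
  positivity

-- Exponent addition law: β^(e₁+e₂) = β^e₁ * β^e₂.
example (e₁ e₂ : ℤ) : bpow2 (e₁ + e₂) = bpow2 e₁ * bpow2 e₂ := by
  unfold bpow2
  exact zpow_add₀ (by norm_num : (2 : ℝ) ≠ 0) e₁ e₂

-- Negating the exponent inverts the base power: β^(-e) = (β^e)⁻¹.
example (e : ℤ) : bpow2 (-e) = (bpow2 e)⁻¹ := by
  simp [bpow2, zpow_neg]
```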
Interpret a NeuralFloat as a real number: m * β^e.
Instances For
to_real = 0 iff the mantissa is 0 (since β^e ≠ 0).
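A sketch of the interpretation and the zero characterization, again at β = 2 and with illustrative names:

```lean
import Mathlib

structure NF where
  mantissa : ℤ
  exponent : ℤ

/-- Interpret a mantissa/exponent pair as m * 2^e. -/
noncomputable def NF.toReal (f : NF) : ℝ := (f.mantissa : ℝ) * (2 : ℝ) ^ f.exponent

-- to_real = 0 iff the mantissa is 0, because β^e is never zero.
example (f : NF) : f.toReal = 0 ↔ f.mantissa = 0 := by
  unfold NF.toReal
  constructor
  · intro h
    rcases mul_eq_zero.mp h with hm | he
    · exact_mod_cast hm
    · have hne : (2 : ℝ) ^ f.exponent ≠ 0 := by positivity
      exact absurd he hne
  · intro h
    simp [h]
```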
Magnitude (base-β) of a real number.
This matches the usual definition mag(x) = ⌊log_β(|x|)⌋ + 1 for x ≠ 0, and 0 for x = 0.
It is the bridge between a real input x and the exponent-selection function fexp.
Instances For
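A sketch of this magnitude function (the name mag and the treatment of the radix argument are assumptions of the sketch):

```lean
import Mathlib

/-- Base-β magnitude: ⌊log_β |x|⌋ + 1 for x ≠ 0, and 0 at x = 0. -/
noncomputable def mag (β : ℕ) (x : ℝ) : ℤ :=
  if x = 0 then 0 else ⌊Real.logb (β : ℝ) (|x|)⌋ + 1

-- The magnitude of 0 is 0 by definition.
example (β : ℕ) : mag β 0 = 0 := by
  simp [mag]
```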
Validity predicate for exponent-selection functions.
In Flocq, many results are stated for an abstract fexp : ℤ → ℤ satisfying Valid_exp.
We keep essentially the same interface (see flocq_valid) so theorems can be stated using direct
analogues of Flocq results, and we add a couple of convenience properties that are often needed for
bounding arguments (bounded_growth, monotone).
Instances
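The Valid_exp conditions themselves are not restated here, but a concrete fexp helps fix intuition. The sketch below uses the fixed-precision choice fexp e = e - p (Flocq's FLX-style format) and checks the monotonicity property mentioned above; flxExp is an illustrative name.

```lean
import Mathlib

/-- Fixed-precision exponent selection: p mantissa digits, no exponent bounds. -/
def flxExp (p : ℤ) (e : ℤ) : ℤ := e - p

-- Such an fexp is monotone, one of the convenience properties mentioned above.
example (p : ℤ) : Monotone (flxExp p) := by
  intro a b h
  unfold flxExp
  omega
```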
Canonical exponent (cexp in Flocq terminology).
Given a nonzero x, we first compute its magnitude mag(x) and then apply fexp to pick the
exponent used for scaling/rounding.
Instances For
Scaled mantissa (x * β^{-cexp(x)}).
Intuitively: rescale x so that rounding “happens around exponent 0”, which is where rnd acts.
Instances For
Generic format predicate (Flocq-style).
This says: x is exactly representable in the format picked out by β and fexp.
One way to read it is: the scaled mantissa is an integer (so there is no rounding error).
Instances For
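A self-contained sketch of the chain mag → cexp → scaled mantissa → generic format, at β = 2 and phrased with the "x is an integer multiple of β^cexp(x)" reading of the format predicate (names are illustrative):

```lean
import Mathlib

noncomputable def mag2 (x : ℝ) : ℤ :=
  if x = 0 then 0 else ⌊Real.logb 2 (|x|)⌋ + 1

/-- Canonical exponent: apply fexp to the magnitude of x. -/
noncomputable def cexp (fexp : ℤ → ℤ) (x : ℝ) : ℤ := fexp (mag2 x)

/-- Scaled mantissa: rescale x so that rounding happens around exponent 0. -/
noncomputable def scaledMantissa (fexp : ℤ → ℤ) (x : ℝ) : ℝ :=
  x * (2 : ℝ) ^ (-(cexp fexp x))

/-- Generic format: x is exactly representable, i.e. an integer multiple of 2^(cexp x). -/
def genericFormat (fexp : ℤ → ℤ) (x : ℝ) : Prop :=
  ∃ m : ℤ, x = (m : ℝ) * (2 : ℝ) ^ cexp fexp x

-- 0 is always in the format (take m = 0).
example (fexp : ℤ → ℤ) : genericFormat fexp 0 := by
  unfold genericFormat
  exact ⟨0, by simp⟩
```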
Unit in the last place (ulp) associated with x.
This is the scale of the “one ulp” step at the exponent selected by cexp. For round-to-nearest,
many standard bounds have the shape |round(x) - x| ≤ ulp(x)/2.
TorchLean adds a small, optional twist: during numerically sensitive phases (see
TrainingPhase.requires_high_precision) we treat the bound as if it were one extra bit tighter.
This is a modeling hook for mixed-precision heuristics; it is not a replacement for a concrete
bit-level IEEE-754 semantics.
Instances For
neural_ulp is always nonnegative.
Informally: an ulp is a step size on a real grid, so it cannot be negative.
neural_ulp is strictly positive away from zero.
Informally: if x ≠ 0 then the exponent selection cexp(x) picks some power of β, and the ulp
at that exponent is β^{cexp(x)} (or half of it in high-precision phases), hence positive.
neural_ulp at zero does not depend on the training phase.
When x ≠ 0 and the phase does not request high precision, neural_ulp is just the base grid
step β^{cexp(x)}.
When x ≠ 0 and the phase requests high precision, we use the same exponent scale but treat the
ULP as “one extra bit tighter” by dividing by 2.
Forward-mode ULP simplification for x ≠ 0.
This matches the common “one ulp at exponent cexp(x)” intuition used in numerical analysis and
in everyday PyTorch/IEEE-754 error reasoning.
Inference-phase ULP simplification for x ≠ 0 (same scale as forward).
Backward-phase ULP simplification for x ≠ 0 (uses the “tighter by 2” convention).
Update-phase ULP simplification for x ≠ 0 (uses the “tighter by 2” convention).
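A sketch of a phase-aware ulp with the behavior described above, at β = 2. The TrainingPhase stand-in, the requiresHighPrecision mapping, and the value chosen at x = 0 are assumptions of the sketch; the documentation above only fixes the nonzero cases and the fact that the value at 0 is phase-independent.

```lean
import Mathlib

inductive TrainingPhase where
  | forward
  | backward
  | update
  | inference

/-- Which phases ask for the tighter bound (forward/inference do not; backward/update do). -/
def TrainingPhase.requiresHighPrecision : TrainingPhase → Bool
  | .forward => false
  | .backward => true
  | .update => true
  | .inference => false

noncomputable def mag2 (x : ℝ) : ℤ :=
  if x = 0 then 0 else ⌊Real.logb 2 (|x|)⌋ + 1

noncomputable def cexp (fexp : ℤ → ℤ) (x : ℝ) : ℤ := fexp (mag2 x)

/-- Phase-aware ulp: the grid step 2^(cexp x), halved in high-precision phases. -/
noncomputable def neuralUlp (fexp : ℤ → ℤ) (phase : TrainingPhase) (x : ℝ) : ℝ :=
  if x = 0 then 0   -- placeholder value at 0 (phase-independent, as required above)
  else if phase.requiresHighPrecision then (2 : ℝ) ^ cexp fexp x / 2
  else (2 : ℝ) ^ cexp fexp x

-- Forward-phase simplification for x ≠ 0: one ulp at exponent cexp(x).
example (fexp : ℤ → ℤ) (x : ℝ) (hx : x ≠ 0) :
    neuralUlp fexp .forward x = (2 : ℝ) ^ cexp fexp x := by
  simp [neuralUlp, TrainingPhase.requiresHighPrecision, hx]

-- Backward-phase simplification for x ≠ 0: the same scale, tightened by a factor of 2.
example (fexp : ℤ → ℤ) (x : ℝ) (hx : x ≠ 0) :
    neuralUlp fexp .backward x = (2 : ℝ) ^ cexp fexp x / 2 := by
  simp [neuralUlp, TrainingPhase.requiresHighPrecision, hx]

-- The ulp is nonnegative in every phase.
example (fexp : ℤ → ℤ) (phase : TrainingPhase) (x : ℝ) :
    0 ≤ neuralUlp fexp phase x := by
  unfold neuralUlp
  split
  · exact le_rfl
  · split <;> positivity
```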