NeuralFloat core (Flocq-style rounded arithmetic) #
TorchLean frequently reasons about floating-point behavior using a classical "rounded arithmetic on ℝ" approach rather than a bit-level IEEE-754 model:
- represent a value as an integer mantissa m : ℤ and an exponent e : ℤ,
- interpret it as a real number m * β^e,
- describe the format via an exponent-selection function fexp : ℤ → ℤ and a rounding operator.
This decomposition is the same one used by the Coq library Flocq. It makes many theorems reusable across formats (fixed-point, unbounded floats, bounded IEEE-like floats) and aligns well with ULP-style error bounds from numerical analysis.
TorchLean also cares about mixed precision training/inference (FP16/bfloat16/TF32/FP32/FP64). In this folder, "precision" is not a promise about bit-level encoding; it's a parameter that selects mantissa/exponent sizes and is used by the format and error-bound layers.
For executable, bit-level IEEE-754 semantics (NaN/Inf/signed zero), see NN/Floats/IEEEExec/.
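To make the decomposition concrete, here is a minimal Lean sketch of the idea. The names Flt and toReal are illustrative only; the actual structure (with its metadata fields) is documented further down this page.

```lean
import Mathlib

/-- Illustrative mantissa/exponent pair; the actual `NeuralFloat` carries extra metadata. -/
structure Flt where
  mantissa : ℤ
  exponent : ℤ

/-- Interpret a pair as the real number m * β^e, for a natural radix β. -/
noncomputable def Flt.toReal (β : ℕ) (f : Flt) : ℝ :=
  (f.mantissa : ℝ) * (β : ℝ) ^ f.exponent

-- At a fixed exponent, the interpretation is additive in the mantissa.
example (β : ℕ) (m₁ m₂ e : ℤ) :
    Flt.toReal β ⟨m₁ + m₂, e⟩ = Flt.toReal β ⟨m₁, e⟩ + Flt.toReal β ⟨m₂, e⟩ := by
  simp [Flt.toReal, add_mul]
```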
References #
- Flocq project (documentation + sources): https://flocq.gitlabpages.inria.fr/flocq/
- S. Boldo, G. Melquiond, “Flocq: a unified Coq library for proving floating-point algorithms correct” (ARITH 2011), DOI: 10.1109/ARITH.2011.40
- IEEE Standard for Floating-Point Arithmetic (IEEE 754-2019)
- N. J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd ed., SIAM, 2002
Radix (base) for "floating-point-like" representations.
In practice we almost always use base 2 (binary_radix) because that's what hardware implements,
but keeping the base explicit helps make the model match the literature (and it keeps some proofs
parametric in the radix).
- base : ℕ
Radix base (e.g. 2 for binary). Validity condition: the base is at least 2.
Instances For
Decimal radix (β = 10, useful for small examples and exact decimal inputs).
Instances For
Coerce the radix base to ℝ (used by bpow and logarithms).
Instances For
The radix base is positive when viewed as a real number.
The radix base is nonzero when viewed as a real number.
The radix base is strictly greater than 1 (for a valid radix, 2 ≤ base).
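A self-contained sketch of what such a radix type and the three facts above can look like in Lean. The names (Radix, binaryRadix, decimalRadix, toReal) are assumptions of the sketch, not the TorchLean declarations.

```lean
import Mathlib

/-- A radix: a natural-number base together with the validity condition 2 ≤ base. -/
structure Radix where
  base : ℕ
  two_le_base : 2 ≤ base

/-- Binary radix (β = 2). -/
def binaryRadix : Radix := ⟨2, le_refl 2⟩

/-- Decimal radix (β = 10). -/
def decimalRadix : Radix := ⟨10, by omega⟩

/-- Coerce the base to ℝ. -/
noncomputable def Radix.toReal (β : Radix) : ℝ := (β.base : ℝ)

/-- The base is positive as a real number. -/
theorem Radix.toReal_pos (β : Radix) : 0 < β.toReal := by
  have h : 0 < β.base := by have := β.two_le_base; omega
  unfold Radix.toReal
  exact_mod_cast h

/-- The base is nonzero as a real number. -/
theorem Radix.toReal_ne_zero (β : Radix) : β.toReal ≠ 0 :=
  β.toReal_pos.ne'

/-- The base is strictly greater than 1 as a real number. -/
theorem Radix.one_lt_toReal (β : Radix) : 1 < β.toReal := by
  have h : 1 < β.base := by have := β.two_le_base; omega
  unfold Radix.toReal
  exact_mod_cast h
```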
An abstract floating-point value (mantissa/exponent) plus analysis metadata.
The core mathematical payload is mantissa and exponent. The other fields are there to support
mixed precision and simple error-tracking experiments used in some TorchLean demos:
- precision: a named format (FP16/FP32/…); see NeuralPrecision.
- error_bound: a conservative absolute error bound attached by a conversion/rounding pass.
- phase: forward/backward/update/inference (see TrainingPhase.requires_high_precision).
- layer_id / batch_id: lightweight tags used by “track where an error came from” utilities.
- mantissa : ℤ
Integer mantissa m.
- exponent : ℤ
Integer exponent e.
- precision : NeuralPrecision
Named precision metadata (e.g. FP16/FP32); does not affect to_real.
- error_bound : ℝ
Conservative absolute error bound metadata attached by conversions/rounding passes.
- phase : TrainingPhase
Training phase metadata (forward/backward/update/inference).
- layer_id : ℕ
Optional layer id tag (used in demos/experiments).
- batch_id : ℕ
Optional batch id tag (used in demos/experiments).
Instances For
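A sketch of the structure shape described above. NeuralPrecision and TrainingPhase are stand-in enumerations here, since their actual definitions live elsewhere in TorchLean.

```lean
import Mathlib

/-- Stand-in for the named precisions; the real NeuralPrecision is defined elsewhere. -/
inductive NeuralPrecision where
  | fp16
  | bf16
  | tf32
  | fp32
  | fp64

/-- Stand-in for the training phases; the real TrainingPhase is defined elsewhere. -/
inductive TrainingPhase where
  | forward
  | backward
  | update
  | inference

/-- The shape described above: a mantissa/exponent core plus analysis metadata. -/
structure NeuralFloat where
  mantissa : ℤ
  exponent : ℤ
  precision : NeuralPrecision
  error_bound : ℝ
  phase : TrainingPhase
  layer_id : ℕ
  batch_id : ℕ

/-- Example value: 3 · 2⁻¹ tagged as FP32 forward-pass data from layer 0, batch 0. -/
noncomputable def sampleValue : NeuralFloat :=
  { mantissa := 3, exponent := -1, precision := .fp32,
    error_bound := 0, phase := .forward, layer_id := 0, batch_id := 0 }
```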
Structural zero test (mantissa is exactly 0).
Instances For
Sign of the mantissa (matches the sign of to_real since β^e > 0).
Instances For
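A minimal sketch of these two helpers on the mantissa/exponent core (illustrative names):

```lean
import Mathlib

structure NF where
  mantissa : ℤ
  exponent : ℤ

/-- Structural zero test: the mantissa is exactly 0. -/
def NF.isZero (f : NF) : Bool := f.mantissa == 0

/-- Sign of the mantissa (-1, 0, or 1); it matches the sign of m * β^e because β^e > 0. -/
def NF.sign (f : NF) : ℤ := Int.sign f.mantissa

#eval (NF.mk (-3) 2).sign   -- -1
#eval (NF.mk 0 7).isZero    -- true
```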
Base power: β^e as a real number.
This is Flocq's bpow concept: the scaling factor used to interpret mantissa/exponent pairs.
Instances For
Base powers are positive: β^e > 0 for any exponent e.
Base powers are nonnegative: β^e ≥ 0 for any exponent e.
Base powers are never zero.
Exponent addition law: β^(e1+e2) = β^e1 * β^e2.
Negating the exponent inverts the base power: β^(-e) = (β^e)⁻¹.
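A sketch of bpow and the three laws above, specialized to β = 2 so the proofs stay elementary (bpow2 is an illustrative name, not the TorchLean declaration):

```lean
import Mathlib

/-- β^e as a real number, specialized here to the binary radix (β = 2). -/
noncomputable def bpow2 (e : ℤ) : ℝ := (2 : ℝ) ^ e

-- Base powers are positive (hence also nonnegative and nonzero).
example (e : ℤ) : 0 < bpow2 e := by
  unfold bpow2
  positivity

-- Exponent addition law: β^(e₁+e₂) = β^e₁ * β^e₂.
example (e₁ e₂ : ℤ) : bpow2 (e₁ + e₂) = bpow2 e₁ * bpow2 e₂ := by
  unfold bpow2
  exact zpow_add₀ (by norm_num : (2 : ℝ) ≠ 0) e₁ e₂

-- Negating the exponent inverts the base power: β^(-e) = (β^e)⁻¹.
example (e : ℤ) : bpow2 (-e) = (bpow2 e)⁻¹ := by
  simp [bpow2, zpow_neg]
```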
Interpret a NeuralFloat as a real number: m * β^e.
Instances For
to_real = 0 iff the mantissa is 0 (since β^e ≠ 0).
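A sketch of the interpretation and the zero characterization, again at β = 2 and with illustrative names:

```lean
import Mathlib

structure NF where
  mantissa : ℤ
  exponent : ℤ

/-- Interpret a mantissa/exponent pair as m * 2^e. -/
noncomputable def NF.toReal (f : NF) : ℝ := (f.mantissa : ℝ) * (2 : ℝ) ^ f.exponent

-- to_real = 0 iff the mantissa is 0, because β^e is never zero.
example (f : NF) : f.toReal = 0 ↔ f.mantissa = 0 := by
  unfold NF.toReal
  constructor
  · intro h
    rcases mul_eq_zero.mp h with hm | he
    · exact_mod_cast hm
    · have hne : (2 : ℝ) ^ f.exponent ≠ 0 := by positivity
      exact absurd he hne
  · intro h
    simp [h]
```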
Magnitude (base-β) of a real number.
This matches the usual definition mag(x) = ⌊log_β(|x|)⌋ + 1 for x ≠ 0, and 0 for x = 0.
It is the bridge between a real input x and the exponent-selection function fexp.
Instances For
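A sketch of this magnitude function (the name mag and the treatment of the radix argument are assumptions of the sketch):

```lean
import Mathlib

/-- Base-β magnitude: ⌊log_β |x|⌋ + 1 for x ≠ 0, and 0 at x = 0. -/
noncomputable def mag (β : ℕ) (x : ℝ) : ℤ :=
  if x = 0 then 0 else ⌊Real.logb (β : ℝ) (|x|)⌋ + 1

-- The magnitude of 0 is 0 by definition.
example (β : ℕ) : mag β 0 = 0 := by
  simp [mag]
```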
Validity predicate for exponent-selection functions.
In Flocq, many results are stated for an abstract fexp : ℤ → ℤ satisfying Valid_exp.
We keep essentially the same interface (see flocq_valid) so theorems can be stated using direct
analogues of Flocq results, and we add a couple of convenience properties that are often needed for
bounding arguments (bounded_growth, monotone).
Instances
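The Valid_exp conditions themselves are not restated here, but a concrete fexp helps fix intuition. The sketch below uses the fixed-precision choice fexp e = e - p (Flocq's FLX-style format) and checks the monotonicity property mentioned above; flxExp is an illustrative name.

```lean
import Mathlib

/-- Fixed-precision exponent selection: p mantissa digits, no exponent bounds. -/
def flxExp (p : ℤ) (e : ℤ) : ℤ := e - p

-- Such an fexp is monotone, one of the convenience properties mentioned above.
example (p : ℤ) : Monotone (flxExp p) := by
  intro a b h
  unfold flxExp
  omega
```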
Canonical exponent (cexp in Flocq terminology).
Given a nonzero x, we first compute its magnitude mag(x) and then apply fexp to pick the
exponent used for scaling/rounding.
Instances For
Scaled mantissa (x * β^{-cexp(x)}).
Intuitively: rescale x so that rounding “happens around exponent 0”, which is where rnd acts.
Instances For
Generic format predicate (Flocq-style).
This says: x is exactly representable in the format picked out by β and fexp.
One way to read it is: the scaled mantissa is an integer (so there is no rounding error).
Instances For
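A self-contained sketch of the chain mag → cexp → scaled mantissa → generic format, at β = 2 and phrased with the "x is an integer multiple of β^cexp(x)" reading of the format predicate (names are illustrative):

```lean
import Mathlib

noncomputable def mag2 (x : ℝ) : ℤ :=
  if x = 0 then 0 else ⌊Real.logb 2 (|x|)⌋ + 1

/-- Canonical exponent: apply fexp to the magnitude of x. -/
noncomputable def cexp (fexp : ℤ → ℤ) (x : ℝ) : ℤ := fexp (mag2 x)

/-- Scaled mantissa: rescale x so that rounding happens around exponent 0. -/
noncomputable def scaledMantissa (fexp : ℤ → ℤ) (x : ℝ) : ℝ :=
  x * (2 : ℝ) ^ (-(cexp fexp x))

/-- Generic format: x is exactly representable, i.e. an integer multiple of 2^(cexp x). -/
def genericFormat (fexp : ℤ → ℤ) (x : ℝ) : Prop :=
  ∃ m : ℤ, x = (m : ℝ) * (2 : ℝ) ^ cexp fexp x

-- 0 is always in the format (take m = 0).
example (fexp : ℤ → ℤ) : genericFormat fexp 0 := by
  unfold genericFormat
  exact ⟨0, by simp⟩
```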
Unit in the last place (ulp) associated with x.
This is the scale of the “one ulp” step at the exponent selected by cexp. For round-to-nearest,
many standard bounds have the shape |round(x) - x| ≤ ulp(x)/2.
TorchLean adds a small, optional twist: during numerically sensitive phases (see
TrainingPhase.requires_high_precision) we treat the bound as if it were one extra bit tighter.
This is a modeling hook for mixed-precision heuristics; it is not a replacement for a concrete
bit-level IEEE-754 semantics.
Instances For
neural_ulp is always nonnegative.
Informally: an ulp is a step size on a real grid, so it cannot be negative.
neural_ulp is strictly positive away from zero.
Informally: if x ≠ 0 then the exponent selection cexp(x) picks some power of β, and the ulp
at that exponent is β^{cexp(x)} (or half of it in high-precision phases), hence positive.
neural_ulp at zero does not depend on the training phase.
When x ≠ 0 and the phase does not request high precision, neural_ulp is just the base grid
step β^{cexp(x)}.
When x ≠ 0 and the phase requests high precision, we use the same exponent scale but treat the
ULP as “one extra bit tighter” by dividing by 2.
Forward-mode ULP simplification for x ≠ 0.
This matches the common “one ulp at exponent cexp(x)” intuition used in numerical analysis and
in everyday PyTorch/IEEE-754 error reasoning.
Inference-phase ULP simplification for x ≠ 0 (same scale as forward).
Backward-phase ULP simplification for x ≠ 0 (uses the “tighter by 2” convention).
Update-phase ULP simplification for x ≠ 0 (uses the “tighter by 2” convention).
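A sketch of a phase-aware ulp with the behavior described above, at β = 2. The TrainingPhase stand-in, the requiresHighPrecision mapping, and the value chosen at x = 0 are assumptions of the sketch; the documentation above only fixes the nonzero cases and the fact that the value at 0 is phase-independent.

```lean
import Mathlib

inductive TrainingPhase where
  | forward
  | backward
  | update
  | inference

/-- Which phases ask for the tighter bound (forward/inference do not; backward/update do). -/
def TrainingPhase.requiresHighPrecision : TrainingPhase → Bool
  | .forward => false
  | .backward => true
  | .update => true
  | .inference => false

noncomputable def mag2 (x : ℝ) : ℤ :=
  if x = 0 then 0 else ⌊Real.logb 2 (|x|)⌋ + 1

noncomputable def cexp (fexp : ℤ → ℤ) (x : ℝ) : ℤ := fexp (mag2 x)

/-- Phase-aware ulp: the grid step 2^(cexp x), halved in high-precision phases. -/
noncomputable def neuralUlp (fexp : ℤ → ℤ) (phase : TrainingPhase) (x : ℝ) : ℝ :=
  if x = 0 then 0   -- placeholder value at 0 (phase-independent, as required above)
  else if phase.requiresHighPrecision then (2 : ℝ) ^ cexp fexp x / 2
  else (2 : ℝ) ^ cexp fexp x

-- Forward-phase simplification for x ≠ 0: one ulp at exponent cexp(x).
example (fexp : ℤ → ℤ) (x : ℝ) (hx : x ≠ 0) :
    neuralUlp fexp .forward x = (2 : ℝ) ^ cexp fexp x := by
  simp [neuralUlp, TrainingPhase.requiresHighPrecision, hx]

-- Backward-phase simplification for x ≠ 0: the same scale, tightened by a factor of 2.
example (fexp : ℤ → ℤ) (x : ℝ) (hx : x ≠ 0) :
    neuralUlp fexp .backward x = (2 : ℝ) ^ cexp fexp x / 2 := by
  simp [neuralUlp, TrainingPhase.requiresHighPrecision, hx]

-- The ulp is nonnegative in every phase.
example (fexp : ℤ → ℤ) (phase : TrainingPhase) (x : ℝ) :
    0 ≤ neuralUlp fexp phase x := by
  unfold neuralUlp
  split
  · exact le_rfl
  · split <;> positivity
```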