TorchLean API

NN.Floats.NeuralFloat.Core

NeuralFloat core (Flocq-style rounded arithmetic) #

TorchLean frequently reasons about floating-point behavior using a classical "rounded arithmetic on ℝ" approach rather than a bit-level IEEE-754 model:

  • a format is the set of real numbers that are exactly representable, determined by a radix β and an exponent-selection function fexp;
  • a rounding is a function ℝ → ℝ mapping any real number to a representable one.

This decomposition is the same one used by the Coq library Flocq. It makes many theorems reusable across formats (fixed-point, unbounded floats, bounded IEEE-like floats) and aligns well with ULP-style error bounds from numerical analysis.

TorchLean also cares about mixed precision training/inference (FP16/bfloat16/TF32/FP32/FP64). In this folder, "precision" is not a promise about bit-level encoding; it's a parameter that selects mantissa/exponent sizes and is used by the format and error-bound layers.
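
To make "selects mantissa/exponent sizes" concrete, here is a minimal sketch (the helper name and exact table are illustrative, not the library's API) pairing each named precision with Flocq-style parameters: significand bits prec and minimal exponent emin (computed as emin = 3 - emax - prec):

```lean
import Mathlib

-- Hypothetical helper: map a named precision to (prec, emin).
-- Values follow the usual IEEE-754/bfloat16/TF32 parameters.
def precParams : String → ℤ × ℤ
  | "FP16"     => (11, -24)
  | "bfloat16" => (8, -133)
  | "TF32"     => (11, -136)   -- FP32 exponent range, FP16-like mantissa
  | "FP32"     => (24, -149)
  | _          => (53, -1074)  -- treat anything else as FP64

#eval precParams "FP32"   -- (24, -149)
```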

For executable, bit-level IEEE-754 semantics (NaN/Inf/signed zero), see NN/Floats/IEEEExec/.


Radix (base) for "floating-point-like" representations.

In practice we almost always use base 2 (binary_radix) because that's what hardware implements, but keeping the base explicit helps make the model match the literature (and it keeps some proofs parametric in the radix).

  • base : ℕ

    Radix base (e.g. 2 for binary).

  • base_valid : 2 ≤ self.base

    Validity condition: the base is at least 2.

Instances For
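
As a usage sketch, here is a self-contained mini version of this structure (field names mirrored from this page, not the library's actual code), together with a custom radix whose validity proof is discharged by norm_num:

```lean
import Mathlib

-- Mini mirror of the documented radix structure.
structure Radix where
  base : ℕ
  base_valid : 2 ≤ base

-- A hexadecimal radix: the proof obligation is just 2 ≤ 16.
def hexRadix : Radix := ⟨16, by norm_num⟩
```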

    Standard binary radix (β = 2).

    Instances For

      Decimal radix (β = 10, useful for small examples and exact decimal inputs).

      Instances For

        Coerce the radix base to ℝ (used by bpow and logarithms).

        Instances For

          The radix base is positive when viewed as a real number.

          The radix base is nonzero when viewed as a real number.

          The radix base is strictly greater than 1 (for a valid radix, 2 ≤ base).

          An abstract floating-point value (mantissa/exponent) plus analysis metadata.

          The core mathematical payload is mantissa and exponent. The other fields are there to support mixed precision and simple error-tracking experiments used in some TorchLean demos:

          • precision: a named format (FP16/FP32/…); see NeuralPrecision.
          • error_bound: a conservative absolute error bound attached by a conversion/rounding pass.
          • phase: forward/backward/update/inference (see TrainingPhase.requires_high_precision).
          • layer_id/batch_id: lightweight tags used by “track where an error came from” utilities.
          • mantissa : ℤ

            Integer mantissa m.

          • exponent : ℤ

            Integer exponent e.

          • precision : NeuralPrecision

            Named precision metadata (e.g. FP16/FP32); does not affect to_real.

          • error_bound : ℝ

            Conservative absolute error bound metadata attached by conversions/rounding passes.

          • phase : TrainingPhase

            Training phase metadata (forward/backward/update/inference).

          • layer_id : Option ℕ

            Optional layer id tag (used in demos/experiments).

          • batch_id : Option ℕ

            Optional batch id tag (used in demos/experiments).

          Instances For
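
For intuition, here is a self-contained mini version of the mathematical core of this structure (metadata fields omitted; names mirrored from this page, not the library's code):

```lean
import Mathlib

-- Mini mirror of the core payload: an integer mantissa/exponent pair.
structure MiniFloat where
  mantissa : ℤ
  exponent : ℤ

-- With β = 2 this pair denotes 3 * 2^(-1) = 1.5; neuralToReal below
-- makes that reading precise.
def threeHalves : MiniFloat := ⟨3, -1⟩
```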

            Structural zero test (mantissa is exactly 0).

            Instances For

              Sign of the mantissa (matches the sign of to_real since β^e > 0).

              Instances For
                noncomputable def TorchLean.Floats.neuralBpow (β : NeuralRadix) (e : ℤ) : ℝ

                Base power: β^e as a real number.

                This is Flocq's bpow concept: the scaling factor used to interpret mantissa/exponent pairs.

                Instances For

                  Base powers are positive: β^e > 0 for any exponent e.

                  Base powers are nonnegative: β^e ≥ 0 for any exponent e.

                  Base powers are never zero.

                  theorem TorchLean.Floats.neuralBpow.add_exp (β : NeuralRadix) (e1 e2 : ℤ) :
                  neuralBpow β (e1 + e2) = neuralBpow β e1 * neuralBpow β e2

                  Exponent addition law: β^(e1+e2) = β^e1 * β^e2.

                  Negating the exponent inverts the base power: β^(-e) = (β^e)⁻¹.
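
                  Both laws reduce to Mathlib's zpow lemmas for a nonzero real base, which is one way to see why they hold for any valid radix (2 ≤ base forces β ≠ 0):

```lean
import Mathlib

-- The exponent-addition law behind neuralBpow.add_exp.
example (b : ℝ) (hb : b ≠ 0) (e1 e2 : ℤ) :
    b ^ (e1 + e2) = b ^ e1 * b ^ e2 :=
  zpow_add₀ hb e1 e2

-- The negation law: b^(-e) = (b^e)⁻¹.
example (b : ℝ) (e : ℤ) : b ^ (-e) = (b ^ e)⁻¹ :=
  zpow_neg b e
```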

                  noncomputable def TorchLean.Floats.neuralToReal {β : NeuralRadix} (f : NeuralFloat β) : ℝ

                  Interpret a NeuralFloat as a real number: m * β^e.

                  Instances For
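
A worked instance of the m * β^e reading, checked numerically and independently of the library:

```lean
import Mathlib

-- β = 2, mantissa 3, exponent -1 denotes 3 * 2^(-1) = 1.5.
example : (3 : ℝ) * (2 : ℝ) ^ (-1 : ℤ) = 1.5 := by norm_num
```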
                    @[simp]

                    to_real = 0 iff the mantissa is 0 (since β^e ≠ 0).

                    noncomputable def TorchLean.Floats.neuralMagnitude (β : NeuralRadix) (x : ℝ) : ℤ

                    Magnitude (base-β) of a real number.

                    This matches the usual definition mag(x) = ⌊log_β(|x|)⌋ + 1 for x ≠ 0, and 0 for x = 0. It is the bridge between a real input x and the exponent-selection function fexp.

                    Instances For
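
A worked instance at β = 2, x = 5: ⌊log₂ 5⌋ = 2, so mag(5) = 3. Equivalently, mag(x) is the unique integer with β^(mag-1) ≤ |x| < β^mag, which is easy to check numerically:

```lean
import Mathlib

-- 2^(3-1) ≤ 5 < 2^3, so mag₂(5) = 3.
example : (2 : ℝ) ^ (2 : ℕ) ≤ 5 ∧ (5 : ℝ) < 2 ^ (3 : ℕ) := by norm_num
```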

                      Validity predicate for exponent-selection functions.

                      In Flocq, many results are stated for an abstract fexp : ℤ → ℤ satisfying Valid_exp. We keep essentially the same interface (see flocq_valid) so theorems can be stated using direct analogues of Flocq results, and we add a couple of convenience properties that are often needed for bounding arguments (bounded_growth, monotone).

                      • flocq_valid (k : ℤ) : (fexp k < k → fexp (k + 1) ≤ k) ∧ (k ≤ fexp k → fexp (fexp k + 1) ≤ fexp k ∧ ∀ l ≤ fexp k, fexp l = fexp k)
                      • bounded_growth (k : ℤ) : |fexp (k + 1) - fexp k| ≤ 1
                      • monotone (k1 k2 : ℤ) : k1 ≤ k2 → fexp k1 ≤ fexp k2
                      Instances
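
For concreteness, the two standard Flocq exponent-selection functions are good mental models of a valid fexp (FLX_exp and FLT_exp in Flocq; the NeuralValidExp instance proofs are not sketched here):

```lean
import Mathlib

-- FLX: fixed precision, unbounded exponent range.
def flxExp (prec e : ℤ) : ℤ := e - prec

-- FLT: IEEE-like, with gradual underflow below emin.
def fltExp (prec emin e : ℤ) : ℤ := max (e - prec) emin

-- A bounded_growth-style fact for FLX: consecutive values step by 1.
example (prec k : ℤ) : flxExp prec (k + 1) = flxExp prec k + 1 := by
  simp only [flxExp]; omega
```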
                        noncomputable def TorchLean.Floats.neuralCexp (β : NeuralRadix) (fexp : ℤ → ℤ) [NeuralValidExp fexp] (x : ℝ) : ℤ

                        Canonical exponent (cexp in Flocq terminology).

                        Given a nonzero x, we first compute its magnitude mag(x) and then apply fexp to pick the exponent used for scaling/rounding.

                        Instances For
                          noncomputable def TorchLean.Floats.neuralScaledMantissa (β : NeuralRadix) (fexp : ℤ → ℤ) [NeuralValidExp fexp] (x : ℝ) : ℝ

                          Scaled mantissa (x * β^{-cexp(x)}).

                          Intuitively: rescale x so that rounding “happens around exponent 0”, which is where the integer rounding function rnd (in Flocq's sense) acts.

                          Instances For

                            Generic format predicate (Flocq-style).

                            This says: x is exactly representable in the format picked out by β and fexp. One way to read it is: the scaled mantissa is an integer (so there is no rounding error).

                            Instances For
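
A worked membership check under an FP32-like fexp e = e - 24: for x = 1.5 we get mag(1.5) = 1 and cexp(1.5) = 1 - 24 = -23, and the scaled mantissa 1.5 * 2^23 is an integer, so 1.5 is representable:

```lean
import Mathlib

-- The scaled mantissa of 1.5 at exponent -23 is the integer 12582912.
example : (1.5 : ℝ) * 2 ^ (23 : ℕ) = 12582912 := by norm_num
```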
                              noncomputable def TorchLean.Floats.neuralUlp (β : NeuralRadix) (fexp : ℤ → ℤ) [NeuralValidExp fexp] (x : ℝ) (phase : TrainingPhase := TrainingPhase.forward) : ℝ

                              Unit in the last place (ulp) associated with x.

                              This is the scale of the “one ulp” step at the exponent selected by cexp. For round-to-nearest, many standard bounds have the shape |round(x) - x| ≤ ulp(x)/2.

                              TorchLean adds a small, optional twist: during numerically sensitive phases (see TrainingPhase.requires_high_precision) we treat the bound as if it were one extra bit tighter. This is a modeling hook for mixed-precision heuristics; it is not a replacement for a concrete bit-level IEEE-754 semantics.

                              Instances For
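
A worked instance under the same FP32-like fexp e = e - 24 (forward phase): for x = 1, mag(1) = 1 and cexp(1) = -23, so ulp(1) = 2^(-23) ≈ 1.19e-7, the familiar float32 machine epsilon:

```lean
import Mathlib

-- ulp(1) at an FP32-like format: 2^(-23) = 1/8388608.
example : (2 : ℝ) ^ (-23 : ℤ) = 1 / 8388608 := by norm_num
```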
                                theorem TorchLean.Floats.neuralUlp.nonneg (β : NeuralRadix) (fexp : ℤ → ℤ) [NeuralValidExp fexp] (x : ℝ) (phase : TrainingPhase) :
                                0 ≤ neuralUlp β fexp x phase

                                neural_ulp is always nonnegative.

                                Informally: an ulp is a step size on a real grid, so it cannot be negative.

                                theorem TorchLean.Floats.neuralUlp.pos_of_ne_zero (β : NeuralRadix) (fexp : ℤ → ℤ) [NeuralValidExp fexp] (x : ℝ) (hx : x ≠ 0) (phase : TrainingPhase) :
                                0 < neuralUlp β fexp x phase

                                neural_ulp is strictly positive away from zero.

                                Informally: if x ≠ 0 then the exponent selection cexp(x) picks some power of β, and the ulp at that exponent is β^{cexp(x)} (or half of it in high-precision phases), hence positive.

                                @[simp]
                                theorem TorchLean.Floats.neuralUlp.zero (β : NeuralRadix) (fexp : ℤ → ℤ) [NeuralValidExp fexp] (phase : TrainingPhase) :
                                neuralUlp β fexp 0 phase = neuralBpow β (fexp 0)

                                neural_ulp at zero does not depend on the training phase.

                                theorem TorchLean.Floats.neuralUlp.eq_base_of_ne_zero_of_not_high_precision (β : NeuralRadix) (fexp : ℤ → ℤ) [NeuralValidExp fexp] (x : ℝ) (hx : x ≠ 0) (phase : TrainingPhase) (hphase : phase.requiresHighPrecision = false) :
                                neuralUlp β fexp x phase = neuralBpow β (neuralCexp β fexp x)

                                When x ≠ 0 and the phase does not request high precision, neural_ulp is just the base grid step β^{cexp(x)}.

                                theorem TorchLean.Floats.neuralUlp.eq_base_div_two_of_ne_zero_of_high_precision (β : NeuralRadix) (fexp : ℤ → ℤ) [NeuralValidExp fexp] (x : ℝ) (hx : x ≠ 0) (phase : TrainingPhase) (hphase : phase.requiresHighPrecision = true) :
                                neuralUlp β fexp x phase = neuralBpow β (neuralCexp β fexp x) / 2

                                When x ≠ 0 and the phase requests high precision, we use the same exponent scale but treat the ULP as “one extra bit tighter” by dividing by 2.
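
The dichotomy in the last two theorems can be summarized by a self-contained mini-model (hypothetical names, not the library's code): backward/update are the numerically sensitive phases and get the halved ulp scale, while forward/inference keep the base grid step:

```lean
import Mathlib

inductive Phase | forward | backward | update | inference

-- Mirrors TrainingPhase.requiresHighPrecision as documented above.
def Phase.requiresHighPrecision : Phase → Bool
  | .backward => true
  | .update   => true
  | _         => false

-- Scale of the ulp relative to the base grid step β^(cexp x).
def ulpScale (p : Phase) : ℚ :=
  if p.requiresHighPrecision then 1 / 2 else 1

example : ulpScale .backward = 1 / 2 := rfl
```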

                                @[simp]
                                theorem TorchLean.Floats.neuralUlp.forward_of_ne_zero (β : NeuralRadix) (fexp : ℤ → ℤ) [NeuralValidExp fexp] (x : ℝ) (hx : x ≠ 0) :
                                neuralUlp β fexp x = neuralBpow β (neuralCexp β fexp x)

                                Forward-mode ULP simplification for x ≠ 0.

                                This matches the common “one ulp at exponent cexp(x)” intuition used in numerical analysis and in everyday PyTorch/IEEE-754 error reasoning.

                                @[simp]
                                theorem TorchLean.Floats.neuralUlp.inference_of_ne_zero (β : NeuralRadix) (fexp : ℤ → ℤ) [NeuralValidExp fexp] (x : ℝ) (hx : x ≠ 0) :
                                neuralUlp β fexp x TrainingPhase.inference = neuralBpow β (neuralCexp β fexp x)

                                Inference-phase ULP simplification for x ≠ 0 (same scale as forward).

                                @[simp]
                                theorem TorchLean.Floats.neuralUlp.backward_of_ne_zero (β : NeuralRadix) (fexp : ℤ → ℤ) [NeuralValidExp fexp] (x : ℝ) (hx : x ≠ 0) :
                                neuralUlp β fexp x TrainingPhase.backward = neuralBpow β (neuralCexp β fexp x) / 2

                                Backward-phase ULP simplification for x ≠ 0 (uses the “tighter by 2” convention).

                                @[simp]
                                theorem TorchLean.Floats.neuralUlp.update_of_ne_zero (β : NeuralRadix) (fexp : ℤ → ℤ) [NeuralValidExp fexp] (x : ℝ) (hx : x ≠ 0) :
                                neuralUlp β fexp x TrainingPhase.update = neuralBpow β (neuralCexp β fexp x) / 2

                                Update-phase ULP simplification for x ≠ 0 (uses the “tighter by 2” convention).