5.1. Floating Point Semantics
TorchLean treats "float32" as a family of related meanings, not as one overloaded word. That distinction is essential for a formal ML codebase. A verifier may prove a real-valued margin, a runtime may execute binary32 arithmetic, CUDA may choose fused multiply-add or a particular reduction tree, and an imported certificate may describe yet another abstraction. Those objects are connected, but they are not identical.
The purpose of this page is to name the objects and the bridges between them.
5.1.1. The Central Example
Suppose a model computes a dot product:
\sum_i w_i x_i
Over the reals, this sum has one mathematical value. Over float32, the value also depends on rounding after each operation, whether multiplication and addition are fused, and the order in which the sum is evaluated. On a GPU, a parallel reduction may use a different tree from a CPU loop.
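The order dependence is easy to exhibit with Lean's built-in Float (which is binary64; binary32 shows the same phenomenon at smaller magnitudes):

```lean
-- Floating-point addition is not associative: regrouping the same three
-- terms changes the result, because 1.0 is below the rounding granularity
-- of 1.0e20.
#eval (1.0e20 + (-1.0e20)) + 1.0  -- evaluates to 1
#eval 1.0e20 + ((-1.0e20) + 1.0)  -- evaluates to 0
```

A parallel reduction that regroups terms is doing exactly this, many times over.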
So a useful formal statement cannot simply say "the model runs in Float32." It should say:
- which semantic scalar is used;
- which reduction or operation order is used;
- which finite/no-overflow side conditions are required;
- and which theorem or assumption connects the executed value to the proof value.
That is the whole numeric discipline of this chapter.
5.1.2. Four Numeric Layers
TorchLean separates four layers:
- Real specification: ℝ tensors and spec functions. This is the ideal mathematical model: networks, losses, and verifier inequalities before rounding.
- Rounded real proof model: FP32/NF. This model performs real operations and rounds to the binary32 grid, which is the right setting for compositional error proofs.
- Executable IEEE-style bits: IEEE32Exec. This model stores raw UInt32 binary32 values and includes signed zeros, infinities, NaNs, comparisons, and special rules.
- Native/runtime execution: Lean Float32, CUDA, cuBLAS, cuFFT, and external tools. These are fast producers of values whose agreement with Lean semantics is an explicit contract or certificate boundary.
The first three layers are Lean definitions. The fourth layer is a named trust boundary. This is why TorchLean can state both practical and formal claims: we can run real examples, but the theorem still says exactly which semantic object it concerns.
5.1.3. Why Both FP32 And IEEE32Exec Exist
FP32 and IEEE32Exec answer different questions.
FP32 is the proof model. It is a rounded-real scalar: compute the real operation, round to the
binary32 format, and keep an explicit error bound. This is the right level for layerwise forward
error, verifier-margin transfer, and paper statements such as "the rounded execution stays within
ε of the real specification."
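The kind of local bound FP32 carries can be sketched as a proposition. This is illustrative only: the relative-error form below ignores underflow and overflow, which need their own side conditions, and the exact hypothesis names in the library may differ.

```lean
-- Relative-error shape for binary32 rounding, away from the subnormal and
-- overflow ranges: rounding a real x to the binary32 grid perturbs it by at
-- most u·|x|, with unit roundoff u = 2^(-24). Sketch, not the library lemma.
example (x : ℝ) : Prop :=
  |FP32.fp32Round x - x| ≤ (2 : ℝ) ^ (-(24 : ℤ)) * |x|
```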
IEEE32Exec is the executable bit model. It stores raw binary32 bits and implements IEEE-style
behavior for the core operations, including signed zero, infinities, NaNs, comparisons, and
special-value propagation. This is the right level for widgets, demos, edge-case tests, and checking
what a binary32-shaped computation actually does.
The bridge theorems connect the two on the finite path. That means the inputs and result decode to
ordinary finite real values, and the operation-specific side conditions are satisfied. When a
computation leaves that finite path, the total theorems use Option: finite values become
some r, while NaN and infinity become none.
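The Option-valued reading can be pictured as a small helper. This is a sketch: isFinite is a hypothetical predicate name, and the actual toReal? is defined in the IEEEExec files.

```lean
-- Sketch of the Option-valued decode: finite bit patterns decode to `some r`,
-- NaN and the infinities decode to `none`. Hypothetical names throughout.
def toRealOptionSketch (x : IEEE32Exec) : Option ℝ :=
  if IEEE32Exec.isFinite x then some (IEEE32Exec.toReal x) else none
```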
5.1.4. API Map
These imports are the main landmarks:
```lean
import NN.Floats.Float32
import NN.Floats.FP32
import NN.Floats.IEEEExec
import NN.Floats.IEEEExec.BridgeFP32
import NN.Floats.IEEEExec.BridgeFP32Total
import NN.Floats.IEEEExec.BridgeInitFloat32

open TorchLean.Floats
open TorchLean.Floats.IEEE754

-- User-facing selector.
#check TorchLean.Floats.Float32Mode
#check TorchLean.Floats.F32

-- Proof model: finite rounded-real binary32 arithmetic.
#check TorchLean.Floats.FP32
#check TorchLean.Floats.FP32.toReal

-- Executable bit-level model: raw UInt32 binary32 values.
#check IEEE32Exec
#check IEEE32Exec.ofBits
#check IEEE32Exec.toBits
#check IEEE32Exec.toReal
#check IEEE32Exec.toReal?

-- Bridges from executable bits to proof semantics.
#check IEEE32Exec.toReal_add_eq_fp32Round
#check IEEE32Exec.toReal?_add_eq_ite

-- Runtime Float32 remains a named assumption boundary.
#check Float32Bridge.RuntimeFloat32MatchesIEEE32Exec
```
Read these names as the dependency graph. F32 selects the scalar. FP32 is the proof-side scalar.
IEEE32Exec is the executable scalar. BridgeFP32 and BridgeFP32Total say how executable bits
become proof-side rounded reals. BridgeInitFloat32 names what remains if a claim uses Lean's
runtime Float32.
5.1.5. A Single Addition At Four Levels
The same expression, a + b, has four different readings:
- a + b : ℝ is exact mathematical addition.
- FP32.add a b is exact real addition rounded to the binary32 grid.
- IEEE32Exec.add x y is executable binary32 bit arithmetic with IEEE-style special cases.
- a runtime or CUDA add is an external implementation whose agreement is a contract or test boundary.
The finite bridge theorem says the IEEE32Exec result agrees with the FP32 rounded-real result
when the finite hypotheses hold:
```lean
import NN.Floats.IEEEExec.BridgeFP32
import NN.Floats.IEEEExec.BridgeFP32Total

open TorchLean.Floats.IEEE754

-- Finite bridge.
#check IEEE32Exec.toReal_add_eq_fp32Round

-- Total bridge.
#check IEEE32Exec.toReal?_add_eq_ite
```
That is the pattern for the rest of the chapter. Do not collapse the layers; connect them with a named theorem or a named boundary.
5.1.6. Reductions Need An Order
Real addition is associative. Floating-point addition is not. Therefore a theorem about a float32 sum, dot product, convolution accumulation, attention score, or CUDA reduction needs an order contract.
TorchLean states reduction facts around a finite evaluation tree rather than pretending every parallel schedule is the same.
The reductions API contains the long-form theorem names for sum trees and dot products. The important reading rule is shorter than the names: soundness is stated for a specified finite evaluation tree, not for every possible reordering.
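One such tree can be sketched as a sequential left-to-right fold. FP32.add appears on this page; the zero accumulator is passed explicitly rather than assuming a library name for it.

```lean
-- One concrete evaluation tree: a sequential left-to-right fold. Reduction
-- theorems are stated against a fixed tree like this, not against every
-- possible reordering or parallel schedule. Sketch only.
def sumLeftToRight (zero : FP32) (xs : List FP32) : FP32 :=
  xs.foldl FP32.add zero
```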
This is the numerical reason CUDA reductions are discussed in the native-boundary chapter. A fast kernel may be a perfectly good training kernel, but a proof-quality statement needs a fixed tree or an explicit reduction specification.
5.1.7. Finite Path, Precisely
The finite path is the part of IEEE-style execution where the result can be read as an ordinary real number.
Examples:
- a normal binary32 value is finite;
- a subnormal binary32 value is still finite, although with reduced precision;
- +0.0 and -0.0 are finite bit patterns, but proofs may need to know which operation produced them;
- overflow to +Inf or -Inf leaves the finite path;
- 0/0, Inf - Inf, and sqrt of a negative finite value produce NaN paths, not finite paths;
- a reduction is finite only relative to a specified evaluation tree whose intermediate results remain finite.
The bridge files reflect this split:
- The BridgeFP32 API proves finite refinements such as toReal_add_eq_fp32Round, toReal_mul_eq_fp32Round, toReal_div_eq_fp32Round, toReal_fma_eq_fp32Round, and toReal_sqrt_eq_fp32Round.
- The BridgeFP32Total API packages total statements through toReal?, with theorem names such as toReal?_add_eq_ite, toReal?_mul_eq_ite, toReal?_div_eq_ite, and toReal?_sqrt_eq_ite.
- The SpecialRules API records NaN, infinity, signed-zero, and special-case behavior for the executable kernel itself.
The informal theorem shape is:
- execute the operation in IEEE32Exec;
- decode the result with toReal;
- obtain the same real value as applying the corresponding real operation to the decoded inputs and rounding that result with FP32.fp32Round.

This reading is valid under the stated finite and operation-specific hypotheses.
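For addition, that shape reads roughly as follows. This is a sketch: the hypothesis names, in particular isFinite, are illustrative, and the actual statement is the library theorem IEEE32Exec.toReal_add_eq_fp32Round.

```lean
-- Illustrative statement shape only; `sorry` marks that this is a sketch,
-- not the library proof.
example (x y : IEEE32Exec)
    (hx : IEEE32Exec.isFinite x) (hy : IEEE32Exec.isFinite y)
    (hxy : IEEE32Exec.isFinite (IEEE32Exec.add x y)) :
    IEEE32Exec.toReal (IEEE32Exec.add x y) =
      FP32.fp32Round (IEEE32Exec.toReal x + IEEE32Exec.toReal y) := by
  sorry
```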
5.1.8. Transcendentals And Library Boundaries
The proof-side FP32 model defines transcendentals by applying the corresponding real function and
then rounding. Theorems such as exp_abs_error, tanh_abs_error, and interval membership lemmas
are therefore statements relative to Lean's real functions.
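The round-after-real-function pattern can be sketched in one line. Real.exp is Mathlib's real exponential; FP32.toReal and FP32.fp32Round appear on this page; returning the rounded real avoids assuming a constructor name, and the actual library definition may differ.

```lean
-- Sketch: apply the real function, then round to the binary32 grid.
noncomputable def fp32ExpSketch (x : FP32) : ℝ :=
  FP32.fp32Round (Real.exp (FP32.toReal x))
```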
The executable and native stacks are different. A CPU libm, CUDA libdevice, or vendor library
may provide an approximation with documented accuracy, but that does not automatically become a
Lean theorem. TorchLean therefore keeps transcendental claims explicit:
- use FP32/NF for proof-side rounded-real statements;
- use IEEE32Exec for deterministic executable behavior where the operation is defined in Lean;
- use a runtime or CUDA contract when an external library computes the value.
This is the same lesson emphasized by the floating-point verification literature: compiler, library, and hardware choices are part of the semantics unless they are isolated behind a proof or contract.
5.1.9. Runtime And CUDA Boundaries
Lean's runtime Float32 operations are external runtime calls. Lean documents its Float operations as IEEE-style opaque operations that do not reduce in the kernel and that compile to C operators. The same concern applies here: an external operation can be used, tested, and bounded by an assumption, but it is not automatically a proof object.
TorchLean names that assumption surface:
```lean
import NN.Floats.IEEEExec.BridgeInitFloat32

open TorchLean.Floats.IEEE754
open TorchLean.Floats.IEEE754.Float32Bridge

#check RuntimeFloat32MatchesIEEE32Exec
#check RuntimeFloat32MatchesIEEE32Exec.toIEEE32Exec_add
#check RuntimeFloat32MatchesIEEE32Exec.toIEEE32Exec_mul
#check RuntimeFloat32MatchesIEEE32Exec.toIEEE32Exec_sqrt
```
The class says, operation by operation, that runtime float32 bits match the IEEE32Exec reference
bits. Once that bridge is supplied, downstream proofs can reuse the IEEE32Exec and FP32
theorems. Without it, a runtime result is evidence produced by an implementation, not a theorem
about the Lean-defined scalar.
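The shape of such a class can be sketched as one bit-level equation per operation. The class and field names here are illustrative, not the library's literal definition, and Float32.toBits is assumed to expose the raw UInt32.

```lean
-- Sketch: runtime bits must equal the reference bits, operation by operation.
class RuntimeAgreementSketch : Prop where
  add_bits : ∀ a b : Float32,
    (a + b).toBits =
      (IEEE32Exec.add (IEEE32Exec.ofBits a.toBits)
                      (IEEE32Exec.ofBits b.toBits)).toBits
```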
CUDA uses the same idea at the native boundary:
```lean
import NN.Runtime.Autograd.Engine.Cuda.Float32Contract
import NN.Runtime.Autograd.Engine.Cuda.KernelSpec

open Runtime.Autograd.Cuda.Float32Contract

#check NativePrimitiveAgreement
#check native_add_abs_error_of_isFinite
#check native_mul_abs_error_of_isFinite
#check native_div_abs_error_of_isFinite
#check native_fma_abs_error_of_isFinite
#check native_sqrt_abs_error_of_isFinite
```
For the engineering details, read GPU and CUDA Boundaries. This page only fixes the scalar meaning: native arithmetic is connected to proofs through explicit bit-agreement contracts, fixed reduction specifications, and finite-path hypotheses.
5.1.10. Concrete Bit Patterns
IEEE32Exec is useful because special values are concrete:
```lean
import NN.Floats.IEEEExec.Exec32

open TorchLean.Floats.IEEE754

def plusZero  : IEEE32Exec := IEEE32Exec.ofBits (0x00000000 : UInt32)
def minusZero : IEEE32Exec := IEEE32Exec.ofBits (0x80000000 : UInt32)
def plusInf   : IEEE32Exec := IEEE32Exec.ofBits (0x7f800000 : UInt32)
def qNaN      : IEEE32Exec := IEEE32Exec.ofBits (0x7fc00000 : UInt32)

-- Round a Float64 literal to binary32 deterministically in Lean.
def third32 : IEEE32Exec := IEEE32Exec.ofFloat (Float.ofBits 0x3fd5555555555555)
```
For a visual inspection path, open the widget examples in the Widgets API, especially #float32_view and #float32_round_view.
5.1.11. What Is Proved And What Is Assumed
Checked by Lean:
- FP32 rounded-real definitions and local error lemmas;
- IEEE32Exec bit-level definitions for the supported core operations;
- finite bridge theorems from IEEE32Exec to FP32;
- total toReal? theorems that expose NaN/Inf paths;
- interval-facing and runtime-approximation lemmas that cite these scalar models.
Named boundaries:
- Lean runtime Float32 agreement with IEEE32Exec;
- CUDA primitive and kernel agreement with the Lean contracts;
- cuBLAS, cuFFT, libdevice, compiler, driver, and GPU behavior;
- correctly-rounded claims for external transcendental libraries;
- alternative rounding modes and exception flags not modeled by the current theorem path.
That split is the claim: TorchLean treats native execution as an explicit boundary, and it makes that boundary small, named, and testable.
5.1.12. How To Choose The Right Layer
Use this rule:
- Use ℝ for ideal networks, losses, and verifier statements.
- Use FP32/NF for compositional finite-precision error bounds.
- Use IEEE32Exec for bit inspection, NaN/Inf behavior, signed zeros, and executable binary32 examples.
- Use RuntimeFloat32MatchesIEEE32Exec when runtime float results must be transported into the executable reference semantics.
- Use Float32Contract/KernelSpec when native CUDA kernels must be connected to the proof layer.
5.1.13. References
- IEEE Std 754-2019, Standard for Floating-Point Arithmetic.
- ISO/IEC/IEEE 60559:2020, the international floating-point arithmetic standard.
- David Goldberg, What Every Computer Scientist Should Know About Floating-Point Arithmetic (Oracle-hosted reprint).
- Nicholas Higham, Accuracy and Stability of Numerical Algorithms, SIAM.
- Flocq, the Coq floating-point formalization.
- FloatSpec, a Lean 4 floating-point formalization inspired by Flocq.
- Gappa, for certified floating-point bounds and proof generation.
- CompCert, for verified compilation including machine floating-point models.
- NVIDIA, Floating Point and IEEE 754 (NVIDIA floating-point guide).