# NeuralFloat metadata (training phase, named precisions)
The Flocq-style core model is "rounded arithmetic on ℝ". In TorchLean we sometimes want a little
extra structure around that core:
- a coarse notion of which part of training we're in (forward vs backward vs parameter update),
- named precision levels commonly used in ML (FP16/bfloat16/TF32/FP32/FP64).
We keep these notions in a separate file so that Core.lean can stay focused on the Flocq-style
mantissa/exponent machinery while still letting higher-level layers talk about mixed precision.
Training phases for neural networks.
This is a coarse classifier used by a few mixed-precision policies and specifications; it is not a model of the full optimizer state.
- forward : TrainingPhase
- backward : TrainingPhase
- update : TrainingPhase
- inference : TrainingPhase
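To make the shape concrete, here is a minimal Lean sketch of the classifier, using the constructor
names listed above; the `deriving` clause is illustrative rather than a claim about the actual file:

```lean
/-- Coarse training-phase classifier (sketch following the constructor list above). -/
inductive TrainingPhase where
  | forward    -- forward pass
  | backward   -- backward pass (gradients / VJPs)
  | update     -- parameter (optimizer) update step
  | inference  -- evaluation / deployment
  deriving DecidableEq, Repr
```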
Phases where we typically want to be more conservative about rounding/error:
- forward does not request extra precision.
- backward requests extra precision (more conservative bounds).
- update requests extra precision (more conservative bounds).
- inference does not request extra precision.
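A minimal sketch of how such a phase predicate can be written against the `TrainingPhase` sketch
above; the name `needsExtraPrecision` is hypothetical and may differ from the actual definition:

```lean
/-- Hypothetical name: phases that request more conservative rounding/error bounds. -/
def TrainingPhase.needsExtraPrecision : TrainingPhase → Bool
  | .forward   => false
  | .backward  => true
  | .update    => true
  | .inference => false
```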
Named precision levels commonly used in ML.
These carry the intended mantissa/exponent widths. Bit-level IEEE-754 behavior lives elsewhere
(NN/Floats/IEEEExec), and the “finite, rounding-only” float32 semantics used in most proofs lives
in NN/Floats/FP32.
- brain_float16 : NeuralPrecision
- ieee_half : NeuralPrecision
- ieee_single : NeuralPrecision
- ieee_double : NeuralPrecision
- tensor_float32 : NeuralPrecision
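A corresponding Lean sketch of the named-precision type, again using the constructor names listed
above; the comments give the usual ML reading of each name:

```lean
/-- Named precision levels (sketch following the constructor list above). -/
inductive NeuralPrecision where
  | brain_float16   -- bfloat16
  | ieee_half       -- IEEE binary16 (FP16)
  | ieee_single     -- IEEE binary32 (FP32)
  | ieee_double     -- IEEE binary64 (FP64)
  | tensor_float32  -- NVIDIA TF32 (FP32 exponent range, 10-bit stored mantissa)
  deriving DecidableEq, Repr
```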
Each named precision exposes a few informational quantities:
- Exponent bit width (informational).
- Stored mantissa (fraction) bit width (informational).
- Total bit width (sign + exponent + mantissa bits).
- A common “machine epsilon” proxy: 2^{-mantissa_bits} for binary-like formats.
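A sketch of these accessors over the `NeuralPrecision` sketch above. The function names are
placeholders; the bit widths are the standard ones for each format (exponent/mantissa: bfloat16
8/7, FP16 5/10, FP32 8/23, FP64 11/52, TF32 8/10), and ℝ is Mathlib's real type:

```lean
/-- Hypothetical accessor names; widths are the standard ones per format. -/
def NeuralPrecision.exponentBits : NeuralPrecision → Nat
  | .brain_float16  => 8
  | .ieee_half      => 5
  | .ieee_single    => 8
  | .ieee_double    => 11
  | .tensor_float32 => 8

def NeuralPrecision.mantissaBits : NeuralPrecision → Nat
  | .brain_float16  => 7
  | .ieee_half      => 10
  | .ieee_single    => 23
  | .ieee_double    => 52
  | .tensor_float32 => 10

/-- Sign + exponent + stored mantissa bits, e.g. ieee_single.totalBits = 32. -/
def NeuralPrecision.totalBits (p : NeuralPrecision) : Nat :=
  1 + p.exponentBits + p.mantissaBits

/-- Machine-epsilon proxy 2^(-mantissa_bits); requires Mathlib's ℝ. -/
noncomputable def NeuralPrecision.epsilonProxy (p : NeuralPrecision) : ℝ :=
  1 / 2 ^ p.mantissaBits
```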
Mixed-precision configuration: which named precision to use in each stage.
This is a convenience record used by a few demos/spec layers; it is not part of the Flocq format
definitions (FIX/FLX/FLT), but it gives a simple way to state “forward in FP16, gradients in
FP32”, etc.
- forward_format : NeuralPrecision
  Precision used for the forward pass.
- backward_format : NeuralPrecision
  Precision used for the backward pass (gradients/VJPs).
- param_format : NeuralPrecision
  Precision used for stored parameters (weights/biases).
- grad_format : NeuralPrecision
  Precision used for accumulated gradients.
- loss_format : NeuralPrecision
  Precision used for the scalar loss / reductions.
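A sketch of the record; the structure name `PrecisionConfig` is a placeholder, while the field
names and types follow the list above:

```lean
/-- Hypothetical structure name; fields follow the list above. -/
structure PrecisionConfig where
  forward_format  : NeuralPrecision
  backward_format : NeuralPrecision
  param_format    : NeuralPrecision
  grad_format     : NeuralPrecision
  loss_format     : NeuralPrecision
  deriving Repr
```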
A conservative default used by TorchLean demos:
- FP16 forward (for speed),
- FP32 for gradients/params/loss (for stability).
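Expressed against the sketches above (treating the backward pass as also running in FP32, an
assumption consistent with “FP32 for gradients”):

```lean
/-- Sketch of the conservative default: FP16 forward, FP32 for everything else. -/
def PrecisionConfig.default : PrecisionConfig where
  forward_format  := .ieee_half
  backward_format := .ieee_single  -- assumption: gradients computed in FP32
  param_format    := .ieee_single
  grad_format     := .ieee_single
  loss_format     := .ieee_single
```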