Normalization analysis properties #
This file records theorem-level properties of TorchLean's normalization specs. The executable and
spec definitions live in `NN.Spec.Layers.Normalization`; this file belongs under
`NN.Proofs.Analysis` because it proves algebraic facts about those definitions over ℝ.
Current focus:
- BatchNorm inference is affine in the input `x` when the running statistics are fixed.
This matters because inference-time BatchNorm can be folded into a preceding or following affine layer, which helps verification, bound propagation, and graph simplification. Training-mode BatchNorm has batch-dependent statistics and is intentionally not covered by this theorem.
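To see why folding works, note that composing the affine BatchNorm map with a linear layer yields another affine map. A sketch, where `W` and `c` stand for a hypothetical following linear layer (not names from this file) and `s`, `t` are the BatchNorm scale and bias:

$$
W(x \odot s + t) + c \;=\; \bigl(W \operatorname{diag}(s)\bigr)\,x \;+\; \bigl(W t + c\bigr),
$$

so the fused layer has weight $W \operatorname{diag}(s)$ and bias $W t + c$, and the BatchNorm node disappears from the graph.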
BatchNorm inference is affine #
At inference time, BatchNorm uses fixed running mean and variance, so the resulting function is affine:
`y = x ⊙ scale + bias`.
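Spelled out in terms of the running statistics, with $\mu$ the running mean, $\sigma^2$ the running variance (clamped as the spec does), and $\gamma$, $\beta$ the learned channel parameters:

$$
\text{scale} = \frac{\gamma}{\sqrt{\sigma^2 + \varepsilon}},
\qquad
\text{bias} = \beta - \mu \odot \text{scale},
$$

which matches the `gamma / std` and `beta - mean * (gamma / std)` terms in the theorem statement below; `std` abbreviates $\sqrt{\sigma^2 + \varepsilon}$.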
Inference-time BatchNorm is affine in the input x.
This is the public theorem users want for verification and graph simplification:
`batchNormInference x runningMean runningVar gamma beta epsilon`
is definitionally equal to the pointwise affine map
`x * (gamma / std) + (beta - mean * (gamma / std))`
after broadcasting the channel parameters to the input shape and clamping the running variance exactly as the spec does.
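A scalar (per-element) sketch of what such a theorem statement could look like. Only `batchNormInference` and the parameter names come from this file; the exact signature, the `max runningVar 0` clamp, and the `rfl` proof are assumptions made for illustration, and broadcasting over tensor shapes is elided:

```lean
-- Hedged sketch, not the actual TorchLean statement: the clamp,
-- argument order, and scalar signature are assumptions.
theorem batchNormInference_affine
    (x runningMean runningVar gamma beta epsilon : ℝ) :
    batchNormInference x runningMean runningVar gamma beta epsilon =
      let std := Real.sqrt (max runningVar 0 + epsilon)
      x * (gamma / std) + (beta - runningMean * (gamma / std)) := by
  -- Definitional equality: unfolding the spec closes the goal.
  rfl
```

Because the equality is definitional, downstream verification code can rewrite with it freely, replacing the BatchNorm node by its affine normal form before bound propagation.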