BatchNorm operator bounds (IBP + affine) #
This file bounds inference-time BatchNorm. Since inference-time BatchNorm is an affine transformation (with frozen statistics), both IBP and affine propagation are exact (componentwise).
At inference time,
y = γ * (x - μ) / sqrt(σ² + ε) + β,
so the layer reduces to y = scale * x + offset, where
scale = γ / sqrt(σ² + ε) and offset = β - γ * μ / sqrt(σ² + ε).
References:
- Ioffe and Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift", ICML 2015.
- PyTorch analogue:
torch.nn.BatchNorm1d/2d/3din evaluation mode.
Parameters for BatchNorm layer (frozen at inference).
- dim : ℕ
Number of channels/features
- running_mean : Spec.Tensor α (Spec.Shape.dim self.dim Spec.Shape.scalar)
Running mean μ
- running_var : Spec.Tensor α (Spec.Shape.dim self.dim Spec.Shape.scalar)
Running variance σ²
- gamma : Spec.Tensor α (Spec.Shape.dim self.dim Spec.Shape.scalar)
Learnable scale γ
- beta : Spec.Tensor α (Spec.Shape.dim self.dim Spec.Shape.scalar)
Learnable bias β
- eps : α
Small constant for numerical stability
Instances For
Compute the equivalent affine scale: γ / √(σ² + ε)
Instances For
Compute the equivalent affine offset: β - γ * μ / √(σ² + ε)
Instances For
IBP for BatchNorm: since BN is affine, we can compute exact bounds. y = scale * x + offset, so:
- If scale > 0: y_lo = scale * x_lo + offset, y_hi = scale * x_hi + offset
- If scale < 0: y_lo = scale * x_hi + offset, y_hi = scale * x_lo + offset
Instances For
Affine bounds for BatchNorm propagation. Since BN is itself affine, we simply compose the affine forms: If prev = A_prev * x_in + c_prev and BN = scale * · + offset Then composed = scale * (A_prev * x_in + c_prev) + offset = diag(scale) * A_prev * x_in + (scale * c_prev + offset)
Instances For
Derivative bounds for BatchNorm: since BN is affine, d(BN)/dx = scale (constant). Given input derivative bounds [dlo, dhi], output = scale * [dlo, dhi].
Instances For
Second derivative of BatchNorm is zero (affine function).
Instances For
BatchNorm IBP produces a valid Box structure.
BatchNorm affine transformation preserves structure.
BatchNorm derivative IBP produces valid Box.
BatchNorm second derivative is zero (affine function).