TorchLean API

NN.MLTheory.CROWN.Operators.Batchnorm

BatchNorm operator bounds (IBP + affine) #

This file bounds inference-time BatchNorm. Since inference-time BatchNorm is an affine transformation (with frozen statistics), both IBP and affine propagation are exact (componentwise).

At inference time, y = γ * (x - μ) / sqrt(σ² + ε) + β, so the layer reduces to y = scale * x + offset, where scale = γ / sqrt(σ² + ε) and offset = β - γ * μ / sqrt(σ² + ε).

References:

Parameters for BatchNorm layer (frozen at inference).

Instances For

    Compute the equivalent affine scale: γ / √(σ² + ε)

    Instances For

      Compute the equivalent affine offset: β - γ * μ / √(σ² + ε)

      Instances For

        IBP for BatchNorm: since BN is affine, we can compute exact bounds. y = scale * x + offset, so:

        • If scale > 0: y_lo = scale * x_lo + offset, y_hi = scale * x_hi + offset
        • If scale < 0: y_lo = scale * x_hi + offset, y_hi = scale * x_lo + offset
        Instances For
          def NN.MLTheory.CROWN.Operators.Batchnorm.affBatchnorm {α : Type} [Context α] {inDim : } (params : BatchNormParams α) (aff : AffineVec α inDim params.dim) :
          AffineVec α inDim params.dim

          Affine bounds for BatchNorm propagation. Since BN is itself affine, we simply compose the affine forms: If prev = A_prev * x_in + c_prev and BN = scale * · + offset Then composed = scale * (A_prev * x_in + c_prev) + offset = diag(scale) * A_prev * x_in + (scale * c_prev + offset)

          Instances For

            Derivative bounds for BatchNorm: since BN is affine, d(BN)/dx = scale (constant). Given input derivative bounds [dlo, dhi], output = scale * [dlo, dhi].

            Instances For

              Second derivative of BatchNorm is zero (affine function).

              Instances For

                BatchNorm IBP produces a valid Box structure.

                theorem NN.MLTheory.CROWN.Operators.Batchnorm.Theorems.aff_batchnorm_returns_affine {α : Type} [Context α] {inDim : } (params : BatchNormParams α) (aff : AffineVec α inDim params.dim) :
                ∃ (A' : Spec.Tensor α (Spec.Shape.dim params.dim (Spec.Shape.dim inDim Spec.Shape.scalar))) (c' : Spec.Tensor α (Spec.Shape.dim params.dim Spec.Shape.scalar)), (affBatchnorm params aff).A = A' (affBatchnorm params aff).c = c'

                BatchNorm affine transformation preserves structure.

                BatchNorm derivative IBP produces valid Box.

                theorem NN.MLTheory.CROWN.Operators.Batchnorm.Theorems.deriv2_batchnorm_is_zero {α : Type} [Context α] (params : BatchNormParams α) (d2B : Box α (Spec.Shape.dim params.dim Spec.Shape.scalar)) :
                have result := deriv2Batchnorm params d2B; result.lo = result.hi

                BatchNorm second derivative is zero (affine function).