TorchLean API

NN.Examples.BugZoo.ShapeAndBroadcast

BugZoo: shape and broadcasting boundaries #

This file records the TorchLean response to a very common bug class: tensor shape mistakes that would normally appear only at runtime, or worse, silently broadcast to the wrong expression.

The empirical motivation is concrete. Wu, Shen, and Chen build SFData, a corpus of 146 crashing tensor-shape bugs, and use missing batch dimensions as a representative pattern:

Wang et al. also describe numerical bugs where a missing keep_dims=True changes a reduction shape; NumPy/PyTorch-style broadcasting then makes a later expression typecheck while computing the wrong loss:

TorchLean makes this case explicit: ordinary elementwise ops require the same shape, and broadcasting requires Shape.CanBroadcastTo evidence. The examples below show the intended workflow. A missing batch dimension is not silently accepted; we write the singleton batch insertion. A reduced vector is not silently expanded back into a matrix; we carry the broadcast proof.

Bug-shaped PyTorch sketch:

# Crashes or silently changes later code depending on where it appears:
image = torch.randn(100, 100, 3)
model(image)          # model expected [1, 100, 100, 3]

# Easy to miss: reduction drops a dimension, then a later op broadcasts it back.
row_sum = x.sum(dim=0)          # [3], not [1, 3]
loss = ((x - row_sum) ** 2).sum()

TorchLean equivalent:

def batched := addSingletonBatch image
def row := reduceRows x
def explicit := broadcastRowToMatrix row

The important part is not the syntax; it is that the shape change and broadcast are named terms with types and proof evidence.

@[reducible, inline]
Instances For

    Insert an explicit singleton batch dimension.

    This is the TorchLean version of the fix for the classic “forgot the batch axis” bug: we do not let Tensor α [100,100,3] masquerade as Tensor α [1,100,100,3]; the user has to name the reshape.

    Instances For
      @[simp]

      Reading the only batch entry after addSingletonBatch gives back the original image.

      @[reducible, inline]
      Instances For
        @[reducible, inline]
        Instances For

          Sum over the outer axis of a 2 × 3 tensor, dropping that axis and producing a row vector.

          Instances For

            Evidence that a row vector can be broadcast back across the outer dimension of a 2 × 3 matrix.

            This is exactly the piece TorchLean wants users and proof scripts to make visible: if a reduction dropped a dimension, any later expansion is an explicit broadcast, not an accidental side effect.

            Instances For

              Broadcast a row vector to every row of a 2 × 3 matrix, using the evidence above.

              Instances For
                @[simp]

                The first row of an explicit broadcast is definitionally the original row.