Proof-Linked Session: Convolution and Attention Operations #
N-D convolution for channels-first tensors (inC, spatial...) (no batch axis).
Kernel layout is (outC, inC, kernelSpatial...), bias is (outC).
PyTorch comparison: torch.nn.functional.conv{d}d specialized to a single sample.
Instances For
N-D transpose convolution for channels-first tensors (inC, spatial...) (no batch axis).
Kernel layout is (inC, outC, kernelSpatial...) (PyTorch convention), bias is (outC).
PyTorch comparison: torch.nn.functional.conv_transpose{d}d specialized to a single sample.
Instances For
2D convolution for channel-first images (inC, inH, inW) (no batch axis).
Type-level shapes fix the kernel layout (outC, inC, kH, kW) and output spatial dimensions derived
from stride and padding.
PyTorch comparison: torch.nn.functional.conv2d (conceptually), specialized to a single image.
Instances For
2D transpose convolution for channel-first images (inC, inH, inW) (no batch axis).
Kernel layout matches the spec/PyTorch convention (inC, outC, kH, kW).
PyTorch comparison: torch.nn.functional.conv_transpose2d specialized to a single image.
Instances For
Multi-head self-attention.
This is a shape-specialized attention primitive used by transformer-style examples:
- input
xhas shape(n, dModel) wq,wk,wvmapdModel → numHeads*headDimwomapsnumHeads*headDim → dModel- optional
maskis a boolean(n,n)attention mask
PyTorch comparison: similar to torch.nn.MultiheadAttention / scaled dot-product attention, but
encoded in a fully typed IR for compilation/proof linkage.