Proof-Linked Session: Convolution and Attention Operations #

source

def Runtime.Autograd.Torch.Internal.SessionIR.conv {α : Type} (s : SessionIR α) [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] [DecidableEq Spec.Shape] {d inC outC : ℕ} {kernel stride padding inSpatial : Vector ℕ d} {hInC : inC ≠ 0} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} (w : TensorRef α (Spec.Shape.ofList (outC :: inC :: kernel.toList))) (b : TensorRef α (Spec.Shape.dim outC Spec.Shape.scalar)) (x : TensorRef α (Spec.Shape.ofList (inC :: inSpatial.toList))) :

IO (TensorRef α (Spec.Shape.ofList (outC :: (Spec.convOutSpatial inSpatial kernel stride padding).toList)))

N-D convolution for channels-first tensors (inC, spatial...) (no batch axis).

Kernel layout is (outC, inC, kernelSpatial...), bias is (outC).

PyTorch comparison: torch.nn.functional.conv{d}d specialized to a single sample.

Instances For

source

def Runtime.Autograd.Torch.Internal.SessionIR.convTranspose {α : Type} (s : SessionIR α) [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] [DecidableEq Spec.Shape] {d inC outC : ℕ} {kernel stride padding inSpatial : Vector ℕ d} {hInC : inC ≠ 0} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} (w : TensorRef α (Spec.Shape.ofList (inC :: outC :: kernel.toList))) (b : TensorRef α (Spec.Shape.dim outC Spec.Shape.scalar)) (x : TensorRef α (Spec.Shape.ofList (inC :: inSpatial.toList))) :

IO (TensorRef α (Spec.Shape.ofList (outC :: (Spec.convTransposeOutSpatial inSpatial kernel stride padding).toList)))

N-D transpose convolution for channels-first tensors (inC, spatial...) (no batch axis).

Kernel layout is (inC, outC, kernelSpatial...) (PyTorch convention), bias is (outC).

PyTorch comparison: torch.nn.functional.conv_transpose{d}d specialized to a single sample.

Instances For

source

def Runtime.Autograd.Torch.Internal.SessionIR.conv2d {α : Type} (s : SessionIR α) [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] [DecidableEq Spec.Shape] {inC outC kH kW stride padding inH inW : ℕ} {h1 : inC ≠ 0} {h2 : kH ≠ 0} {h3 : kW ≠ 0} (kernel : TensorRef α (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))) (bias : TensorRef α (Spec.Shape.dim outC Spec.Shape.scalar)) (input : TensorRef α (Spec.Shape.dim inC (Spec.Shape.dim inH (Spec.Shape.dim inW Spec.Shape.scalar)))) :

IO (TensorRef α (Spec.Shape.dim outC (Spec.Shape.dim ((inH + 2 * padding - kH) / stride + 1) (Spec.Shape.dim ((inW + 2 * padding - kW) / stride + 1) Spec.Shape.scalar))))

2D convolution for channel-first images (inC, inH, inW) (no batch axis).

Type-level shapes fix the kernel layout (outC, inC, kH, kW) and output spatial dimensions derived from stride and padding. PyTorch comparison: torch.nn.functional.conv2d (conceptually), specialized to a single image.

Instances For

source

def Runtime.Autograd.Torch.Internal.SessionIR.convTranspose2d {α : Type} (s : SessionIR α) [Context α] [DecidableEq Spec.Shape] {inC outC kH kW stride padding inH inW : ℕ} {h1 : inC ≠ 0} {h2 : kH ≠ 0} {h3 : kW ≠ 0} (kernel : TensorRef α (Spec.Shape.dim inC (Spec.Shape.dim outC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))) (bias : TensorRef α (Spec.Shape.dim outC Spec.Shape.scalar)) (input : TensorRef α (Spec.Shape.dim inC (Spec.Shape.dim inH (Spec.Shape.dim inW Spec.Shape.scalar)))) :

IO (TensorRef α (Spec.Shape.dim outC (Spec.Shape.dim ((inH - 1) * stride - 2 * padding + kH) (Spec.Shape.dim ((inW - 1) * stride - 2 * padding + kW) Spec.Shape.scalar))))

2D transpose convolution for channel-first images (inC, inH, inW) (no batch axis).

Kernel layout matches the spec/PyTorch convention (inC, outC, kH, kW). PyTorch comparison: torch.nn.functional.conv_transpose2d specialized to a single image.

Instances For

source

def Runtime.Autograd.Torch.Internal.SessionIR.multiHeadAttention {α : Type} (s : SessionIR α) [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] [DecidableEq Spec.Shape] {n numHeads dModel headDim : ℕ} (h1 : n ≠ 0) (wq wk wv : TensorRef α (Spec.Shape.dim dModel (Spec.Shape.dim (numHeads * headDim) Spec.Shape.scalar))) (wo : TensorRef α (Spec.Shape.dim (numHeads * headDim) (Spec.Shape.dim dModel Spec.Shape.scalar))) (x : TensorRef α (Spec.Shape.dim n (Spec.Shape.dim dModel Spec.Shape.scalar))) (mask : Option (Spec.Tensor Bool (Spec.Shape.dim n (Spec.Shape.dim n Spec.Shape.scalar))) := none) :

IO (TensorRef α (Spec.Shape.dim n (Spec.Shape.dim dModel Spec.Shape.scalar)))

Multi-head self-attention.

This is a shape-specialized attention primitive used by transformer-style examples:

input x has shape (n, dModel)
wq, wk, wv map dModel → numHeads*headDim
wo maps numHeads*headDim → dModel
optional mask is a boolean (n,n) attention mask

PyTorch comparison: similar to torch.nn.MultiheadAttention / scaled dot-product attention, but encoded in a fully typed IR for compilation/proof linkage.

Instances For

TorchLean API

NN.Runtime.Autograd.Torch.LinkedSession.ConvAttention

Proof-Linked Session: Convolution and Attention Operations #