Matrix tape nodes

Matrix multiplication, transpose, row/column broadcasting, and row means, with VJP correctness facts stated at the vectorized tape level.

@[reducible, inline]

abbrev Proofs.Autograd.TapeNodes.Matmul.matSize (m n : ℕ) :

ℕ

Flattened size of an m×n matrix shape: Shape.size (.dim m (.dim n .scalar)) = m*n.
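To make the size formula concrete, here is a minimal sketch of how a shape type with this behavior could look. The names `Sketch.Shape` and `Sketch.Shape.size` are hypothetical stand-ins written for this note, not the library's `Spec.Shape`, and the recursion shown is an assumption chosen to match the stated equation.

```lean
namespace Sketch

-- Hypothetical stand-in for Spec.Shape: a scalar, or a leading dimension on top of a shape.
inductive Shape where
  | scalar : Shape
  | dim : Nat → Shape → Shape

-- Assumed size function: a scalar has one entry; a dimension multiplies the entry count.
def Shape.size : Shape → Nat
  | .scalar => 1
  | .dim n s => n * s.size

-- A 2×3 matrix shape has 6 entries.
example : Shape.size (.dim 2 (.dim 3 .scalar)) = 6 := rfl

-- In general the size unfolds to m * (n * 1), which simp normalizes to m * n.
example (m n : Nat) : Shape.size (.dim m (.dim n .scalar)) = m * n := by
  simp [Shape.size]

end Sketch
```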

@[reducible, inline]

abbrev Proofs.Autograd.TapeNodes.Matmul.vecSize (n : ℕ) :

ℕ

Flattened size of a length-n vector shape: Shape.size (.dim n .scalar) = n.

@[simp]

theorem Proofs.Autograd.TapeNodes.Matmul.vecSize_eq (n : ℕ) :

vecSize n = n

def Proofs.Autograd.TapeNodes.Matmul.idxMN {m n : ℕ} (i : Fin m) (j : Fin n) :

Fin (matSize m n)

Convert (i,j) coordinates into a flattened index for an m×n matrix vectorization.
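The flattening can be sketched on plain natural numbers. This assumes the row-major layout mentioned for transposeVec on this page, under which entry (i, j) of an m×n matrix sits at position i * n + j. The name `flatIdx` is hypothetical, and unlike idxMN it carries no `Fin` bounds proof.

```lean
-- Hypothetical stand-in for idxMN on plain naturals (no bounds proofs).
-- Assumption: row-major layout, so entry (i, j) is stored at i * n + j.
def flatIdx (n i j : Nat) : Nat := i * n + j

-- Entry (1, 2) of a 2×3 matrix is stored at position 5.
example : flatIdx 3 1 2 = 5 := rfl

-- Division and remainder by the column count recover the coordinates.
example : flatIdx 3 1 2 / 3 = 1 := rfl
example : flatIdx 3 1 2 % 3 = 2 := rfl
```

The last two examples mirror the roles that divNat and modNat play in matmulVec_apply.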

theorem Proofs.Autograd.TapeNodes.Matmul.toVecT_add_spec_mat {m n : ℕ} (A B : Spec.Tensor ℝ (Spec.Shape.dim m (Spec.Shape.dim n Spec.Shape.scalar))) :

toVecT (A.addSpec B) = toVecT A + toVecT B

Vectorization commutes with matrix addition: toVecT (A + B) = toVecT A + toVecT B.

noncomputable def Proofs.Autograd.TapeNodes.Matmul.matmulVec {m n p : ℕ} (a : Vec (matSize m n)) (b : Vec (matSize n p)) :

Vec (matSize m p)

Matrix multiplication as a bilinear map on flattened matrices: (m×n) × (n×p) → (m×p), acting on Vec (Shape.size ...).

@[simp]

theorem Proofs.Autograd.TapeNodes.Matmul.matmulVec_apply {m n p : ℕ} (a : Vec (matSize m n)) (b : Vec (matSize n p)) (ip : Fin (matSize m p)) :

(matmulVec a b).ofLp ip = have hp := ⋯; have i := ip.divNat; have k' := ip.modNat; have k := Fin.cast hp k'; ∑ j : Fin n, a.ofLp (idxMN i j) * b.ofLp (idxMN j k)

Matrix multiplication is developed at the vector level (flattened matrices) to integrate cleanly with CtxVec and the HasFDerivAt machinery.

PyTorch analogue: torch.matmul / @ operator on 2D tensors. https://pytorch.org/docs/stable/generated/torch.matmul.html
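The entry formula in matmulVec_apply can be mirrored by a small executable sketch. It works over `Int` arrays rather than real vectors so that it can be evaluated, and it assumes row-major storage. The name `matmulFlat` is hypothetical and not part of the library.

```lean
-- Hypothetical executable analogue of matmulVec over Int arrays (the library works over ℝ).
-- Assumption: both operands and the result are stored row-major.
def matmulFlat (m n p : Nat) (a b : Array Int) : List Int :=
  (List.range (m * p)).map fun ip =>
    let i := ip / p  -- output row, playing the role of ip.divNat
    let k := ip % p  -- output column, playing the role of ip.modNat
    -- Sum over the shared index j, as in ∑ j, a (idxMN i j) * b (idxMN j k).
    (List.range n).foldl (fun (acc : Int) (j : Nat) => acc + a[i * n + j]! * b[j * p + k]!) 0

-- [[1, 2], [3, 4]] * [[5, 6], [7, 8]] = [[19, 22], [43, 50]]
#eval matmulFlat 2 2 2 #[1, 2, 3, 4] #[5, 6, 7, 8]  -- [19, 22, 43, 50]
```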

noncomputable def Proofs.Autograd.TapeNodes.Matmul.matmulCLMRight {m n p : ℕ} (a : Vec (matSize m n)) :

Vec (matSize n p) →L[ℝ] Vec (matSize m p)

For a fixed left operand a, matmulCLMRight a is the continuous linear map b ↦ a * b.

noncomputable def Proofs.Autograd.TapeNodes.Matmul.matmulBilin {m n p : ℕ} :

Vec (matSize m n) →L[ℝ] Vec (matSize n p) →L[ℝ] Vec (matSize m p)

Continuous bilinear map for matrix multiplication on flattened vectors.

@[simp]

theorem Proofs.Autograd.TapeNodes.Matmul.matmulBilin_apply {m n p : ℕ} (a : Vec (matSize m n)) (b : Vec (matSize n p)) :

(matmulBilin a) b = matmulVec a b

theorem Proofs.Autograd.TapeNodes.Matmul.forward_eq_matmulVec {m n p : ℕ} (aV : Vec (matSize m n)) (bV : Vec (matSize n p)) :

toVecT (Spec.matMulSpec (ofVecT aV) (ofVecT bV)) = matmulVec aV bV

Spec.matMulSpec agrees with matmulVec after flattening both inputs and the output.

theorem Proofs.Autograd.TapeNodes.MatTranspose.matSize_eq_mul (m n : ℕ) :

Matmul.matSize m n = m * n

Helper: matSize m n is definitionally m * n.

def Proofs.Autograd.TapeNodes.MatTranspose.transposeEquiv (m n : ℕ) :

Fin (m * n) ≃ Fin (n * m)

Equivalence implementing matrix transpose on flattened indices.

noncomputable def Proofs.Autograd.TapeNodes.MatTranspose.transposeVec {m n : ℕ} (a : Vec (Matmul.matSize m n)) :

Vec (Matmul.matSize n m)

Transpose on flattened matrices: (m×n) flattened row-major → (n×m) flattened row-major.

Transpose is implemented as a coordinate permutation on flattened matrices.

PyTorch analogue: A.transpose(0, 1) for a 2D tensor. https://pytorch.org/docs/stable/generated/torch.transpose.html
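The coordinate permutation can be illustrated with a small executable sketch over `Int` arrays. It assumes row-major storage on both sides, as stated for transposeVec. The name `transposeFlat` is hypothetical, and the library's transposeEquiv may be defined differently in detail.

```lean
-- Hypothetical executable analogue of transposeVec over Int arrays.
-- Input: an m×n matrix, row-major. Output: its n×m transpose, row-major.
def transposeFlat (m n : Nat) (a : Array Int) : List Int :=
  (List.range (n * m)).map fun q =>
    let j := q / m  -- row of the output, which is a column of the input
    let i := q % m  -- column of the output, which is a row of the input
    a[i * n + j]!

-- [[1, 2, 3], [4, 5, 6]] transposes to [[1, 4], [2, 5], [3, 6]].
#eval transposeFlat 2 3 #[1, 2, 3, 4, 5, 6]  -- [1, 4, 2, 5, 3, 6]
```

Because the output only reorders entries, the map is linear and preserves the Euclidean norm of the flattened vector.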

noncomputable def Proofs.Autograd.TapeNodes.matrixTranspose {Γ : List Spec.Shape} {m n : ℕ} (A : Idx Γ (Spec.Shape.dim m (Spec.Shape.dim n Spec.Shape.scalar))) :

Node Γ (Spec.Shape.dim n (Spec.Shape.dim m Spec.Shape.scalar))

Tape node computing matrix transpose: (m×n) ↦ (n×m).

noncomputable def Proofs.Autograd.TapeNodes.matrixTransposeFderiv {Γ : List Spec.Shape} {m n : ℕ} (A : Idx Γ (Spec.Shape.dim m (Spec.Shape.dim n Spec.Shape.scalar))) :

NodeFDerivCorrect (matrixTranspose A)

NodeFDerivCorrect for matrixTranspose (the map is linear and isometric).

noncomputable def Proofs.Autograd.TapeNodes.matmul {Γ : List Spec.Shape} {m n p : ℕ} (A : Idx Γ (Spec.Shape.dim m (Spec.Shape.dim n Spec.Shape.scalar))) (B : Idx Γ (Spec.Shape.dim n (Spec.Shape.dim p Spec.Shape.scalar))) :

Node Γ (Spec.Shape.dim m (Spec.Shape.dim p Spec.Shape.scalar))

Matrix multiplication node on 2D tensors.

noncomputable def Proofs.Autograd.TapeNodes.matmulFderiv {Γ : List Spec.Shape} {m n p : ℕ} (A : Idx Γ (Spec.Shape.dim m (Spec.Shape.dim n Spec.Shape.scalar))) (B : Idx Γ (Spec.Shape.dim n (Spec.Shape.dim p Spec.Shape.scalar))) :

NodeFDerivCorrect (matmul A B)

NodeFDerivCorrect for the matrix-matrix multiplication node.

This packages the product rule and the dot/adjointness lemmas for Spec.matMulSpec.
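For orientation, the product rule and the adjoint identity behind this statement can be written in matrix form. The symbols dA and dB (input perturbations) and G (an upstream cotangent with the shape of AB) are notation introduced for this note, not names from the library. The inner product is the Frobenius one, which coincides with the Euclidean inner product on flattened vectors.

```latex
\begin{aligned}
D(AB)[dA,\, dB] &= dA\,B + A\,dB, \\
\langle G,\; dA\,B + A\,dB \rangle_F
  &= \langle G B^{\top},\, dA \rangle_F + \langle A^{\top} G,\, dB \rangle_F .
\end{aligned}
```

Read as a VJP, the second line says that a cotangent G is pulled back to the pair (G Bᵀ, Aᵀ G), which is the familiar backward rule for matrix multiplication.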

noncomputable def Proofs.Autograd.TapeNodes.MatrixLinear.broadcastRowCLM {m n : ℕ} :

Vec m →L[ℝ] Vec (Matmul.matSize m n)

Broadcast a vector v : Vec m across the last axis to a flattened (m×n) matrix.

noncomputable def Proofs.Autograd.TapeNodes.MatrixLinear.broadcastColCLM {m n : ℕ} :

Vec n →L[ℝ] Vec (Matmul.matSize m n)

Broadcast a vector v : Vec n across the first axis to a flattened (m×n) matrix.

noncomputable def Proofs.Autograd.TapeNodes.MatrixLinear.rowSumCLM {m n : ℕ} :

Vec (Matmul.matSize m n) →L[ℝ] Vec m

Row-wise sum: flattened (m×n) matrix → vector of length m.
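The two broadcasts and the row sum can be sketched as executable functions over `Int` arrays, again assuming row-major storage. The names `broadcastRowFlat`, `broadcastColFlat`, and `rowSumFlat` are hypothetical. The library versions are continuous linear maps over ℝ.

```lean
-- Hypothetical analogue of broadcastRowCLM: entry (i, j) of the result is v i.
def broadcastRowFlat (m n : Nat) (v : Array Int) : List Int :=
  (List.range (m * n)).map fun q => v[q / n]!

-- Hypothetical analogue of broadcastColCLM: entry (i, j) of the result is v j.
def broadcastColFlat (m n : Nat) (v : Array Int) : List Int :=
  (List.range (m * n)).map fun q => v[q % n]!

-- Hypothetical analogue of rowSumCLM: entry i of the result is the sum of row i.
def rowSumFlat (m n : Nat) (a : Array Int) : List Int :=
  (List.range m).map fun i =>
    (List.range n).foldl (fun (acc : Int) (j : Nat) => acc + a[i * n + j]!) 0

#eval broadcastRowFlat 2 3 #[7, 9]        -- [7, 7, 7, 9, 9, 9]
#eval broadcastColFlat 2 3 #[1, 2, 3]     -- [1, 2, 3, 1, 2, 3]
#eval rowSumFlat 2 3 #[1, 2, 3, 4, 5, 6]  -- [6, 15]
```

Row broadcast and row sum are adjoint to each other with respect to the Euclidean inner product, which is why one typically appears as the backward pass of the other.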

noncomputable def Proofs.Autograd.TapeNodes.MatrixLinear.rowMeanCLM {m n : ℕ} :

Vec (Matmul.matSize m n) →L[ℝ] Vec m

Row-wise mean: flattened (m×n) matrix → vector of length m.

noncomputable def Proofs.Autograd.TapeNodes.broadcastRow {Γ : List Spec.Shape} {m n : ℕ} (idx : Idx Γ (Spec.Shape.dim m Spec.Shape.scalar)) :

Node Γ (Spec.Shape.dim m (Spec.Shape.dim n Spec.Shape.scalar))

Broadcast a vector (.dim m .scalar) across columns to (.dim m (.dim n .scalar)).

noncomputable def Proofs.Autograd.TapeNodes.broadcastRowFderiv {Γ : List Spec.Shape} {m n : ℕ} (idx : Idx Γ (Spec.Shape.dim m Spec.Shape.scalar)) :

NodeFDerivCorrect (broadcastRow idx)

NodeFDerivCorrect for broadcastRow (a linear operation).

noncomputable def Proofs.Autograd.TapeNodes.broadcastCol {Γ : List Spec.Shape} {m n : ℕ} (idx : Idx Γ (Spec.Shape.dim n Spec.Shape.scalar)) :

Node Γ (Spec.Shape.dim m (Spec.Shape.dim n Spec.Shape.scalar))

Broadcast a vector (.dim n .scalar) across rows to (.dim m (.dim n .scalar)).

noncomputable def Proofs.Autograd.TapeNodes.broadcastColFderiv {Γ : List Spec.Shape} {m n : ℕ} (idx : Idx Γ (Spec.Shape.dim n Spec.Shape.scalar)) :

NodeFDerivCorrect (broadcastCol idx)

NodeFDerivCorrect for broadcastCol (a linear operation).

Shape-only nodes (reshape, flatten, and similar) live in NN.Proofs.Autograd.Tape.Nodes.Shape (namespace TapeNodes.ShapeOps).

noncomputable def Proofs.Autograd.TapeNodes.rowMean {Γ : List Spec.Shape} {m n : ℕ} (idx : Idx Γ (Spec.Shape.dim m (Spec.Shape.dim n Spec.Shape.scalar))) :

Node Γ (Spec.Shape.dim m Spec.Shape.scalar)

Row-wise mean (reduce last axis): (.dim m (.dim n .scalar)) → (.dim m .scalar).

noncomputable def Proofs.Autograd.TapeNodes.rowMeanFderiv {Γ : List Spec.Shape} {m n : ℕ} (idx : Idx Γ (Spec.Shape.dim m (Spec.Shape.dim n Spec.Shape.scalar))) :

NodeFDerivCorrect (rowMean idx)

NodeFDerivCorrect for rowMean (reduce-mean along the last axis).
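Since the row mean is linear, its derivative is the map itself and the VJP is its adjoint. The short derivation below spells this out for n > 0. The symbols a (an m×n input), g (a cotangent of length m), and the adjoint notation are introduced for this note and are not names from the library.

```latex
\begin{aligned}
(\operatorname{rowMean} a)_i &= \frac{1}{n} \sum_{j=1}^{n} a_{ij}, \\
\langle g,\ \operatorname{rowMean} a \rangle
  &= \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{g_i}{n}\, a_{ij}, \\
(\operatorname{rowMean}^{*} g)_{ij} &= \frac{g_i}{n}.
\end{aligned}
```

In words, the backward pass broadcasts the cotangent across each row and scales it by 1/n.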

NN.Proofs.Autograd.Tape.Nodes.Matrix