TorchLean API


NN.Runtime.Autograd.Engine.Cuda.Ops.Indexing

CUDA Tape Operations: Concatenation, Slicing, and Indexing

Concat / slice (1D)

def Runtime.Autograd.Cuda.Tape.concat1d {n m : ℕ} (t : Tape) (aId bId : ℕ) : Result (Tape × ℕ)

Concatenate two 1D buffers.


def Runtime.Autograd.Cuda.Tape.concatVectors {n m : ℕ} (t : Tape) (aId bId : ℕ) : Result (Tape × ℕ)

Concatenate two 1D tensors. This entry point uses the CPU tape's name for the operation.


def Runtime.Autograd.Cuda.Tape.slice1d {n start len : ℕ} (t : Tape) (xId : ℕ) : Result (Tape × ℕ)

Slice len entries from a one-dimensional CUDA buffer starting at start.
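The intended meaning of the 1D operations can be sketched on plain lists. This is a host-side reference model only: concatRef and sliceRef are hypothetical names, not part of TorchLean, and the sketch ignores the tape, node ids, and CUDA buffers entirely.

```lean
-- Hypothetical reference model on plain lists; not the TorchLean API.
def concatRef {α : Type} (a b : List α) : List α :=
  a ++ b

-- Keep `len` entries starting at position `start`.
def sliceRef {α : Type} (x : List α) (start len : Nat) : List α :=
  (x.drop start).take len

#eval concatRef [1, 2, 3] [4, 5]         -- [1, 2, 3, 4, 5]
#eval sliceRef [10, 20, 30, 40, 50] 1 3  -- [20, 30, 40]

-- Slicing both halves back out of a concatenation recovers the inputs.
example : sliceRef (concatRef [1, 2, 3] [4, 5]) 0 3 = ([1, 2, 3] : List Nat) := rfl
example : sliceRef (concatRef [1, 2, 3] [4, 5]) 3 2 = ([4, 5] : List Nat) := rfl
```

In this model a slice of a concatenation at offsets 0 and n returns the two original inputs, which is a useful sanity check for any implementation of the pair.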


Concat / slice along dim 0

def Runtime.Autograd.Cuda.Tape.concatDim0 {n m : ℕ} {s : Spec.Shape} (t : Tape) (aId bId : ℕ) : Result (Tape × ℕ)

Concatenate two tensors along dim 0. The inputs have leading dimensions n and m and share the trailing shape s (CPU tape name).


def Runtime.Autograd.Cuda.Tape.sliceRange0 {n : ℕ} {s : Spec.Shape} (t : Tape) (xId start len : ℕ) (_h : len + start ≤ n) : Result (Tape × ℕ)

Slice along dim 0: x[start:start+len] (CPU tape name).
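A minimal sketch of the dim-0 operations, assuming tensors are stored as flat row-major buffers so that each index along dim 0 owns a contiguous block of rowSize entries (the product of the dimensions in s). That layout is an assumption made for illustration, and sliceRange0Ref and concatDim0Ref are hypothetical names. The last line shows how a caller could discharge a bound of the form len + start ≤ n for concrete numbers.

```lean
-- Hypothetical model: a tensor of shape (n, s...) as a flat row-major list
-- of n * rowSize entries. Not the TorchLean API.
def sliceRange0Ref {α : Type} (x : List α) (rowSize start len : Nat) : List α :=
  (x.drop (start * rowSize)).take (len * rowSize)

-- With a row-major layout, concatenating along dim 0 appends the buffers.
def concatDim0Ref {α : Type} (a b : List α) : List α :=
  a ++ b

-- A (3, 2) tensor [[1, 2], [3, 4], [5, 6]]; x[1:3] is [[3, 4], [5, 6]].
#eval sliceRange0Ref [1, 2, 3, 4, 5, 6] 2 1 2  -- [3, 4, 5, 6]

-- A (2, 2) tensor followed by a (1, 2) tensor gives a (3, 2) tensor.
#eval concatDim0Ref [1, 2, 3, 4] [5, 6]        -- [1, 2, 3, 4, 5, 6]

-- The bound for start = 1, len = 2, n = 3 is decidable.
example : 2 + 1 ≤ 3 := by decide
```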


Gather / scatter (host Nat indices)

Indices are non-differentiable and remain on the host. Kernels totalize out-of-bounds indices as documented in NN.Runtime.Autograd.Engine.Cuda.Kernels.

def Runtime.Autograd.Cuda.Tape.gatherScalar {n : ℕ} (t : Tape) (xId : ℕ) (i : Fin n) : Result (Tape × ℕ)

Gather a scalar from a 1D vector using a compile-time index.


def Runtime.Autograd.Cuda.Tape.gatherRow {rows cols : ℕ} (t : Tape) (xId : ℕ) (i : Fin rows) : Result (Tape × ℕ)

Gather a row from a 2D matrix using a compile-time index.


def Runtime.Autograd.Cuda.Tape.gatherScalarNat {n : ℕ} (t : Tape) (xId i : ℕ) : Result (Tape × ℕ)

Gather a scalar from a 1D vector using a runtime Nat index (totalized by the kernel).
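The difference between a Fin index and a totalized Nat index can be illustrated on a host-side list. This is a sketch under stated assumptions: gatherFin and gatherNatOrZero are hypothetical names, and returning 0 for an out-of-range index is only one possible totalization. The convention the CUDA kernels actually follow is the one documented in NN.Runtime.Autograd.Engine.Cuda.Kernels.

```lean
-- Hypothetical host-side model; the real operations read CUDA buffers.

-- A `Fin` index carries a proof that it is in range, so no fallback is needed.
def gatherFin (x : List Int) (i : Fin x.length) : Int :=
  x.get i

-- A bare `Nat` index may be out of range. Returning 0 is ONE possible
-- totalization, chosen here purely for illustration.
def gatherNatOrZero (x : List Int) (i : Nat) : Int :=
  x.getD i 0

#eval gatherFin [10, 20, 30] ⟨1, by decide⟩  -- 20
#eval gatherNatOrZero [10, 20, 30] 1         -- 20
#eval gatherNatOrZero [10, 20, 30] 7         -- 0
```

The Fin variant pushes the bounds check to the caller, who must supply the proof. The Nat variant accepts any index and defines a result for all of them, which is what "totalized" refers to.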


def Runtime.Autograd.Cuda.Tape.natTensorToIndexArray {k : ℕ} (idx : Spec.Tensor ℕ (Spec.Shape.dim k Spec.Shape.scalar)) :

Convert a length-k natural-number tensor into the index array expected by CUDA gather/scatter kernels.


def Runtime.Autograd.Cuda.Tape.gatherVecNat {n k : ℕ} (t : Tape) (xId : ℕ) (idx : Spec.Tensor ℕ (Spec.Shape.dim k Spec.Shape.scalar)) : Result (Tape × ℕ)

Gather k scalars from a length-n vector.


def Runtime.Autograd.Cuda.Tape.gatherRowsNat {rows cols k : ℕ} (t : Tape) (xId : ℕ) (idx : Spec.Tensor ℕ (Spec.Shape.dim k Spec.Shape.scalar)) : Result (Tape × ℕ)

Gather k rows from a (rows, cols) matrix (row-major).
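In a row-major (rows, cols) matrix, row i occupies flat positions i * cols up to (i + 1) * cols - 1, so gathering k rows produces a (k, cols) result built from k contiguous blocks. The sketch below models this on a flat list. rowRef and gatherRowsRef are hypothetical names, and the model makes no attempt to reproduce how the kernels totalize out-of-range row indices.

```lean
-- Hypothetical model of row gathering on a flat row-major list.
-- Not the TorchLean API; out-of-range rows are not totalized here.
def rowRef {α : Type} (x : List α) (cols i : Nat) : List α :=
  (x.drop (i * cols)).take cols

-- Concatenate the selected rows in the order given by `idx`.
def gatherRowsRef {α : Type} (x : List α) (cols : Nat) (idx : List Nat) : List α :=
  idx.foldr (fun i acc => rowRef x cols i ++ acc) []

-- A (3, 2) matrix [[1, 2], [3, 4], [5, 6]] with rows 2, 0, 2 selected.
#eval gatherRowsRef [1, 2, 3, 4, 5, 6] 2 [2, 0, 2]  -- [5, 6, 1, 2, 5, 6]
```

Indices may repeat, as in the example, so the output can contain the same row more than once.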


def Runtime.Autograd.Cuda.Tape.scatterAddVec {n : ℕ} (t : Tape) (xId vId : ℕ) (i : Fin n) : Result (Tape × ℕ)

Scatter-add into a vector: out = x with out[i] += v.


def Runtime.Autograd.Cuda.Tape.scatterAddRow {rows cols : ℕ} (t : Tape) (xId vId : ℕ) (i : Fin rows) : Result (Tape × ℕ)

Scatter-add into a matrix row: out = x with out[i,:] += v.
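The two scatter-add formulas, out[i] += v and out[i,:] += v, can be sketched on host-side lists. scatterAddVecRef and scatterAddRowRef are hypothetical names, the matrix is assumed to be a flat row-major list, and plain Nat indices are used instead of Fin to keep the example short.

```lean
-- Hypothetical reference model; not the TorchLean API.

-- out = x with out[i] += v. An out-of-range i leaves x unchanged here.
def scatterAddVecRef (x : List Int) (i : Nat) (v : Int) : List Int :=
  x.set i (x.getD i 0 + v)

-- out = x with out[i, :] += v, for a flat row-major matrix with `cols` columns.
-- `v` is expected to have exactly `cols` entries.
def scatterAddRowRef (x : List Int) (cols i : Nat) (v : List Int) : List Int :=
  x.take (i * cols)
    ++ List.zipWith (· + ·) ((x.drop (i * cols)).take cols) v
    ++ x.drop ((i + 1) * cols)

#eval scatterAddVecRef [1, 2, 3] 1 10                   -- [1, 12, 3]
#eval scatterAddRowRef [1, 2, 3, 4, 5, 6] 2 1 [10, 20]  -- [1, 2, 13, 24, 5, 6]
```

Both functions return a new list and leave the input untouched, mirroring the "out = x with ..." phrasing of the descriptions above.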
