TorchLean API

NN.Runtime.Autograd.Torch.Core.Session

Eager Session and CUDA Bridge

Session state, CUDA upload/download conversions, parameter synchronization, and tape lifecycle helpers for the eager Torch-style runtime.

Internal Eager Session

The eager backend keeps one mutable CPU tape, one mutable CUDA tape, the non-differentiable Nat environment, and the map from tape leaves back to trainable parameters. The public session-style API lives in Runtime.Autograd.TorchLean.Session; this module owns the lower-level state it delegates to.

CUDA Bridge (Upload/Download)

The CUDA eager tape stores float32 device buffers (Runtime.Autograd.Cuda.Buffer), each paired with a runtime Shape to form a Runtime.Autograd.Cuda.AnyBuffer.

The Torch eager front-end still presents the spec-level Tensor α s API, so in CUDA mode we need:

upload: Tensor α s -> Cuda.AnyBuffer (float32, contiguous, row-major)
download: Cuda.AnyBuffer -> Runtime.AnyTensor α

The CudaBridge helper namespace collects these conversions in one place, which gives them stable call sites and keeps the boundary between spec-level tensors and CUDA buffers explicit.
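
The upload/download pair can be pictured with a small self-contained model. This is only a sketch under stated assumptions: ToyAnyBuffer, toyUpload, and toyDownload are hypothetical names, host arrays stand in for device memory, and only rank-2 data is handled. It shows what "contiguous, row-major" means for the wire format, not how TorchLean implements the bridge.

```lean
/-- Hypothetical stand-in for `Cuda.AnyBuffer`: flat data plus a runtime shape. -/
structure ToyAnyBuffer where
  dims : Array Nat
  data : Array Float

/-- "Upload": flatten a matrix (given as rows of length `cols`) into row-major order. -/
def toyUpload (rows : Array (Array Float)) (cols : Nat) : ToyAnyBuffer :=
  { dims := #[rows.size, cols]
    data := rows.foldl (fun acc row => acc ++ row) (#[] : Array Float) }

/-- "Download": rebuild the rows from the flat data using the stored shape. -/
def toyDownload (b : ToyAnyBuffer) : Array (Array Float) :=
  match b.dims.toList with
  | [r, c] => (Array.range r).map fun i => b.data.extract (i * c) ((i + 1) * c)
  | _ => #[]   -- this sketch only handles rank-2 buffers

-- The flat data lists the elements in the order 1, 2, 3, 4.
#eval (toyUpload #[#[1.0, 2.0], #[3.0, 4.0]] 2).data
```

The download result is shape-erased in the same spirit as AnyTensor: the caller gets back whatever shape the buffer carries and has to check it against the shape it expects.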

class Runtime.Autograd.Torch.Internal.CudaBridge.TensorConv (α : Type) :

Conversions required by the eager CUDA tape path.

toAnyBuffer {s : Spec.Shape} : Spec.Tensor α s → IO Cuda.AnyBuffer
Upload a spec tensor to a CUDA AnyBuffer (float32).
ofAnyBuffer : Cuda.AnyBuffer → IO (AnyTensor α)
Download a CUDA AnyBuffer to a runtime AnyTensor (shape-erased).
toFloat : α → IO Float
Convert a scalar constant to a host Float for CUDA kernels (e.g. scale, axpy).

Float implementation

@[implicit_reducible]

instance Runtime.Autograd.Torch.Internal.CudaBridge.instTensorConvFloat :

TensorConv Float

Float CUDA conversions: upload/download via row-major FloatArray.

@[implicit_reducible, instance 10]

instance Runtime.Autograd.Torch.Internal.CudaBridge.instTensorConv (α : Type) :

Generic CPU-preserving fallback for scalar types without a CUDA wire-format bridge.

Many TorchLean sessions are scalar-polymorphic on CPU, while the eager CUDA tape stores float32 buffers. This fallback keeps those CPU instantiations usable and fails loudly if a CUDA upload is requested for a scalar type that has no declared float32 wire representation. Add a higher-priority TensorConv α instance when a scalar type should be allowed onto the CUDA tape.
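
The priority mechanism described above can be sketched with a toy class. WireConv and its instances are hypothetical stand-ins, not the TensorConv class itself; the sketch assumes only standard Lean 4 instance priorities, where instances declared without an explicit priority use the default (1000) and are tried before a priority-10 fallback.

```lean
/-- Hypothetical stand-in for `TensorConv`: converts a scalar to the float wire format. -/
class WireConv (α : Type) where
  toWire : α → Except String Float

/-- Low-priority fallback: every scalar type has an instance, but uploads fail loudly. -/
instance (priority := 10) wireConvFallback (α : Type) : WireConv α where
  toWire _ := .error "no float32 wire format declared for this scalar type"

/-- Default-priority instance: chosen over the fallback whenever the scalar is `Float`. -/
instance : WireConv Float where
  toWire x := .ok x

/-- Report what an upload attempt would do for a given scalar. -/
def describeUpload {α : Type} [WireConv α] (x : α) : String :=
  match WireConv.toWire x with
  | .ok f => s!"uploaded {f}"
  | .error e => s!"error: {e}"

#eval describeUpload (2.5 : Float)  -- takes the `Float` instance
#eval describeUpload (3 : Nat)      -- takes the fallback and reports the error
```

Code that is polymorphic in the scalar type still type-checks for every α because the fallback always exists; the failure only surfaces if a CUDA upload is actually attempted.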

Shape helpers for CUDA kernels

def Runtime.Autograd.Torch.Internal.CudaBridge.dimsArray (s : Spec.Shape) :

Runtime dimension list as an Array Nat (outermost-first).
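
As a minimal sketch of "outermost-first", consider a toy shape type modelled on the dim/scalar constructors that appear elsewhere on this page. ToyShape is a hypothetical stand-in; the real Spec.Shape may carry more structure.

```lean
/-- Hypothetical stand-in for `Spec.Shape`. -/
inductive ToyShape where
  | scalar
  | dim (n : Nat) (rest : ToyShape)

/-- Collect the dimensions with the outermost dimension first. -/
def ToyShape.dimsArray : ToyShape → Array Nat
  | .scalar => #[]
  | .dim n rest => #[n] ++ rest.dimsArray

#eval (ToyShape.dim 2 (ToyShape.dim 3 ToyShape.scalar)).dimsArray  -- #[2, 3]
```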

def Runtime.Autograd.Torch.Internal.CudaBridge.axisMapArray {s₁ s₂ : Spec.Shape} (cb : s₁.CanBroadcastTo s₂) :

The axisMap of the broadcast witness cb, as an array.

def Runtime.Autograd.Torch.Internal.syncParamCudaToHost {α : Type} [CudaBridge.TensorConv α] {sh : Spec.Shape} [DecidableEq Spec.Shape] (p : Param α sh) :

Synchronize a CUDA-updated parameter back to its host tensor, if needed.

This synchronization point is explicit. Training hot paths keep parameters resident on device; public readback APIs call this helper before exposing parameter tensors to the Lean side.

def Runtime.Autograd.Torch.Internal.setParamCudaValue {α : Type} {sh : Spec.Shape} (p : Param α sh) (any : Cuda.AnyBuffer) :

Store/update the CUDA mirror of a parameter and mark the host tensor stale.

def Runtime.Autograd.Torch.Internal.setParamHostValue {α : Type} {sh : Spec.Shape} (p : Param α sh) (v : Spec.Tensor α sh) :

Overwrite a parameter's host value and invalidate its CUDA mirror, which is stale after the write.
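
The three helpers above amount to a small staleness protocol between a host value and a device mirror. The following is a pure-functional sketch of that protocol under stated assumptions: ToyParam and its fields are hypothetical, arrays stand in for both host tensors and device buffers, and the real Param presumably keeps this state in mutable references.

```lean
/-- Hypothetical parameter with a host value and an optional device mirror. -/
structure ToyParam where
  host : Array Float
  device : Option (Array Float) := none
  hostStale : Bool := false

/-- Analogue of `setParamCudaValue`: store the mirror and mark the host value stale. -/
def ToyParam.setDevice (p : ToyParam) (buf : Array Float) : ToyParam :=
  { p with device := some buf, hostStale := true }

/-- Analogue of `setParamHostValue`: overwrite the host value and drop the mirror. -/
def ToyParam.setHost (_p : ToyParam) (v : Array Float) : ToyParam :=
  { host := v, device := none, hostStale := false }

/-- Analogue of `syncParamCudaToHost`: copy back only when the host value is stale. -/
def ToyParam.syncToHost (p : ToyParam) : ToyParam :=
  match p.hostStale, p.device with
  | true, some buf => { p with host := buf, hostStale := false }
  | _, _ => p

-- After a device-side update, the host value is refreshed only by the explicit sync.
#eval (({ host := #[1.0] } : ToyParam).setDevice #[2.0]).syncToHost.host
```

Keeping the sync explicit means a training loop that never reads parameters back never pays for a device-to-host copy.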

structure Runtime.Autograd.Torch.Internal.EagerSession (α : Type) :

Internal eager session: a mutable runtime tape plus side tables.

This is the state needed to offer a PyTorch-like API where "tensors" are opaque references and ops mutate a hidden tape stored in an IO.Ref.

Notes:

tape stores values and backward closures (Runtime.Autograd.Tape).
paramsByLeaf remembers which tape leaf ids correspond to trainable parameters (for SGD).
nats stores non-differentiable Nat inputs used for indexing-like operations.

opts : Options
Session options controlling backend/device/kernel behavior.
tape : IO.Ref (Tape α)
CPU eager tape used when opts.useGpu = false.
cudaTape : IO.Ref Cuda.Tape
CUDA eager tape used when opts.useGpu = true.
paramsByLeaf : IO.Ref (Std.HashMap ℕ (AnyParam α))
Map from tape leaf ids to trainable parameter objects.
nats : IO.Ref (Array ℕ)
Non-differentiable integer inputs for dynamic indexing operations.
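
The overall layout, a record of IO.Ref cells that operations mutate behind opaque ids, can be sketched in a few lines. ToySession and its methods are hypothetical and heavily simplified: the tape holds bare Float values instead of tensors with backward closures, parameters are represented by their names, and there is no CUDA tape.

```lean
import Std.Data.HashMap

/-- Hypothetical, simplified analogue of `EagerSession`. -/
structure ToySession where
  tape : IO.Ref (Array Float)
  paramsByLeaf : IO.Ref (Std.HashMap Nat String)
  nats : IO.Ref (Array Nat)

/-- Allocate a session with an empty tape and empty side tables. -/
def ToySession.new : IO ToySession := do
  let tape ← IO.mkRef (#[] : Array Float)
  let paramsByLeaf ← IO.mkRef ({} : Std.HashMap Nat String)
  let nats ← IO.mkRef (#[] : Array Nat)
  return { tape, paramsByLeaf, nats }

/-- Record a parameter value as a leaf and remember which leaf id it occupies. -/
def ToySession.useParam (s : ToySession) (name : String) (value : Float) : IO Nat := do
  let leafId := (← s.tape.get).size
  s.tape.modify (·.push value)
  s.paramsByLeaf.modify (·.insert leafId name)
  return leafId

/-- Discard the tape and the side tables, as at the start of a fresh forward pass. -/
def ToySession.resetTape (s : ToySession) : IO Unit := do
  s.tape.set #[]
  s.paramsByLeaf.set {}
  s.nats.set #[]

def demo : IO Unit := do
  let s ← ToySession.new
  let w ← s.useParam "w" 0.5
  IO.println s!"leaf id of w: {w}, tape size: {(← s.tape.get).size}"
  s.resetTape
  IO.println s!"tape size after reset: {(← s.tape.get).size}"

#eval demo
```

The leaf-id map is what lets an optimizer step find the parameter objects that correspond to gradient entries after backward.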

def Runtime.Autograd.Torch.Internal.EagerSession.new {α : Type} (opts : Options := { }) :

IO (EagerSession α)

Allocate a fresh eager session with an empty tape and empty side tables.

def Runtime.Autograd.Torch.Internal.EagerSession.releaseCudaBuffer (b : Cuda.Buffer) :

Free a CUDA buffer's allocation immediately instead of waiting for finalization. The external finalizer is safe to call twice, so the later automatic finalization is harmless.

def Runtime.Autograd.Torch.Internal.EagerSession.releaseCudaAnyBuffer (b : Cuda.AnyBuffer) :

Force-release a shape-erased CUDA buffer.

CUDA optimizers only need gradients for parameter leaves, not for every intermediate tape node. CudaGradMap is the sparse representation used by long eager CUDA runs: keys are tape node ids for parameter leaves and values are device-resident cotangents with the same shape as that leaf.

@[reducible, inline]

abbrev Runtime.Autograd.Torch.Internal.EagerSession.CudaGradMap :
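
To illustrate the sparse representation, here is a minimal sketch that keeps gradients only for parameter leaves. ToyGradMap and sparsify are hypothetical; host arrays stand in for device-resident cotangents, and the concrete key and value types of CudaGradMap are not shown on this page.

```lean
import Std.Data.HashMap

/-- Hypothetical sparse gradient map: tape node id of a parameter leaf ↦ its cotangent. -/
abbrev ToyGradMap := Std.HashMap Nat (Array Float)

/-- Keep only the gradients that belong to parameter leaves. -/
def sparsify (dense : Array (Array Float)) (paramLeaves : List Nat) : ToyGradMap :=
  paramLeaves.foldl (init := {}) fun acc leafId =>
    match dense[leafId]? with
    | some g => acc.insert leafId g
    | none => acc

-- Nodes 0 and 2 are parameter leaves; node 1 is an intermediate value.
#eval (sparsify #[#[0.1], #[0.2], #[0.3]] [0, 2]).size  -- 2
```

On a long tape most nodes are intermediates, so an optimizer that only consults the map never needs their cotangents to stay allocated.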

def Runtime.Autograd.Torch.Internal.EagerSession.releaseCudaTapeNonParamValues {α : Type} (s : EagerSession α) :

Release current CUDA tape values that are not persistent parameter mirrors.

Eager CUDA training creates ephemeral buffers for forward values and backward workspace. Reset paths call this before discarding the current tape snapshot, while persistent parameter mirrors remain owned by their Param objects.

def Runtime.Autograd.Torch.Internal.EagerSession.releaseCudaTapeAfterOptimizerStep {α : Type} (s : EagerSession α) :

Release CUDA tape values after an optimizer step.

Unlike releaseCudaTapeNonParamValues, this may also release trainable parameter leaf buffers. In a CUDA optimizer step, trainable parameters have already been written to fresh persistent mirrors, so the leaf buffers from the just-consumed tape are stale. The leaf buffers of non-trainable parameters, by contrast, are still their persistent mirrors, so those stay cached.
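
The difference between the two release policies can be written down as two predicates over node kinds. This is a sketch of the policy as described in the text only; NodeKind and both functions are hypothetical, and the real helpers work on CUDA tape entries rather than tags.

```lean
/-- Hypothetical classification of CUDA tape nodes. -/
inductive NodeKind where
  | intermediate    -- forward value or backward workspace
  | trainableParam  -- leaf of a parameter that the optimizer updates
  | frozenParam     -- leaf of a parameter that does not require gradients
  deriving DecidableEq, Repr

/-- Policy of `releaseCudaTapeNonParamValues`: parameter mirrors stay alive. -/
def releasedOnReset : NodeKind → Bool
  | .intermediate => true
  | _ => false

/-- Policy of `releaseCudaTapeAfterOptimizerStep`: only frozen parameter leaves are kept. -/
def releasedAfterOptimizerStep : NodeKind → Bool
  | .frozenParam => false
  | _ => true

#eval releasedOnReset .trainableParam             -- false
#eval releasedAfterOptimizerStep .trainableParam  -- true
```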

def Runtime.Autograd.Torch.Internal.EagerSession.releaseCudaAnyBufferArray (xs : Array Cuda.AnyBuffer) :

Release a dense CUDA gradient array after an optimizer has consumed it.

def Runtime.Autograd.Torch.Internal.EagerSession.releaseCudaGradMap (xs : CudaGradMap) :

Release a sparse CUDA gradient map after an optimizer has consumed it.

def Runtime.Autograd.Torch.Internal.EagerSession.checkCudaAnyBufferSize (where_ : String) (x : Cuda.AnyBuffer) :

Check that a shape-erased CUDA buffer has the number of elements promised by its shape.
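
A check of this kind compares the buffer length with the product of the dimensions. The sketch below is an assumption about what such a check looks like, not the library's code: numElems and checkSize are hypothetical, and errors are returned through Except rather than raised in IO.

```lean
/-- Number of elements implied by a dimension array (1 for a scalar). -/
def numElems (dims : Array Nat) : Nat :=
  dims.foldl (fun acc d => acc * d) 1

/-- Fail with a message that names the call site if the length disagrees with the shape. -/
def checkSize (where_ : String) (dims : Array Nat) (len : Nat) : Except String Unit :=
  if len == numElems dims then .ok ()
  else .error s!"{where_}: buffer has {len} elements, but shape {dims} implies {numElems dims}"

#eval numElems #[2, 3]                            -- 6
#eval (checkSize "demo" #[2, 3] 6).toBool         -- true
#eval (checkSize "demo" #[2, 3] 5).toBool         -- false
```

Passing the call-site label through where_ makes a size mismatch traceable to the operation that produced the buffer.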

def Runtime.Autograd.Torch.Internal.EagerSession.ownedCudaAnyBuffer (where_ : String) (x : Cuda.AnyBuffer) :

IO Cuda.AnyBuffer

Make an owned copy of a CUDA buffer after checking its shape metadata.

def Runtime.Autograd.Torch.Internal.EagerSession.collectCudaAllocator :

Ask the native allocator to return/free pages after a large CUDA eager step.

def Runtime.Autograd.Torch.Internal.EagerSession.resetTape {α : Type} (s : EagerSession α) :

Reset the tape and side tables.

PyTorch comparison: like starting a fresh forward pass where the autograd graph is discarded.

def Runtime.Autograd.Torch.Internal.EagerSession.param {α : Type} (s : EagerSession α) {sh : Spec.Shape} (init : Spec.Tensor α sh) (name : Option String := none) (requiresGrad : Option Bool := none) :

IO (Param α sh)

Create a mutable parameter object (not yet on the tape).

To record this parameter on the session tape, call use, which reads the parameter and records it as a leaf.

def Runtime.Autograd.Torch.Internal.EagerSession.getValue {α : Type} [CudaBridge.TensorConv α] (s : EagerSession α) {sh : Spec.Shape} [DecidableEq Spec.Shape] (x : TensorRef α sh) :

IO (Spec.Tensor α sh)

Read back the concrete tensor value stored at a TensorRef.

The read is checked dynamically: the id must exist on the tape, and the stored shape must match sh.
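
The two conditions can be sketched as a lookup that fails with a descriptive error. Stored and getChecked are hypothetical; shapes are modelled as plain dimension lists, and the result is reported through Except instead of IO.

```lean
/-- Hypothetical tape entry: a runtime shape plus flat data. -/
structure Stored where
  dims : List Nat
  data : Array Float

/-- Return the data only if the id is on the tape and the stored shape is the expected one. -/
def getChecked (tape : Array Stored) (id : Nat) (expected : List Nat) :
    Except String (Array Float) :=
  match tape[id]? with
  | none => .error s!"tensor id {id} is not on the tape"
  | some v =>
    if v.dims == expected then .ok v.data
    else .error s!"shape mismatch at id {id}: stored {v.dims}, expected {expected}"

def demoTape : Array Stored := #[{ dims := [2], data := #[1.0, 2.0] }]

#eval (getChecked demoTape 0 [2]).toBool  -- true
#eval (getChecked demoTape 0 [3]).toBool  -- false: shape mismatch
#eval (getChecked demoTape 7 [2]).toBool  -- false: unknown id
```

The check is needed because a reference carries its shape only at the type level, while the tape stores shape-erased values.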

def Runtime.Autograd.Torch.Internal.EagerSession.const {α : Type} [CudaBridge.TensorConv α] (s : EagerSession α) {sh : Spec.Shape} [DecidableEq Spec.Shape] (v : Spec.Tensor α sh) (name : Option String := none) :

IO (TensorRef α sh)

Record a constant leaf (non-differentiable) on the tape.

PyTorch comparison: like constructing a tensor with requires_grad=False.

def Runtime.Autograd.Torch.Internal.EagerSession.detach {α : Type} [CudaBridge.TensorConv α] (s : EagerSession α) {sh : Spec.Shape} [DecidableEq Spec.Shape] (x : TensorRef α sh) (name : Option String := none) :

IO (TensorRef α sh)

Stop-gradient boundary.

Forward semantics: identity (detach(x) = x). Backward semantics: no gradient flows to x.
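
The effect of a stop-gradient boundary is easiest to see on a scalar example. The sketch below uses forward-mode dual numbers, which is a different technique from the reverse-mode tape used by the session; it is chosen only because it fits in a few lines. Dual and its operations are hypothetical.

```lean
/-- A value together with its derivative with respect to a single input. -/
structure Dual where
  val : Float
  grad : Float

/-- Product rule. -/
def Dual.mul (a b : Dual) : Dual :=
  { val := a.val * b.val, grad := a.grad * b.val + a.val * b.grad }

/-- Stop-gradient: the value passes through unchanged, the derivative is cut to zero. -/
def Dual.detach (a : Dual) : Dual :=
  { a with grad := 0.0 }

def x : Dual := { val := 3.0, grad := 1.0 }

#eval (x.mul x).grad         -- derivative of x * x at 3 is 6
#eval (x.mul x.detach).grad  -- derivative of x * detach(x) at 3 is 3
#eval (x.mul x.detach).val   -- the forward value is still 9
```

In the second case the detached factor is treated as a constant, so only the first factor contributes to the derivative.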

def Runtime.Autograd.Torch.Internal.EagerSession.randUniform {α : Type} [Context α] [CudaBridge.TensorConv α] (s : EagerSession α) {sh : Spec.Shape} [DecidableEq Spec.Shape] (seed : ℕ) (name : Option String := none) :

IO (TensorRef α sh)

Deterministic U[0,1) tensor generator (seeded).

def Runtime.Autograd.Torch.Internal.EagerSession.bernoulliMask {α : Type} [Context α] [CudaBridge.TensorConv α] (s : EagerSession α) {sh : Spec.Shape} [DecidableEq Spec.Shape] (keepProb : TensorRef α Spec.Shape.scalar) (seed : ℕ) (name : Option String := none) :

IO (TensorRef α sh)

Deterministic {0,1} mask generator (seeded) with a scalar keep-probability input.
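
Seeded determinism means the same seed always yields the same tensor. The following sketch illustrates this with Lean's built-in StdGen; it is not the generator TorchLean uses, the helper names are hypothetical, and the output is a flat list rather than a shaped tensor.

```lean
/-- Draw `n` pseudo-random values in [0, 1) from a generator state. -/
def uniformAux : StdGen → Nat → List Float
  | _, 0 => []
  | g, k + 1 =>
    let (r, g') := randNat g 0 999999
    (r.toFloat / 1000000.0) :: uniformAux g' k

/-- Seeded uniform samples: a pure function of the seed and the length. -/
def toyRandUniform (seed n : Nat) : List Float :=
  uniformAux (mkStdGen seed) n

/-- Seeded {0,1} mask: 1 where the uniform sample falls below the keep probability. -/
def toyBernoulliMask (keepProb : Float) (seed n : Nat) : List Float :=
  (toyRandUniform seed n).map fun u => if u < keepProb then 1.0 else 0.0

#eval toyRandUniform 42 3 == toyRandUniform 42 3  -- true: same seed, same values
#eval toyBernoulliMask 0.8 42 5
```

A mask built this way can be multiplied into activations for dropout, and replaying the same seed reproduces the same mask.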

def Runtime.Autograd.Torch.Internal.EagerSession.use {α : Type} [CudaBridge.TensorConv α] (s : EagerSession α) {sh : Spec.Shape} [DecidableEq Spec.Shape] (p : Param α sh) :

IO (TensorRef α sh)

Use a parameter on the tape by recording its current value as a leaf.

The returned TensorRef is the handle you pass to ops. The leaf id is stored in paramsByLeaf so optimizer steps (e.g. SGD) can update parameters after backward. PyTorch comparison: like using a torch.nn.Parameter in a forward pass (it becomes a leaf in the autograd graph).

def Runtime.Autograd.Torch.Internal.EagerSession.input {α : Type} [CudaBridge.TensorConv α] (s : EagerSession α) {sh : Spec.Shape} [DecidableEq Spec.Shape] (v : Spec.Tensor α sh) (name : Option String := none) (requiresGrad : Bool := false) :

IO (TensorRef α sh)

Record an external input tensor as a leaf on the tape.

PyTorch comparison: like introducing a tensor into the autograd graph with a chosen requires_grad flag.

def Runtime.Autograd.Torch.Internal.EagerSession.inputNat {α : Type} (s : EagerSession α) (v : ℕ) :

Record a non-differentiable Nat input in the session environment.

This supports ops that depend on indices/labels that should not receive gradients.

def Runtime.Autograd.Torch.Internal.EagerSession.getNat {α : Type} (s : EagerSession α) (r : NatRef) :

Read a previously recorded NatRef.

def Runtime.Autograd.Torch.Internal.EagerSession.setNat {α : Type} (s : EagerSession α) (r : NatRef) (v : ℕ) :

Overwrite a previously recorded NatRef.

def Runtime.Autograd.Torch.Internal.EagerSession.natVecToArray {k : ℕ} (v : Spec.Tensor ℕ (Spec.Shape.dim k Spec.Shape.scalar)) :

Convert a Tensor Nat (.dim k .scalar) to an Array Nat.

Used to stage NatVecRef values into the session environment.

def Runtime.Autograd.Torch.Internal.EagerSession.inputNatVec {α : Type} {k : ℕ} (s : EagerSession α) (v : Spec.Tensor ℕ (Spec.Shape.dim k Spec.Shape.scalar)) :

IO (NatVecRef k)

Record a non-differentiable vector of Nats in the session environment.

Returns a NatVecRef k pointing to the stored block.
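
One way to picture a reference "pointing to the stored block" is a start offset into a flat Nat environment, with the length carried in the type. This layout is an assumption made for illustration; ToyNatVecRef, pushVec, and readVec are hypothetical and use pure arrays instead of the session's IO.Ref.

```lean
/-- Hypothetical reference to a block of `k` consecutive entries in the environment. -/
structure ToyNatVecRef (k : Nat) where
  start : Nat

/-- Append a vector to the environment and return a reference to its block. -/
def pushVec (env : Array Nat) (v : Array Nat) : Array Nat × ToyNatVecRef v.size :=
  (env ++ v, { start := env.size })

/-- Read the `k` entries that the reference points to. -/
def readVec {k : Nat} (env : Array Nat) (r : ToyNatVecRef k) : Array Nat :=
  env.extract r.start (r.start + k)

#eval
  let (env, r) := pushVec #[7] #[1, 2, 3]
  readVec env r  -- #[1, 2, 3]
```

Carrying the length k in the type of the reference means a read can always return a vector of the expected shape without storing the length next to the data.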

def Runtime.Autograd.Torch.Internal.EagerSession.getNatVec {α : Type} {k : ℕ} (s : EagerSession α) (r : NatVecRef k) :

IO (Spec.Tensor ℕ (Spec.Shape.dim k Spec.Shape.scalar))

Read back the vector stored at NatVecRef k.

def Runtime.Autograd.Torch.Internal.EagerSession.setNatVec {α : Type} {k : ℕ} (s : EagerSession α) (r : NatVecRef k) (v : Spec.Tensor ℕ (Spec.Shape.dim k Spec.Shape.scalar)) :

Overwrite the stored vector at NatVecRef k.
