TorchLean API

NN.API.Public.Seeded.Core

Seeded model builders

This module reopens NN.API.nn with the PyTorch-style seeded builder API. It sits on top of the pure builders from NN.API.Public.NN and allocates deterministic initialization seeds for users.

Model Builders and Seeding

TorchLean keeps initialization randomness explicit so examples are reproducible.

- nn.* is the default seeded builder API: layer constructors allocate initialization seeds via nn.M (a deterministic seed stream).
- nn.pure.* contains the explicit-seed constructors (proof/reproducibility-friendly).

Typical patterns:

Explicit seeds (best for proofs / reproducibility-sensitive code):
- build with nn.pure.linear ... (seedW := ...) (seedB := ...), and likewise for other layers
- compose with seq! ... / >>>
Script-style “manual seed once”:
- nn.manualSeed seed
- let seed ← nn.nextSeed
- let model := nn.run seed <| nn.Sequential![ ... ]

Note: nn.Sequential lives in Type 2, so it cannot be returned directly from IO. We keep model building pure by drawing a base seed in IO and then calling nn.run.

def NN.API.nn.manualSeed (seed : ℕ) :

PyTorch-like global seeding convenience for seeded model builders.

This sets the global seed stream used by nn.runGlobal / nn.nextSeed.

@[reducible, inline]

abbrev NN.API.nn.Embedding :

Embedding table initialization configuration (one-hot / token-distribution inputs).

This is the TorchLean-friendly analogue of torch.nn.Embedding in the common setting where token ids are represented as one-hot vectors (or soft token distributions), so lookup is a matrix multiplication rather than integer indexing.

@[reducible, inline]

abbrev NN.API.nn.LearnedPositionalEmbedding :

Learned positional embedding configuration.

This is a trainable parameter tensor of shape (seqLen × embedDim) that is broadcast across the leading batch dimension and added to the input.

@[reducible, inline]

abbrev NN.API.nn.SinusoidalPositionalEncoding :

Sinusoidal positional encoding configuration.

This is the classic (non-trainable) Transformer sinusoidal encoding, added to token embeddings. startPos is an absolute-position offset (useful for KV-cache decoding).

@[reducible, inline]

abbrev NN.API.nn.RoPE :

Rotary positional embedding (RoPE) configuration.

startPos is an absolute-position offset (useful for KV-cache decoding).

@[reducible, inline]

abbrev NN.API.nn.Conv2d :

Named-field Conv2d configuration (CHW layout).

This is the public, PyTorch-like entry point for convolution in TorchLean. PyTorch analogue: torch.nn.Conv2d. See https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html.

@[reducible, inline]

abbrev NN.API.nn.Conv :

Named-field Conv2d configuration (CHW layout).

This is the public, PyTorch-like entry point for convolution in TorchLean. PyTorch analogue: torch.nn.Conv2d. See https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html.

@[reducible, inline]

abbrev NN.API.nn.MaxPool2d :

MaxPool2d configuration for CHW inputs.

PyTorch analogue: torch.nn.MaxPool2d. See https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html.

@[reducible, inline]

abbrev NN.API.nn.MaxPool :

MaxPool2d configuration for CHW inputs.

PyTorch analogue: torch.nn.MaxPool2d. See https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html.

@[reducible, inline]

abbrev NN.API.nn.AvgPool2d :

AvgPool2d configuration for CHW inputs.

PyTorch analogue: torch.nn.AvgPool2d. See https://pytorch.org/docs/stable/generated/torch.nn.AvgPool2d.html.

@[reducible, inline]

abbrev NN.API.nn.AvgPool :

AvgPool2d configuration for CHW inputs.

PyTorch analogue: torch.nn.AvgPool2d. See https://pytorch.org/docs/stable/generated/torch.nn.AvgPool2d.html.

@[reducible, inline]

abbrev NN.API.nn.LayerNorm :

LayerNorm configuration for batched (batch × seqLen × embedDim) tensors.

PyTorch analogue: torch.nn.LayerNorm. See https://pytorch.org/docs/stable/generated/torch.nn.LayerNorm.html.

@[reducible, inline]

abbrev NN.API.nn.RMSNorm :

RMSNorm configuration for batched (batch × seqLen × embedDim) tensors.

This is a common alternative to LayerNorm in modern transformer architectures.

@[reducible, inline]

abbrev NN.API.nn.BatchNorm2d :

BatchNorm2d configuration (learned scale/shift).

PyTorch analogue: torch.nn.BatchNorm2d. See https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html.

@[reducible, inline]

abbrev NN.API.nn.InstanceNorm2d :

InstanceNorm2d configuration (learned scale/shift).

PyTorch analogue: torch.nn.InstanceNorm2d. See https://pytorch.org/docs/stable/generated/torch.nn.InstanceNorm2d.html.

@[reducible, inline]

abbrev NN.API.nn.MultiheadAttention :

Multi-head self-attention configuration.

PyTorch analogue: torch.nn.MultiheadAttention (conceptually). See https://pytorch.org/docs/stable/generated/torch.nn.MultiheadAttention.html.

def NN.API.nn.globalAvgPoolCHW (c h w : ℕ) {hC : c > 0} {hH : h > 0} {hW : w > 0} :

TorchLean.NN.Seq (Tensor.Shape.CHW c h w) (Tensor.Shape.Vec c)

Global average pooling over a CHW tensor.

PyTorch analogue: torch.nn.AdaptiveAvgPool2d((1, 1)) followed by flattening.

def NN.API.nn.globalAvgPoolNCHW (n c h w : ℕ) {hN : n > 0} {hC : c > 0} {hH : h > 0} {hW : w > 0} :

TorchLean.NN.Seq (Tensor.Shape.NCHW n c h w) (Spec.Shape.dim n (Spec.Shape.dim c Spec.Shape.scalar))

Global average pooling over an NCHW tensor (preserves the batch dimension).

Seeded Builders (Default `nn.*`)

For end-user code, the default nn.* layer constructors allocate initialization seeds automatically via nn.M (a deterministic seed-stream builder).

Use nn.pure.* when you want to pass explicit seeds (proof-friendly / fully reproducible).

@[reducible, inline]

abbrev NN.API.nn.M (α : Type u_1) :

Type u_1

Seeded builder monad: a state monad over API.rand.SeedStream.

def NN.API.nn.run {α : Type 2} (seed : ℕ) (x : M α) :

α

Run a seeded builder starting from a base seed.

def NN.API.nn.lift {α : Type 2} (x : α) :

M α

Lift a pure value into the seeded builder (consumes no seeds).

def NN.API.nn.withSeed {α : Type 2} (k : ℕ → α) :

M α

Consume one fresh seed and pass it to k.

def NN.API.nn.withSeeds2 {α : Type 2} (k : ℕ → ℕ → α) :

M α

Consume two fresh seeds and pass them to k (in order).
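The seed-stream idea behind M, run, lift, withSeed, and withSeeds2 can be illustrated with a small self-contained model. This is only a sketch: ToySeedStream, ToyM, and the toy* functions are hypothetical names, and a plain counter stands in for API.rand.SeedStream, whose real definition is not shown on this page.

```lean
-- Toy model only: these are NOT TorchLean's definitions.
-- A counter plays the role of the deterministic seed stream.
structure ToySeedStream where
  next : Nat

-- A state monad over the toy stream, mirroring the documented shape of `nn.M`.
abbrev ToyM (α : Type) := StateM ToySeedStream α

/-- Consume one fresh seed and pass it to `k`. -/
def toyWithSeed {α : Type} (k : Nat → α) : ToyM α := do
  let s ← get
  set { s with next := s.next + 1 }
  pure (k s.next)

/-- Consume two fresh seeds and pass them to `k`, in order. -/
def toyWithSeeds2 {α : Type} (k : Nat → Nat → α) : ToyM α := do
  let s ← get
  set { s with next := s.next + 2 }
  pure (k s.next (s.next + 1))

/-- Lift a pure value; the stream is left untouched. -/
def toyLift {α : Type} (x : α) : ToyM α := pure x

/-- Run a builder from a base seed and discard the final stream state. -/
def toyRun {α : Type} (seed : Nat) (x : ToyM α) : α :=
  Id.run (StateT.run' x { next := seed })

#eval toyRun 42 (do
  let a ← toyWithSeed id
  let bc ← toyWithSeeds2 (fun b c => (b, c))
  let d ← toyLift 0
  pure [a, bc.1, bc.2, d])
-- [42, 43, 44, 0]
```

Because the only source of seeds is the state threaded through the monad, running the same builder from the same base seed always hands out the same seeds in the same order.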

def NN.API.nn.relu {s : Spec.Shape} :

M (Sequential s s)

Elementwise ReLU. PyTorch analogue: torch.nn.ReLU / torch.nn.functional.relu.

@[reducible, inline]

abbrev NN.API.nn.ReLU {s : Spec.Shape} :

M (Sequential s s)

Elementwise ReLU. PyTorch analogue: torch.nn.ReLU / torch.nn.functional.relu.

def NN.API.nn.silu {s : Spec.Shape} :

M (Sequential s s)

Elementwise SiLU/Swish. PyTorch analogue: torch.nn.SiLU / torch.nn.functional.silu.

@[reducible, inline]

abbrev NN.API.nn.SiLU {s : Spec.Shape} :

M (Sequential s s)

Elementwise SiLU/Swish. PyTorch analogue: torch.nn.SiLU / torch.nn.functional.silu.

def NN.API.nn.gelu {s : Spec.Shape} :

M (Sequential s s)

Elementwise GELU. PyTorch analogue: torch.nn.GELU / torch.nn.functional.gelu.

@[reducible, inline]

abbrev NN.API.nn.GELU {s : Spec.Shape} :

M (Sequential s s)

Elementwise GELU. PyTorch analogue: torch.nn.GELU / torch.nn.functional.gelu.

def NN.API.nn.sigmoid {s : Spec.Shape} :

M (Sequential s s)

Elementwise sigmoid. PyTorch analogue: torch.nn.Sigmoid / torch.nn.functional.sigmoid.

@[reducible, inline]

abbrev NN.API.nn.Sigmoid {s : Spec.Shape} :

M (Sequential s s)

Elementwise sigmoid. PyTorch analogue: torch.nn.Sigmoid / torch.nn.functional.sigmoid.

def NN.API.nn.tanh {s : Spec.Shape} :

M (Sequential s s)

Elementwise tanh. PyTorch analogue: torch.nn.Tanh / torch.nn.functional.tanh.

@[reducible, inline]

abbrev NN.API.nn.Tanh {s : Spec.Shape} :

M (Sequential s s)

Elementwise tanh. PyTorch analogue: torch.nn.Tanh / torch.nn.functional.tanh.

def NN.API.nn.softmax {s : Spec.Shape} :

M (Sequential s s)

Softmax. PyTorch analogue: torch.nn.Softmax / torch.nn.functional.softmax.

@[reducible, inline]

abbrev NN.API.nn.Softmax {s : Spec.Shape} :

M (Sequential s s)

Softmax. PyTorch analogue: torch.nn.Softmax / torch.nn.functional.softmax.

def NN.API.nn.sum {s : Spec.Shape} :

M (Sequential s Spec.Shape.scalar)

Reduce-sum to a scalar. PyTorch analogue: torch.sum.

def NN.API.nn.flatten {s : Spec.Shape} :

M (Sequential s (Spec.Shape.dim s.size Spec.Shape.scalar))

Flatten any tensor into a 1D vector of length s.size. PyTorch analogue: torch.flatten.

@[reducible, inline]

abbrev NN.API.nn.Flatten {s : Spec.Shape} :

M (Sequential s (Spec.Shape.dim s.size Spec.Shape.scalar))

Flatten any tensor into a 1D vector of length s.size. PyTorch analogue: torch.flatten.

def NN.API.nn.flattenBatch {n : ℕ} {s : Spec.Shape} :

M (Sequential (Spec.Shape.dim n s) (Tensor.Shape.Mat n s.size))

Flatten a batched tensor of shape n × s into a matrix of shape n × s.size.

PyTorch analogue: torch.flatten(x, start_dim=1).

@[reducible, inline]

abbrev NN.API.nn.FlattenBatch {n : ℕ} {s : Spec.Shape} :

M (Sequential (Spec.Shape.dim n s) (Tensor.Shape.Mat n s.size))

Flatten a batched tensor of shape n × s into a matrix of shape n × s.size.

PyTorch analogue: torch.flatten(x, start_dim=1).

def NN.API.nn.flattenStart1 {n : ℕ} {s : Spec.Shape} :

M (Sequential (Spec.Shape.dim n s) (Tensor.Shape.Mat n s.size))

Flatten a batched tensor starting at dimension 1 (keep dim0).

Synonym for flattenBatch, matching PyTorch’s start_dim=1 wording.

def NN.API.nn.maxPool2d {n inC inH inW : ℕ} (cfg : MaxPool2d) [NeZero cfg.kH] [NeZero cfg.kW] :

M (Sequential (Tensor.Shape.Images n inC inH inW) (Tensor.Shape.Images n inC ((inH - cfg.kH) / cfg.stride + 1) ((inW - cfg.kW) / cfg.stride + 1)))

MaxPool2d using NeZero to hide nonzero kernel proofs.

def NN.API.nn.maxPool {n inC inH inW : ℕ} (cfg : MaxPool) [NeZero cfg.kH] [NeZero cfg.kW] :

M (Sequential (Tensor.Shape.Images n inC inH inW) (Tensor.Shape.Images n inC ((inH - cfg.kH) / cfg.stride + 1) ((inW - cfg.kW) / cfg.stride + 1)))

Max pooling over batched CHW images, allocating any required initialization seeds automatically.

Shorthand for maxPool2d.

def NN.API.nn.avgPool2d {n inC inH inW : ℕ} (cfg : AvgPool2d) [NeZero cfg.kH] [NeZero cfg.kW] :

M (Sequential (Tensor.Shape.Images n inC inH inW) (Tensor.Shape.Images n inC ((inH - cfg.kH) / cfg.stride + 1) ((inW - cfg.kW) / cfg.stride + 1)))

AvgPool2d over batched NCHW inputs (shape N×C×H×W, like PyTorch).

def NN.API.nn.avgPool {n inC inH inW : ℕ} (cfg : AvgPool) [NeZero cfg.kH] [NeZero cfg.kW] :

M (Sequential (Tensor.Shape.Images n inC inH inW) (Tensor.Shape.Images n inC ((inH - cfg.kH) / cfg.stride + 1) ((inW - cfg.kW) / cfg.stride + 1)))

Average pooling over batched CHW images, allocating any required initialization seeds automatically.

Shorthand for avgPool2d.

def NN.API.nn.linear (inDim outDim : ℕ) (pfx : Spec.Shape := Spec.Shape.scalar) :

M (Sequential (pfx.appendDim inDim) (pfx.appendDim outDim))

Linear layer on the last axis (prefix-shape preserving).

PyTorch analogue: torch.nn.Linear. See https://pytorch.org/docs/stable/generated/torch.nn.Linear.html.

Unlike the runtime TorchLean layer constructor (which is vector-only), this public layer constructor follows PyTorch’s convention:

If x has shape [..., inDim], linear inDim outDim returns a model whose output has shape [..., outDim].

The leading “prefix” dimensions are treated as a batch (they are flattened to (numel(prefix), inDim), the affine map is applied once, and the result is reshaped back).
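The shape bookkeeping of this flatten, apply, reshape scheme can be checked with a few lines of plain Lean. This is a sketch under assumptions: shapes are modelled as lists of naturals rather than Spec.Shape, and flatRows and linearOutShape are hypothetical helper names, not TorchLean functions.

```lean
-- Toy shape arithmetic; a shape is a list of dimension sizes here.

/-- Number of rows after flattening the prefix dimensions (their product). -/
def flatRows (pfx : List Nat) : Nat := pfx.foldl (· * ·) 1

/-- Output shape after reshaping back: the prefix followed by `outDim`. -/
def linearOutShape (pfx : List Nat) (outDim : Nat) : List Nat := pfx ++ [outDim]

#eval flatRows [2, 3]          -- 6: a (2, 3) prefix becomes 6 rows of length inDim
#eval linearOutShape [2, 3] 5  -- [2, 3, 5]
#eval flatRows []              -- 1: the scalar prefix is the vector-only case
```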

@[reducible, inline]

abbrev NN.API.nn.Linear (inDim outDim : ℕ) (pfx : Spec.Shape := Spec.Shape.scalar) :

M (Sequential (pfx.appendDim inDim) (pfx.appendDim outDim))

Linear layer on the last axis (prefix-shape preserving).

PyTorch analogue: torch.nn.Linear. See https://pytorch.org/docs/stable/generated/torch.nn.Linear.html.

Unlike the runtime TorchLean layer constructor (which is vector-only), this public layer constructor follows PyTorch’s convention:

If x has shape [..., inDim], linear inDim outDim returns a model whose output has shape [..., outDim].

The leading “prefix” dimensions are treated as a batch (they are flattened to (numel(prefix), inDim), the affine map is applied once, and the result is reshaped back).

def NN.API.nn.linearV (inDim outDim : ℕ) :

M (Sequential (Tensor.Shape.Vec inDim) (Tensor.Shape.Vec outDim))

Vector-only linear layer, specialized to the scalar prefix shape.

@[reducible, inline]

abbrev NN.API.nn.LinearV (inDim outDim : ℕ) :

M (Sequential (Tensor.Shape.Vec inDim) (Tensor.Shape.Vec outDim))

Vector-only linear layer, specialized to the scalar prefix shape.

def NN.API.nn.rnn (seqLen inputSize hiddenSize : ℕ) :

M (Sequential (Tensor.Shape.Mat seqLen inputSize) (Tensor.Shape.Mat seqLen hiddenSize))

Vanilla RNN layer (time-major sequence, no batch axis).

Semantics: h_t = tanh(W [x_t; h_{t-1}] + b), with h_{-1} = 0.

This is implemented by unrolling seqLen steps using existing TorchLean ops, so it runs on both CPU and CUDA backends.
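A scalar toy version makes the unrolled recurrence concrete. It is a sketch only: toyRnn is a hypothetical name, the weights are single floats rather than matrices, and the concatenated product W [x_t; h_{t-1}] is written out as wx * x + wh * h.

```lean
-- Toy scalar RNN, unrolled over the input list; not TorchLean's implementation.
-- h_t = tanh(wx * x_t + wh * h_{t-1} + b), starting from h_{-1} = 0.
def toyRnn (wx wh b : Float) (xs : List Float) : List Float :=
  let step := fun (acc : List Float × Float) (x : Float) =>
    let h := Float.tanh (wx * x + wh * acc.2 + b)
    (acc.1 ++ [h], h)          -- record h_t and carry it to the next step
  (xs.foldl step ([], 0.0)).1

-- One hidden state per time step, so the output length equals seqLen.
#eval (toyRnn 0.5 0.5 0.0 [1.0, 2.0, 3.0]).length  -- 3
```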

PyTorch analogue: torch.nn.RNN(inputSize, hiddenSize, nonlinearity="tanh") with batch_first=False, specialized to a single batch element.

def NN.API.nn.gru (seqLen inputSize hiddenSize : ℕ) :

M (Sequential (Tensor.Shape.Mat seqLen inputSize) (Tensor.Shape.Mat seqLen hiddenSize))

GRU layer (time-major sequence, no batch axis).

This is implemented by unrolling seqLen steps using existing TorchLean ops, so it runs on both CPU and CUDA backends.

PyTorch analogue: torch.nn.GRU(inputSize, hiddenSize) with batch_first=False, specialized to a single batch element.

def NN.API.nn.mamba (seqLen inputSize hiddenSize : ℕ) :

M (Sequential (Tensor.Shape.Mat seqLen inputSize) (Tensor.Shape.Mat seqLen hiddenSize))

Trainable Mamba-style gated diagonal state-space layer.

The layer is time-major and single-batch, matching the simple rnn/gru/lstm constructors: input (seqLen × inputSize), output (seqLen × hiddenSize). It is unrolled with differentiable TorchLean ops, so CPU and CUDA training use the same API.

def NN.API.nn.lstm (seqLen inputSize hiddenSize : ℕ) :

M (Sequential (Tensor.Shape.Mat seqLen inputSize) (Tensor.Shape.Mat seqLen hiddenSize))

LSTM layer (time-major sequence, no batch axis).

This is implemented by unrolling seqLen steps using existing TorchLean ops, so it runs on both CPU and CUDA backends.

PyTorch analogue: torch.nn.LSTM(inputSize, hiddenSize) with batch_first=False, specialized to a single batch element.

def NN.API.nn.conv2d {n inC inH inW : ℕ} (cfg : Conv2d) [NeZero inC] [NeZero cfg.kH] [NeZero cfg.kW] :

M (Sequential (Tensor.Shape.Images n inC inH inW) (Tensor.Shape.Images n cfg.outC ((inH + 2 * cfg.padding - cfg.kH) / cfg.stride + 1) ((inW + 2 * cfg.padding - cfg.kW) / cfg.stride + 1)))

2D convolution over a batched image tensor (shape N×C×H×W, like PyTorch).
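The output height and width in the signature follow the usual formula (in + 2 * padding - k) / stride + 1, computed on natural numbers. The sketch below evaluates that expression directly; convOut is a hypothetical helper, not part of TorchLean. Setting padding to 0 gives the expression used in the pooling signatures.

```lean
-- Output size along one spatial axis, as written in the conv2d signature.
def convOut (inSize k padding stride : Nat) : Nat :=
  (inSize + 2 * padding - k) / stride + 1

#eval convOut 32 3 1 1   -- 32: a 3×3 kernel with padding 1 preserves the size
#eval convOut 224 7 3 2  -- 112
#eval convOut 2 5 0 1    -- 1: Nat subtraction truncates at 0 when the kernel exceeds the input
```

The last case is worth keeping in mind: because subtraction on ℕ truncates, an oversized kernel still yields a well-typed output shape instead of an error.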

def NN.API.nn.conv {n inC inH inW : ℕ} (cfg : Conv) [NeZero inC] [NeZero cfg.kH] [NeZero cfg.kW] :

M (Sequential (Tensor.Shape.Images n inC inH inW) (Tensor.Shape.Images n cfg.outC ((inH + 2 * cfg.padding - cfg.kH) / cfg.stride + 1) ((inW + 2 * cfg.padding - cfg.kW) / cfg.stride + 1)))

Convolution over batched CHW images, allocating initialization seeds automatically.

Shorthand for conv2d.

def NN.API.nn.batchNorm2d {n c h w : ℕ} (cfg : BatchNorm2d := { }) [NeZero n] [NeZero c] [NeZero h] [NeZero w] :

M (Sequential (Tensor.Shape.Images n c h w) (Tensor.Shape.Images n c h w))

BatchNorm2d over NCHW inputs, using NeZero to hide the positivity proofs.

PyTorch analogue: torch.nn.BatchNorm2d. See https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html.

def NN.API.nn.batchNorm {n c h w : ℕ} (cfg : BatchNorm2d := { }) [NeZero n] [NeZero c] [NeZero h] [NeZero w] :

M (Sequential (Tensor.Shape.Images n c h w) (Tensor.Shape.Images n c h w))

BatchNorm over batched CHW images, allocating initialization seeds automatically.

Shorthand for batchNorm2d.

def NN.API.nn.embedding (vocab embedDim : ℕ) (cfg : Embedding := { }) {pfx : Spec.Shape} :

M (Sequential (pfx.appendDim vocab) (pfx.appendDim embedDim))

Embedding layer for one-hot / token-distribution inputs (no bias).

Input shape: [..., vocab]
Output shape: [..., embedDim]

PyTorch analogue: conceptually nn.Embedding(vocab, embedDim) but applied to one-hot inputs.
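Why a one-hot input turns lookup into a matrix product can be seen in a small list-based sketch. Everything here is illustrative: oneHot, vecMat, and toyTable are hypothetical names, and lists of floats stand in for tensors.

```lean
-- Toy illustration: a one-hot row vector times a table selects one row.

/-- One-hot vector of length `vocab` with a 1 at index `i`. -/
def oneHot (vocab i : Nat) : List Float :=
  (List.range vocab).map fun j => if j == i then 1.0 else 0.0

/-- Vector-matrix product `v · T`, where `T` is a list of rows of length `dim`. -/
def vecMat (dim : Nat) (v : List Float) (table : List (List Float)) : List Float :=
  (v.zip table).foldl
    (fun acc p => List.zipWith (· + ·) acc (p.2.map (p.1 * ·)))
    (List.replicate dim 0.0)

/-- A 3-token vocabulary with 2-dimensional embeddings. -/
def toyTable : List (List Float) := [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

#eval vecMat 2 (oneHot 3 1) toyTable  -- the weighted sum keeps only row 1
#eval toyTable[1]!                    -- the same row, fetched by integer index
```

A soft token distribution works the same way; the product then returns a weighted average of rows instead of a single row.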

def NN.API.nn.sinusoidalPositionalEncoding {batch seqLen embedDim : ℕ} (cfg : SinusoidalPositionalEncoding := { }) :

M (Sequential (Spec.Shape.dim batch (Tensor.Shape.Mat seqLen embedDim)) (Spec.Shape.dim batch (Tensor.Shape.Mat seqLen embedDim)))

Add sinusoidal positional encodings to a batched (batch × seqLen × embedDim) tensor.

Implementation:

- precompute PE : (seqLen × embedDim) at initialization time (stored as a non-trainable buffer),
- broadcast it across the leading batch axis and add it to the input.

def NN.API.nn.rope {batch numHeads seqLen headDim : ℕ} (cfg : RoPE := { }) :

M (Sequential (Spec.Shape.dim batch (Spec.Shape.dim numHeads (Tensor.Shape.Mat seqLen headDim))) (Spec.Shape.dim batch (Spec.Shape.dim numHeads (Tensor.Shape.Mat seqLen headDim))))

Apply RoPE to a batched multi-head tensor (batch × numHeads × seqLen × headDim).

This matches the standard identity:

rope(x) = x * cos + rotatePairs(x) * sin

where cos/sin depend only on (pos, dim) and broadcast across (batch, numHeads).
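For a single pair of coordinates the identity is an ordinary 2D rotation. The sketch below assumes that rotatePairs maps a pair (a, b) to (-b, a); the pairing convention TorchLean actually uses is not stated here, and rotatePair and ropePair are hypothetical names.

```lean
-- Toy RoPE on one coordinate pair; the pairing convention is an assumption.

/-- Assumed pair rotation: (a, b) ↦ (-b, a). -/
def rotatePair (p : Float × Float) : Float × Float := (-p.2, p.1)

/-- rope(x) = x * cos + rotatePairs(x) * sin, applied to one pair. -/
def ropePair (p : Float × Float) (theta : Float) : Float × Float :=
  let r := rotatePair p
  (p.1 * Float.cos theta + r.1 * Float.sin theta,
   p.2 * Float.cos theta + r.2 * Float.sin theta)

#eval ropePair (1.0, 0.0) 0.0                 -- angle 0 leaves the pair unchanged
#eval ropePair (1.0, 0.0) 1.5707963267948966  -- about a quarter turn, up to rounding
```

Expanding the definition gives (a cos θ − b sin θ, b cos θ + a sin θ), which is why the layer needs only cos/sin tables and no trainable parameters.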

Notes:

- This layer is differentiable (gradients flow through the rotation), but it has no trainable parameters; the precomputed cos/sin tables are stored as non-trainable buffers.
- The pure spec version is in NN.Spec.Layers.PositionalEncoding (Spec.rope_apply_heads_spec).

def NN.API.nn.learnedPositionalEmbedding {batch seqLen embedDim : ℕ} (cfg : LearnedPositionalEmbedding := { }) :

M (Sequential (Spec.Shape.dim batch (Tensor.Shape.Mat seqLen embedDim)) (Spec.Shape.dim batch (Tensor.Shape.Mat seqLen embedDim)))

Add learned positional embeddings to a batched (batch × seqLen × embedDim) tensor.

PyTorch analogue: x + pos[:seqLen] where pos is a parameter table.

def NN.API.nn.layerNorm {batch seqLen embedDim : ℕ} [NeZero seqLen] [NeZero embedDim] :

M (Sequential (Spec.Shape.dim batch (Tensor.Shape.Mat seqLen embedDim)) (Spec.Shape.dim batch (Tensor.Shape.Mat seqLen embedDim)))

Layer normalization over (batch × seqLen × embedDim) tensors.

This normalizes each embedDim-vector (per batch element, per sequence position), and applies learned affine parameters gamma and beta.

PyTorch analogue: torch.nn.LayerNorm(embedDim) on a tensor shaped (batch, seqLen, embedDim).

Implementation note: TorchLean uses NeZero to ensure seqLen and embedDim are positive, avoiding degenerate shapes.

def NN.API.nn.multiheadAttention {batch n dModel : ℕ} [NeZero n] (cfg : MultiheadAttention) (mask : Option (Spec.Tensor Bool (Spec.Shape.dim n (Spec.Shape.dim n Spec.Shape.scalar))) := none) :

M (Sequential (Spec.Shape.dim batch (Tensor.Shape.Mat n dModel)) (Spec.Shape.dim batch (Tensor.Shape.Mat n dModel)))

Multi-head self-attention using NeZero to hide the nonzero sequence length proof.

If mask is provided, it is a boolean attention mask of shape (n × n) (e.g. causal masking).
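As an illustration of what an (n × n) causal mask looks like, here is a list-based sketch. It assumes that true means "position i may attend to position j"; whether TorchLean uses this polarity or the opposite one is not stated on this page, and causalMask is a hypothetical helper that builds nested lists rather than a Spec.Tensor.

```lean
-- Toy causal mask: entry (i, j) is true when j ≤ i.
-- Assumption: true marks an allowed attention edge.
def causalMask (n : Nat) : List (List Bool) :=
  (List.range n).map fun i =>
    (List.range n).map fun j => decide (j ≤ i)

#eval causalMask 3
-- [[true, false, false], [true, true, false], [true, true, true]]
```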

def NN.API.nn.transformerEncoderBlock {batch n dModel : ℕ} [NeZero n] [NeZero dModel] (cfg : blocks.TransformerEncoderBlock) (mask : Option (Spec.Tensor Bool (Spec.Shape.dim n (Spec.Shape.dim n Spec.Shape.scalar))) := none) :

M (Sequential (Spec.Shape.dim batch (Tensor.Shape.Mat n dModel)) (Spec.Shape.dim batch (Tensor.Shape.Mat n dModel)))

Transformer encoder block.

This is transformerEncoderBlockWithMask; pass mask := ... to enable causal masking (or other attention masks).

def NN.API.nn.transformerEncoderStack {batch n dModel : ℕ} [NeZero n] [NeZero dModel] (cfg : blocks.TransformerEncoderStack) (mask : Option (Spec.Tensor Bool (Spec.Shape.dim n (Spec.Shape.dim n Spec.Shape.scalar))) := none) :

M (Sequential (Spec.Shape.dim batch (Tensor.Shape.Mat n dModel)) (Spec.Shape.dim batch (Tensor.Shape.Mat n dModel)))

Stack cfg.layers copies of blocks.transformerEncoderBlock.

This is transformerEncoderStackWithMask; pass mask := ... to enable causal masking (or other attention masks).

def NN.API.nn.resnetBasicBlock {n inC h w : ℕ} (cfg : blocks.ResNetBasicBlock) [NeZero n] [NeZero inC] [NeZero h] [NeZero w] [NeZero cfg.outC] :

M (Sequential (Tensor.Shape.Images n inC h w) (Tensor.Shape.Images n cfg.outC (if cfg.downsample = true then blocks.down2 h else h) (if cfg.downsample = true then blocks.down2 w else w)))

ResNet-18 style BasicBlock over batched image tensors (N×C×H×W).

def NN.API.nn.dropout {s : Spec.Shape} (p : Float) :

M (Sequential s s)

Dropout layer (active in train mode, identity in eval mode).

PyTorch analogue: torch.nn.Dropout.

def NN.API.nn.runGlobal {α : Type} (x : M α) :

IO α

Run a seeded builder using the global seed stream set by nn.manualSeed. The result type must live in Type.

Note: model values like nn.Sequential live in Type 2, so they cannot be returned from IO. For models, use nn.run with an explicit base seed (obtained from nn.nextSeed).

def NN.API.nn.nextSeed :

Draw a fresh base seed from the global seed stream set by nn.manualSeed.

def NN.API.nn.nextSeeds (n : ℕ) :

Draw n fresh base seeds from the global seed stream.

Naming Convenience

nn.run / nn.nextSeed are the core primitives, but in user code it is often clearer to read:

- “build a model from this seed” (nn.run)
- “draw a fresh init seed” (nn.freshSeed)
- “build a model using the next global init seed” (nn.withModel)

def NN.API.nn.freshSeed :

Draw a fresh base seed from the global seed stream.

def NN.API.nn.withModel {σ τ : Spec.Shape} {β : Type} (mk : M (Sequential σ τ)) (k : Sequential σ τ → IO β) :

IO β

Build a model using the next global seed, then run a continuation.

nn.Sequential lives in Type 2, so executable code passes the model to a continuation rather than returning it directly from IO.
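The universe restriction and the continuation workaround can be reproduced with a toy type. This is a sketch under assumptions: ToyModel, toyBuild, and toyWithModel are hypothetical, and the toy continuation takes an explicit seed instead of drawing one from a global stream.

```lean
-- Toy stand-in for a model type that lives in a higher universe.
structure ToyModel : Type 2 where
  Carrier : Type 1   -- a field of type `Type 1` forces the structure into `Type 2`
  seed : Nat

/-- Pure construction from a seed. -/
def toyBuild (seed : Nat) : ToyModel := { Carrier := Type, seed := seed }

-- `IO : Type → Type`, so the following is rejected by the universe checker:
-- def bad : IO ToyModel := pure (toyBuild 0)

/-- Continuation style: the model never appears as an `IO` result. -/
def toyWithModel {β : Type} (seed : Nat) (k : ToyModel → IO β) : IO β :=
  k (toyBuild seed)

#eval toyWithModel 7 fun m => IO.println s!"built with seed {m.seed}"
```

Only the continuation's result type β has to fit in Type, so training loops, evaluation, and logging can all run inside k while the model itself stays a pure value.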