TorchLean API

NN.Runtime.Autograd.Train.IoLoader.Npy

NPY loaders for typed training tensors

This module implements the small, explicit .npy subset that TorchLean's native training examples need:

NumPy format versions 1 and 2;
little-endian float32 and float64 payloads (<f4, <f8);
C-order arrays directly, and Fortran-order arrays converted to C-order at load time;
typed 1D and 2D tensor views for vectors and matrices.

The loader deliberately stays narrow. It is a runtime bridge for trusted experiment artifacts, not a general NumPy parser and not part of the formal tensor semantics. Keeping it here, under Runtime.Autograd.Train, makes that boundary visible while still giving examples a convenient path from Python-produced arrays into TorchLean tensors.

Reference:

NumPy .npy format documentation: https://numpy.org/doc/stable/reference/generated/numpy.lib.format.html
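
For orientation, the fixed-size preamble described in that format document can be checked with a few byte comparisons. The sketch below is illustrative only: `npyMagic` and `npyPreamble?` are hypothetical names, not declarations from this module, and the code assumes nothing beyond Lean core (hence `Nat` rather than `ℕ`).

```lean
/-- The 6-byte magic string that opens every `.npy` file: 0x93 followed by ASCII "NUMPY". -/
def npyMagic : List UInt8 := [0x93, 0x4E, 0x55, 0x4D, 0x50, 0x59]

/-- Hypothetical helper (not TorchLean's parser): returns the major version and the
byte offset where the header dictionary starts, or `none` for unsupported input. -/
def npyPreamble? (bs : ByteArray) : Option (Nat × Nat) :=
  if bs.size < 8 then
    none
  else if (List.range 6).map (fun k => bs.get! k) != npyMagic then
    none
  else
    match (bs.get! 6).toNat with
    | 1 => some (1, 10)  -- 6 magic + 2 version bytes + 2-byte little-endian header length
    | 2 => some (2, 12)  -- 6 magic + 2 version bytes + 4-byte little-endian header length
    | _ => none
```

The only layout difference between the two supported versions is the width of the header-length field, which is why the sketch returns a different dictionary offset for each.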

structure Runtime.Autograd.Train.NpyData :

In-memory representation of a loaded .npy file in TorchLean's supported subset.

values is always flattened in C-order. If the source file declares fortran_order = True, we reorder the payload during parsing and store fortran := false in the returned value so downstream tensor loaders never have to reason about storage order.

dtype : String
Dtype string as stored in the header, for example "<f4" or "<f8".
shape : List ℕ
Logical array shape as stored in the header.
fortran : Bool
Whether the returned flat payload is still Fortran-ordered. This loader always returns false, because Fortran-order payloads are reordered to C-order during parsing.
values : Array Float
Flattened numeric payload, converted to Lean Float values.

def Runtime.Autograd.Train.IoLoader.Internal.prefixProducts (shape : List ℕ) :

Prefix products of a shape list.

For a shape [d₀, d₁, d₂], this returns [1, d₀, d₀*d₁], which are exactly the Fortran-order (column-major) strides, measured in elements rather than bytes. We use these strides to convert Fortran storage into TorchLean's ordinary C-order flattening convention.
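
A minimal sketch of such a running product, assuming only Lean core (`Nat` instead of `ℕ`). The name `prefixProductsSketch` is hypothetical. The accumulator-style `go` helper mirrors the documented signature, but the body is an assumption, not the module's source.

```lean
/-- Hypothetical re-implementation: running products of the dimensions, starting at 1. -/
def prefixProductsSketch (shape : List Nat) : List Nat :=
  go 1 shape
where
  go (acc : Nat) : List Nat → List Nat
    | [] => []
    | d :: ds => acc :: go (acc * d) ds

#eval prefixProductsSketch [2, 3, 4]  -- [1, 2, 6]
```

The last dimension never contributes to a stride, so the result has the same length as the shape.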

def Runtime.Autograd.Train.IoLoader.Internal.prefixProducts.go (acc : ℕ) :

List ℕ → List ℕ

def Runtime.Autograd.Train.IoLoader.Internal.idxFortranOfCIdx (shape : List ℕ) (idxC : ℕ) :

Convert a linear C-order index to the corresponding linear Fortran-order index.

Both indices describe the same multi-dimensional coordinate. The difference is only how the coordinate is flattened into a one-dimensional payload.
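
The conversion can be sketched as follows. This version is an assumption-laden illustration, not the module's code: it uses Horner's rule instead of an explicit stride list, and `idxFortranOfCIdxSketch` is a hypothetical name. It peels coordinates off the C-order index from the last dimension inward and folds each one into the Fortran-order index.

```lean
/-- Hypothetical sketch: map a C-order linear index to the Fortran-order linear index
of the same coordinate. -/
def idxFortranOfCIdxSketch (shape : List Nat) (idxC : Nat) : Nat :=
  go shape.reverse idxC 0
where
  -- `idx % d` is the coordinate along the current (innermost remaining) dimension;
  -- `idx / d` is the C-order index over the remaining outer dimensions.
  go : List Nat → Nat → Nat → Nat
    | [], _, acc => acc
    | d :: ds, idx, acc => go ds (idx / d) (idx % d + d * acc)

-- For a 2 × 3 array, coordinate (i, j) has C index 3*i + j and Fortran index i + 2*j.
#eval (List.range 6).map (idxFortranOfCIdxSketch [2, 3])  -- [0, 2, 4, 1, 3, 5]
```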

def Runtime.Autograd.Train.IoLoader.Internal.idxFortranOfCIdx.go (dims strides : List ℕ) (idx acc : ℕ) :

def Runtime.Autograd.Train.IoLoader.Internal.reorderFortranToC (shape : List ℕ) (raw : Array Float) :

Reorder a Fortran-ordered flat array into C-order.

The function is total and defensive: if the payload is shorter than the shape requires, so that a computed source index falls outside raw, the missing element is filled with 0.0. The parser checks payload length before calling this function, so that fallback should not happen for accepted files.
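
To make the reordering and its defensive fallback concrete, here is a sketch restricted to the 2D case. `reorderFortranToC2D` is a hypothetical name and the body is an assumption. The documented function takes an arbitrary shape list.

```lean
/-- Hypothetical 2D-only sketch: element (i, j) of a `rows × cols` matrix sits at
`i + j * rows` in Fortran order and at `i * cols + j` in C order. -/
def reorderFortranToC2D (rows cols : Nat) (raw : Array Float) : Array Float :=
  (Array.range (rows * cols)).map fun idxC =>
    let i := idxC / cols
    let j := idxC % cols
    -- `getD` supplies 0.0 when the source index is out of bounds (malformed payload).
    raw.getD (i + j * rows) 0.0

-- [[1, 2, 3], [4, 5, 6]] stored column by column:
#eval reorderFortranToC2D 2 3 #[1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
```

Evaluating the example yields the row-major sequence 1, 2, 3, 4, 5, 6.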

def Runtime.Autograd.Train.IoLoader.Internal.byteAt? (bs : ByteArray) (i : ℕ) :

Bounds-checked ByteArray indexing at byte offset i.

def Runtime.Autograd.Train.IoLoader.Internal.readUInt16LE (bs : ByteArray) (i : ℕ) :

Read a little-endian UInt16 at byte offset i, returning none on out-of-bounds input.
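
A sketch of a bounds-checked little-endian read, assuming only Lean core. `readUInt16LESketch` is a hypothetical name, and the `Option UInt16` return type is an assumption inferred from the description above.

```lean
/-- Hypothetical sketch: combine two bytes, least significant first. -/
def readUInt16LESketch (bs : ByteArray) (i : Nat) : Option UInt16 :=
  if i + 1 < bs.size then
    let lo := (bs.get! i).toUInt16
    let hi := (bs.get! (i + 1)).toUInt16
    some (lo ||| (hi <<< 8))
  else
    none  -- at least one of the two bytes is out of bounds

#eval readUInt16LESketch (ByteArray.mk #[0x76, 0x00]) 0  -- some 118
#eval readUInt16LESketch (ByteArray.mk #[0x76]) 0        -- none
```

The 32-bit and 64-bit readers follow the same pattern with four and eight bytes respectively.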

def Runtime.Autograd.Train.IoLoader.Internal.readUInt32LE (bs : ByteArray) (i : ℕ) :

Read a little-endian UInt32 at byte offset i, returning none on out-of-bounds input.

def Runtime.Autograd.Train.IoLoader.Internal.readUInt64LE (bs : ByteArray) (i : ℕ) :

Read a little-endian UInt64 at byte offset i, returning none on out-of-bounds input.

def Runtime.Autograd.Train.IoLoader.Internal.parseShapeValue (s : String) :

Option (List ℕ)

Parse a shape tuple like (3, 4) or (3,) from a NumPy header fragment.
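
One way to handle both tuple spellings, including the trailing comma of a 1D shape, is a single pass over the characters. The sketch below is an assumption, not the module's parser, which (judging by its `parseAll` helper) works on a list of string fragments. `parseShapeSketch` is a hypothetical name, and the scan is deliberately lenient about spaces.

```lean
/-- Hypothetical sketch: read decimal dimensions up to the closing parenthesis. -/
def parseShapeSketch (s : String) : Option (List Nat) :=
  go s.toList none []
where
  -- `cur` is the number currently being read, if any;
  -- `acc` holds the finished dimensions in reverse order.
  go : List Char → Option Nat → List Nat → Option (List Nat)
    | [], _, _ => none  -- input ended before a closing parenthesis
    | c :: cs, cur, acc =>
      let acc' := match cur with
        | some n => n :: acc
        | none => acc
      if c.isDigit then go cs (some (cur.getD 0 * 10 + (c.toNat - 48))) acc
      else if c == ',' then go cs none acc'
      else if c == ')' then some acc'.reverse
      else if c == ' ' || c == '(' then go cs cur acc
      else none  -- any other character is rejected

#eval parseShapeSketch "(3, 4)"  -- some [3, 4]
#eval parseShapeSketch "(3,)"    -- some [3]
```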

def Runtime.Autograd.Train.IoLoader.Internal.parseShapeValue.parseAll (xs : List String) (acc : List ℕ) :

Option (List ℕ)

def Runtime.Autograd.Train.IoLoader.Internal.parseHeader (tag hdr : String) :

Result (String × Bool × List ℕ)

Parse the NumPy header dictionary.

We only need three standard fields: descr, fortran_order, and shape. The header format is a Python-literal dictionary padded to an alignment boundary; this parser is intentionally field-oriented rather than a full Python parser.

def Runtime.Autograd.Train.parseNpy (tag : String) (bs : ByteArray) :

Parse the bytes of a .npy file into NpyData.

The parser rejects unsupported dtypes, malformed headers, and truncated payloads. That makes loader failures explicit at the trust boundary instead of silently producing tensors with the wrong shape or partial data.
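
The dtype and truncation checks amount to comparing the payload size against the element count times the item size. A minimal sketch, assuming only Lean core: `itemSize?` and `payloadOk` are hypothetical names, and whether the real parser demands an exact match or merely enough bytes is not stated here, so the sketch only checks for a lower bound.

```lean
/-- Hypothetical sketch: bytes per element for the two supported dtypes. -/
def itemSize? : String → Option Nat
  | "<f4" => some 4
  | "<f8" => some 8
  | _ => none  -- every other dtype is unsupported

/-- Hypothetical sketch: does the payload hold at least `product shape` elements? -/
def payloadOk (dtype : String) (shape : List Nat) (payloadBytes : Nat) : Bool :=
  match itemSize? dtype with
  | some sz => decide (shape.foldl (· * ·) 1 * sz ≤ payloadBytes)
  | none => false

#eval payloadOk "<f4" [3, 4] 48  -- true: 12 elements * 4 bytes
#eval payloadOk "<f8" [3, 4] 48  -- false: 96 bytes are needed
#eval payloadOk "<i8" [3, 4] 96  -- false: unsupported dtype
```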

def Runtime.Autograd.Train.parseNpyPrefixDim0 (tag : String) (expectedShape : List ℕ) (bs : ByteArray) :

Parse only the requested leading rows of a C-order .npy array.

This supports the common tutorial workflow where a large exported tensor is kept on disk but a run uses only the first n rows. The rank and trailing dimensions must match exactly; only dim 0 may be larger than requested.

The implementation intentionally repeats the small NPY header checks instead of calling parseNpy and slicing afterwards. parseNpy decodes the entire data payload; that is fine for small examples but wasteful when a command asks for a quick prefix of a real image or sequence dataset. Here we read the header, validate that the file layout is compatible with the requested type-level shape, and then decode exactly expectedShape.product elements.

Why C-order only? In row-major NPY files, the first n rows are physically contiguous, so the prefix is exactly the first n * trailingSize elements. In Fortran-order files the same logical prefix is interleaved across the payload, so a cheap prefix decode would be wrong. Rather than silently returning bad rows, we reject Fortran-order prefix loading and ask callers to convert the array to C-order first.
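
The compatibility rule and the resulting element count can be sketched as follows. `prefixElemCount?` is a hypothetical name, and the body is an assumption based on the description above rather than the module's source.

```lean
/-- Hypothetical sketch: number of leading elements to decode, or `none` when the
file shape is incompatible with the requested prefix shape. -/
def prefixElemCount? (fileShape expectedShape : List Nat) : Option Nat :=
  match fileShape, expectedShape with
  | f :: fTail, e :: eTail =>
    -- Trailing dimensions must match exactly; only dim 0 may be larger in the file.
    if fTail = eTail ∧ e ≤ f then some (expectedShape.foldl (· * ·) 1) else none
  | _, _ => none

#eval prefixElemCount? [60000, 28, 28] [100, 28, 28]  -- some 78400
#eval prefixElemCount? [60000, 28, 28] [100, 784]     -- none: rank mismatch
```

In a C-order file those 78400 elements are the first 78400 values of the payload, which is what makes the prefix decode cheap.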

def Runtime.Autograd.Train.readNpy (path : System.FilePath) :

IO (Result NpyData)

Read a .npy file from disk and parse it as NpyData.

def Runtime.Autograd.Train.readNpyPrefixDim0 (path : System.FilePath) (expectedShape : List ℕ) :

IO (Result NpyData)

Read a .npy file but decode only the requested leading rows.

This is the file-system wrapper around parseNpyPrefixDim0. It still reads the file bytes into memory, but it avoids building a full Array Float for rows the run did not ask to use. The public API.Data layer uses this when a dataset source says "load the first n examples" from a larger exported NPY tensor.

def Runtime.Autograd.Train.readNpyVector (path : System.FilePath) (n : ℕ) :

IO (Result (Spec.Tensor Float (Spec.Shape.dim n Spec.Shape.scalar)))

Read a 1D .npy file as a typed TorchLean vector tensor.

The shape check is part of the loader contract: files with the wrong logical size are rejected instead of being reshaped implicitly.

def Runtime.Autograd.Train.readNpyMatrix (path : System.FilePath) (m n : ℕ) :

IO (Result (Spec.Tensor Float (Spec.Shape.dim m (Spec.Shape.dim n Spec.Shape.scalar))))

Read a 2D .npy file as a typed TorchLean matrix tensor.

The returned matrix uses the same row-major indexing convention as the rest of the runtime tensor helpers.
