TorchLean API

NN.Runtime.Autograd.Train.IoLoader.Npy

NPY loaders for typed training tensors

This module implements the small, explicit .npy subset that TorchLean's native training examples need.

The loader deliberately stays narrow. It is a runtime bridge for trusted experiment artifacts, not a general NumPy parser and not part of the formal tensor semantics. Keeping it here, under Runtime.Autograd.Train, makes that boundary visible while still giving examples a convenient path from Python-produced arrays into TorchLean tensors.

Reference:

In-memory representation of a loaded .npy file in TorchLean's supported subset.

values is always flattened in C-order. If the source file declares fortran_order = True, we reorder the payload during parsing and store fortran := false in the returned value so downstream tensor loaders never have to reason about storage order.

  • dtype : String

    Dtype string as stored in the header, for example "<f4" or "<f8".

  • shape : List

    Logical array shape as stored in the header.

  • fortran : Bool

    Whether the returned flat payload is still Fortran-ordered. This loader always returns false.

  • values : Array Float

    Flattened numeric payload, converted to Lean Float values.

Instances For
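As a reading aid, the structure described above can be written out as a plain Lean structure. The field names and docstrings come from this page; `List Nat` as the element type of shape is an assumption (the header stores non-negative dimensions):

```lean
-- Sketch of the structure documented above. Field names are from this page;
-- `List Nat` for the shape element type is an assumption.
structure NpyData where
  dtype   : String      -- dtype string from the header, e.g. "<f4" or "<f8"
  shape   : List Nat    -- logical array shape from the header
  fortran : Bool        -- always false in parser results (payload is C-order)
  values  : Array Float -- flattened numeric payload
```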

Prefix products of a shape list.

For a shape [d₀, d₁, d₂], this returns [1, d₀, d₀*d₁], which are exactly the Fortran-order strides. We use these strides to convert Fortran storage into TorchLean's ordinary C-order flattening convention.

Instances For
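A standalone sketch of the computation (the name is hypothetical, not TorchLean's): fold over the shape, emitting the running product before multiplying in each dimension.

```lean
-- Hypothetical standalone sketch: prefix products of a shape list.
-- For [d₀, d₁, d₂] this yields [1, d₀, d₀*d₁], the Fortran-order strides.
def prefixProducts (shape : List Nat) : List Nat :=
  let step : List Nat × Nat → Nat → List Nat × Nat :=
    fun s d => (s.1 ++ [s.2], s.2 * d)
  (shape.foldl step ([], 1)).1

#eval prefixProducts [2, 3, 4]  -- [1, 2, 6]
```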

Convert a linear C-order index to the corresponding linear Fortran-order index.

Both indices describe the same multi-dimensional coordinate. The difference is only how the coordinate is flattened into a one-dimensional payload.

Instances For
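The conversion can be sketched in two steps (names are hypothetical): recover the multi-dimensional coordinate from the C-order index, then re-flatten it with Fortran-order strides, i.e. the prefix products of the shape.

```lean
-- Hypothetical sketch: decompose a C-order linear index into a coordinate,
-- then re-flatten it with Fortran-order strides (prefix products).
def cToFortranIndex (shape : List Nat) (i : Nat) : Nat :=
  -- C-order coordinates, peeling from the slowest-varying dimension.
  let rec coords : List Nat → Nat → List Nat
    | [], _ => []
    | _ :: ds, n =>
      let inner := ds.foldl (· * ·) 1  -- product of the trailing dims
      (n / inner) :: coords ds (n % inner)
  -- Fortran strides are the prefix products [1, d₀, d₀*d₁, …].
  let strides := (shape.foldl (fun s d => (s.1 ++ [s.2], s.2 * d))
                   (([] : List Nat), 1)).1
  ((coords shape i).zip strides).foldl (fun acc p => acc + p.1 * p.2) 0

#eval cToFortranIndex [2, 3] 1  -- coordinate (0, 1) flattens to 0 + 1*2 = 2
```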

Reorder a Fortran-ordered flat array into C-order.

The function is total and defensive: if the file payload is malformed and an index is missing, the missing element is filled with 0.0. The parser checks payload length before calling this function, so that fallback should not happen for accepted files.

Instances For
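A minimal sketch of the defensive reorder, with the index conversion passed in as a parameter (names are hypothetical, not TorchLean's): fill each C-order slot from its Fortran-order position, using the 0.0 fallback described above.

```lean
-- Hypothetical sketch: build the C-order array by reading each element at
-- its Fortran-order position; `getD` supplies the 0.0 out-of-bounds fallback.
def fortranToC (shape : List Nat) (src : Array Float)
    (cToF : Nat → Nat) : Array Float :=
  let n := shape.foldl (· * ·) 1
  Array.ofFn (n := n) fun i => src.getD (cToF i.val) 0.0
```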

Read a little-endian UInt16 at byte offset i, returning none on out-of-bounds input.

Instances For

Read a little-endian UInt32 at byte offset i, returning none on out-of-bounds input.

Instances For

Read a little-endian UInt64 at byte offset i, returning none on out-of-bounds input.

Instances For
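The three readers follow the same shape; a hypothetical standalone sketch of the 16-bit case: check bounds once, then assemble the value from the low and high bytes.

```lean
-- Hypothetical sketch: bounds-checked little-endian UInt16 read.
def readUInt16LE (bs : ByteArray) (i : Nat) : Option UInt16 :=
  if i + 1 < bs.size then
    let lo := (bs.get! i).toUInt16
    let hi := (bs.get! (i + 1)).toUInt16
    some (lo ||| (hi <<< 8))
  else
    none

#eval readUInt16LE (ByteArray.mk #[0x34, 0x12]) 0  -- some 4660 (= 0x1234)
```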

Parse a shape tuple like (3, 4) or (3,) from a NumPy header fragment.

Instances For
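A hypothetical sketch of one way to do this: strip the parentheses, split on commas, and parse each non-empty piece as a Nat, so the trailing comma in "(3,)" falls out naturally.

```lean
-- Hypothetical sketch: parse "(3, 4)" or "(3,)" into a list of dimensions.
def parseShapeTuple (s : String) : Option (List Nat) :=
  let inner := (s.trim.dropWhile (· == '(')).takeWhile (· != ')')
  let parts := ((inner.splitOn ",").map String.trim).filter (· != "")
  parts.foldr (fun p acc? => do
    let acc ← acc?
    let n ← p.toNat?
    pure (n :: acc)) (some [])

#eval parseShapeTuple "(3, 4)"  -- some [3, 4]
#eval parseShapeTuple "(3,)"    -- some [3]
```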

Parse the NumPy header dictionary.

We only need three standard fields: descr, fortran_order, and shape. The header format is a Python-literal dictionary padded to an alignment boundary; this parser is intentionally field-oriented rather than a full Python parser.

Instances For
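"Field-oriented" can be illustrated with a hypothetical helper: locate 'key': in the header text and take the run up to the next comma or closing brace, instead of parsing Python syntax. (A tuple-valued field like shape needs its own parenthesis-aware scan, since its value contains commas.)

```lean
-- Hypothetical sketch of field-oriented extraction from the header text.
def headerField (header key : String) : Option String :=
  match header.splitOn ("'" ++ key ++ "':") with
  | _ :: rest :: _ =>
    some ((rest.takeWhile (fun c => c != ',' && c != '}')).trim)
  | _ => none

#eval headerField "{'descr': '<f4', 'fortran_order': False, }" "descr"
-- some "'<f4'"
```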

Parse the bytes of a .npy file into NpyData.

The parser rejects unsupported dtypes, malformed headers, and truncated payloads. That makes loader failures explicit at the trust boundary instead of silently producing tensors with the wrong shape or partial data.

Instances For
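For orientation, the byte layout such a parser walks comes from the NumPy .npy format spec; a hypothetical magic-number check shows the first step.

```lean
-- The v1.0 layout, per the NumPy .npy format spec:
--   bytes 0–5   magic "\x93NUMPY"
--   bytes 6–7   major, minor version
--   bytes 8–9   header length, little-endian UInt16 (UInt32 from v2.0)
--   then the padded header dict, then the raw payload.
-- Hypothetical magic check:
def hasNpyMagic (bs : ByteArray) : Bool :=
  let magic : List UInt8 := [0x93, 0x4E, 0x55, 0x4D, 0x50, 0x59]
  decide (magic.length ≤ bs.size) &&
    (List.range magic.length).all (fun i => bs.get! i == magic.get! i)
```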

Parse only the requested leading rows of a C-order .npy array.

This supports the common tutorial workflow where a large exported tensor is kept on disk but a run uses only the first n rows. The rank and trailing dimensions must match exactly; only dim 0 may be larger than requested.

The implementation intentionally repeats the small NPY header checks instead of calling parseNpy and slicing afterwards. parseNpy decodes the entire data payload; that is fine for small examples but wasteful when a command asks for a quick prefix of a real image or sequence dataset. Here we read the header, validate that the file layout is compatible with the requested type-level shape, and then decode exactly expectedShape.product elements.

Why C-order only? In row-major NPY files, the first n rows are physically contiguous, so the prefix is exactly the first n * trailingSize elements. In Fortran-order files the same logical prefix is interleaved across the payload, so a cheap prefix decode would be wrong. Rather than silently returning bad rows, we reject Fortran-order prefix loading and ask callers to convert the array to C-order first.

Instances For
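The bookkeeping behind "first n rows are contiguous" is small enough to sketch (names are hypothetical): in a C-order file of shape d₀ :: rest, the prefix occupies exactly n times the product of the trailing dimensions, provided n ≤ d₀.

```lean
-- Hypothetical sketch: how many leading payload elements cover n rows.
def prefixElemCount (fileShape : List Nat) (n : Nat) : Option Nat :=
  match fileShape with
  | [] => none  -- a 0-d scalar has no dim 0 to slice
  | d0 :: rest =>
    if n ≤ d0 then some (n * rest.foldl (· * ·) 1) else none

#eval prefixElemCount [1000, 28, 28] 32  -- some 25088 (32 rows of 28*28)
```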

Read a .npy file from disk and parse it as NpyData.

Instances For

Read a .npy file but decode only the requested leading rows.

This is the file-system wrapper around parseNpyPrefixDim0. It still reads the file bytes into memory, but it avoids building a full Array Float for rows the run did not ask to use. The public API.Data layer uses this when a dataset source says "load the first n examples" from a larger exported NPY tensor.

Instances For

Read a 1D .npy file as a typed TorchLean vector tensor.

The shape check is part of the loader contract: files with the wrong logical size are rejected instead of being reshaped implicitly.

Instances For

Read a 2D .npy file as a typed TorchLean matrix tensor.

The returned matrix uses the same row-major indexing convention as the rest of the runtime tensor helpers.

Instances For
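The row-major convention referred to above is the usual one; a tiny sketch (the helper name is hypothetical):

```lean
-- Row-major indexing: element (r, c) of a matrix with `cols` columns sits
-- at flat index r * cols + c in the loaded payload.
def flatIndex (cols r c : Nat) : Nat := r * cols + c

#eval flatIndex 4 2 1  -- 9: row 2, column 1 of a ·×4 matrix
```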