TorchLean API

NN.Examples.Models.Common.RealData

Shared Real-Data Helpers for Model Examples

The model examples should exercise real data paths. The pieces they share are kept here.

The data files are prepared by scripts/datasets/download_example_data.py; examples report missing inputs explicitly instead of silently falling back to synthetic tensors.

Number of channels in the prepared CIFAR-10 image tensors.

Height of the prepared CIFAR-10 image tensors.

Width of the prepared CIFAR-10 image tensors.

Number of CIFAR-10 classes, hence the width of one-hot targets.

Default row budget for CIFAR-10 model-zoo commands.

Number of channels in converted ImageNet-style image tensors.

Height of converted ImageNet-style image tensors.

Width of converted ImageNet-style image tensors.

Number of ImageNet-style classes expected by the converted label path.

Default row budget for ImageNet64 model-zoo runs.

@[reducible, inline]

Shape of one CIFAR-10 image after conversion to CHW layout.

@[reducible, inline]

One-hot CIFAR-10 target shape.
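As a rough illustration of how the dimension constants and the two shape abbreviations might fit together, the following Lean sketch uses hypothetical names and the standard CIFAR-10 dimensions (3 channels, 32 × 32 pixels, 10 classes). The names, the use of `List Nat` to model a shape, and the concrete values are all assumptions; the actual definitions in this module are not shown on this page.

```lean
namespace RealDataSketch

-- Hypothetical stand-ins for the module's constants (standard CIFAR-10 values).
def cifarChannels : Nat := 3
def cifarHeight : Nat := 32
def cifarWidth : Nat := 32
def cifarClasses : Nat := 10

-- CHW image shape and one-hot target shape, modelled here as plain lists.
abbrev cifarImageShape : List Nat := [cifarChannels, cifarHeight, cifarWidth]
abbrev cifarTargetShape : List Nat := [cifarClasses]

-- Number of scalars in a tensor of the given shape.
def numel (shape : List Nat) : Nat := shape.foldl (· * ·) 1

-- One CHW image holds 3 * 32 * 32 = 3072 scalars.
example : numel cifarImageShape = 3072 := by decide

end RealDataSketch
```

Deriving the shapes from the named constants, rather than repeating literals, means a change to one dimension propagates to every example that uses the shape.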

Take the top-left h × w view of a CIFAR image batch.

Crop a CIFAR minibatch while leaving the one-hot class labels unchanged.
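The two crop helpers can be pictured with a small Lean sketch over nested lists. This is only an illustration of the top-left crop idea: the real helpers presumably operate on typed tensors, and every name below (`cropPlane`, `cropBatch`, `cropLabeled`) is hypothetical.

```lean
namespace CropSketch

-- Keep the first `h` rows and the first `w` columns of one image plane.
def cropPlane {α : Type} (h w : Nat) (plane : List (List α)) : List (List α) :=
  (plane.take h).map fun row => row.take w

-- Apply the crop to every channel of every image in an NCHW batch.
def cropBatch {α : Type} (h w : Nat)
    (batch : List (List (List (List α)))) : List (List (List (List α))) :=
  batch.map fun image => image.map (cropPlane h w)

-- Crop the images of an (images, labels) pair; the labels pass through untouched.
def cropLabeled {α β : Type} (h w : Nat)
    (sample : List (List (List (List α))) × β) : List (List (List (List α))) × β :=
  (cropBatch h w sample.1, sample.2)

-- A 3 × 3 plane cropped to its top-left 2 × 2 corner.
example : cropPlane 2 2 [[1, 2, 3], [4, 5, 6], [7, 8, 9]] = [[1, 2], [4, 5]] := rfl

end CropSketch
```

Because the crop only touches spatial dimensions, the label component needs no change, which is why the minibatch variant can reuse the image-only crop directly.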

@[reducible, inline]

ImageNet-style converted image shape used by the higher-resolution diffusion example.

@[reducible, inline]

One-hot target shape for ImageNet-style folders.

The diffusion example ignores labels, but reusing Data.LabeledSource keeps the data path identical to the supervised examples and lets class-directory conversion catch malformed labels early.

Error message shown when a CIFAR-backed example cannot find the prepared arrays.

Error message shown when an ImageNet64-backed example cannot find the prepared arrays.

Error message shown when a text-model example cannot find a corpus.

Error message shown when the Auto MPG CSV is missing.

Error message shown when the household-power forecasting dataset is missing.

Default local path for the Tiny Shakespeare corpus.

Default local path for the TinyStories validation split.

Data-preparation hint for commands that only need Tiny Shakespeare.

Data-preparation hint for commands that accept both Tiny Shakespeare and TinyStories.

Parse the shared flags for an ImageNet-style 64x64 NPY dataset.

The expected input is produced by scripts/datasets/torchlean_data_convert.py image-folder; that converter handles JPEG/PNG decoding, RGB conversion, resizing, class-directory labels, and the final NCHW layout. Lean then reads only the simple .npy tensors.

@[reducible, inline]

Parsed CIFAR dataset and fixed-sample training flags for runnable model examples.

@[reducible, inline]

Parsed CIFAR dataset and optimizer/training flags for classifier examples.

Parse the standard CIFAR plus fixed-step training flags and reject unused arguments.

Generative examples use the same prepared CIFAR arrays and the same loss-curve logging contract; only the model and target construction differ.

def NN.Examples.Models.RealData.CifarModelTrainFlags.parse (exeName : String) (args : List String) (defaultLogPath : System.FilePath) (defaultSteps : ℕ := 1) (defaultLr : Float := 1e-3) :

Parse the standard CIFAR plus optimizer/training flags.

Vision examples share the same CIFAR data boundary and optimizer controls; architecture files only need to provide the model constructor and logging title. Any remaining arguments are preserved so the caller can forward runtime flags such as --cpu, --cuda, or --backend compiled to the public Trainer.RunConfig parser.
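The "preserve the remaining arguments" behaviour can be sketched in plain Lean as follows. This is not the module's parser: the `TrainFlags` structure, the `--steps` and `--log` flag names, and the `Except String` result type are assumptions chosen for illustration.

```lean
namespace FlagSketch

-- Hypothetical subset of the parsed training flags.
structure TrainFlags where
  steps : Nat
  logPath : System.FilePath

-- Consume the flags this sketch recognises and collect everything else,
-- in order, so the caller can forward it to a later parser.
def parseLoop (flags : TrainFlags) (extra : Array String) :
    List String → Except String (TrainFlags × List String)
  | [] => pure (flags, extra.toList)
  | "--steps" :: v :: rest =>
    match v.toNat? with
    | some n => parseLoop { flags with steps := n } extra rest
    | none => throw s!"--steps expects a natural number, got '{v}'"
  | "--log" :: v :: rest => parseLoop { flags with logPath := ⟨v⟩ } extra rest
  | arg :: rest => parseLoop flags (extra.push arg) rest

-- Entry point: start from the caller's defaults.
def parse (args : List String) (defaultLogPath : System.FilePath)
    (defaultSteps : Nat := 1) : Except String (TrainFlags × List String) :=
  parseLoop { steps := defaultSteps, logPath := defaultLogPath } #[] args

end FlagSketch
```

With the arguments `["--steps", "5", "--cuda"]`, this sketch sets `steps` to 5 and returns `["--cuda"]` as the remainder, which a caller could then hand to a runtime-flag parser.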

Common TrainLog notes for CIFAR-backed examples.

Parse the shared flags for household-power forecasting windows.

Forecasting commands share --data-dir, --x, --y, --windows, --report-offset, and --seed.

@[reducible, inline]

Parsed household-power forecasting data plus optimizer/training flags.

def NN.Examples.Models.RealData.HouseholdPowerModelTrainFlags.parse (exeName : String) (args : List String) (defaultLogPath : System.FilePath) (defaultSteps : ℕ := 100) (defaultLr : Float := 1e-2) (defaultWindows : ℕ := 512) (defaultReportOffset : ℕ := 96) :

Parse the standard household-power forecasting flags plus optimizer/training flags.

The forecasting command still owns the model and reporting logic, but the shared data/runtime flag surface lives here with the other real-data code.

@[reducible, inline]
abbrev NN.Examples.Models.RealData.requireSupervisedNpyFiles (exeName xLabel : String) (xPath : System.FilePath) (yLabel : String) (yPath : System.FilePath) (hint : String) :

Require that a paired supervised .npy dataset exists before training starts.
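A check of this kind might look like the Lean sketch below, which fails with an explicit message instead of falling back to synthetic data. The helper names, the `IO Unit` result type, and the message wording are assumptions; only `System.FilePath.pathExists` and `IO.userError` come from the Lean core library.

```lean
namespace RequireSketch

-- Fail with an explicit message when `path` does not exist.
def requireFile (exeName label : String) (path : System.FilePath)
    (hint : String) : IO Unit := do
  unless (← path.pathExists) do
    throw <| IO.userError
      s!"{exeName}: missing {label} file '{path.toString}'. {hint}"

-- Require both halves of a supervised dataset (inputs and targets).
def requireSupervisedPair (exeName xLabel : String) (xPath : System.FilePath)
    (yLabel : String) (yPath : System.FilePath) (hint : String) : IO Unit := do
  requireFile exeName xLabel xPath hint
  requireFile exeName yLabel yPath hint

end RequireSketch
```

Running the check before any model is built keeps the failure cheap and puts the data-preparation hint in front of the user immediately.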

@[reducible, inline]

Require that a CSV path exists before a tabular regression command starts training.

Public trainer dataset for prepared CIFAR-10 NPY image/label arrays.

Common training-log notes for CIFAR-backed classifier examples.

def NN.Examples.Models.RealData.cifarCurve (exeName : String) (args : List String) (defaultLogPath : System.FilePath) (defaultSteps : ℕ := 10) (banner : TorchLean.Options → String) (seriesName title : String) (extraNotes : TorchLean.Options → CifarLoggedTrainFlags → Array String := fun (x : TorchLean.Options) (x_1 : CifarLoggedTrainFlags) => #[]) (train : TorchLean.Options → CifarLoggedTrainFlags → IO TorchLean.Training.Curve) :

Shared main entrypoint for CIFAR-backed curve-reporting commands.

Some commands do not match the public trainer result shape because they manage several modules or log one custom scalar curve instead of a single trainer report. They still share the same CIFAR parsing, runtime parsing, CUDA-memory notes, and TrainLog boundary.

Load one shuffled epoch of full CIFAR-10 minibatches from prepared .npy arrays.
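Assuming "full" here means that every minibatch has exactly the requested size and a trailing partial batch is dropped, the batching step could be sketched in Lean as below. The function name is hypothetical, rows are modelled as a plain list, and shuffling is deliberately left out of the sketch.

```lean
namespace BatchSketch

-- Split `rows` into consecutive minibatches of exactly `batch` rows,
-- dropping a trailing partial batch. A batch size of 0 yields no batches.
def fullBatches {α : Type} (batch : Nat) (rows : List α) : List (List α) :=
  (List.range (rows.length / batch)).map fun i =>
    (rows.drop (i * batch)).take batch

#eval fullBatches 2 [1, 2, 3, 4, 5]  -- [[1, 2], [3, 4]]

end BatchSketch
```

Keeping every batch the same size lets downstream code rely on a fixed batch dimension, at the cost of skipping a few rows per epoch.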

Load the first full CIFAR-10 minibatch from the shared CIFAR loader.

Load a user-prepared ImageNet-style 64x64 minibatch.

This loader reads prepared .npy arrays rather than JPEG files. The Python converter is the trust boundary for filesystem image decoding and resizing; this Lean path checks the resulting tensor shape and class range before handing the batch to examples.

Load one shuffled epoch of full ImageNet64-style minibatches from prepared .npy arrays.

Load the first full ImageNet64-style minibatch from the shared ImageNet64 loader.

Load a CIFAR minibatch and expose it as a compact flattened vector batch.

The file paths and download hints remain in NN.Examples, while the flattening logic lives in the public generative-model API so users can reuse it with their own image tensors.
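The flattening step itself can be illustrated with a short Lean sketch over nested lists. The names below are hypothetical and the public generative-model API is not reproduced here; the sketch only shows the channel-major flattening of CHW images into one vector per image.

```lean
namespace FlattenSketch

-- Concatenate a list of lists, preserving order.
def concat {α : Type} (xss : List (List α)) : List α :=
  xss.foldr (· ++ ·) []

-- Flatten one CHW image (channels, rows, pixels) into a single vector.
def flattenImage {α : Type} (image : List (List (List α))) : List α :=
  concat (image.map concat)

-- Flatten every image of an NCHW batch, giving one vector per image.
def flattenBatch {α : Type} (batch : List (List (List (List α)))) : List (List α) :=
  batch.map flattenImage

-- Two 1 × 2 channels flatten to a length-4 vector in channel-major order.
example : flattenImage [[[1, 2]], [[3, 4]]] = [1, 2, 3, 4] := rfl

end FlattenSketch
```

Since the flattening does not depend on where the images came from, the same function applies to any tensor with this layout, which matches the stated reason for keeping it in the public API.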

Public singleton dataset for compact vector generative examples over flattened CIFAR batches.

Autoencoder, VAE, and VQ-VAE examples all load one real CIFAR batch, flatten it to the compact vector boundary, build one supervised sample, and hand that sample to the public trainer API. The sample itself may be Float-specific; this dataset constructor casts it into the runtime-selected scalar so the command still works across the ordinary public runtime backends.

@[reducible, inline]

Shared text-corpus CLI/data boundary for local text-model examples.

Parse the shared --data-file flag used by local text-model examples.

--tiny-shakespeare is accepted as an explicit shortcut for the default corpus path.

Read the selected text corpus and fail with a shared preparation hint when it is missing.
