TorchLean API

NN.Runtime.Training.Log

Training Logs #

Pure training-log data plus the stable JSON codec used by examples, widgets, and CLI artifacts.

The data structures in this file are deliberately small:

For W&B-style workflows, ExperimentLog adds run metadata, config entries, tags, and artifact references while still lowering to the same stable TrainLog JSON that TorchLean widgets already know how to render. This gives examples one clean local artifact format and a straightforward bridge to hosted trackers.

We keep JSON support in the same module as the data model so there is one canonical logging API: Runtime.Training.TrainLog.writeJson / readJson. Widgets live in NN.Widgets.Runtime.Training and render these logs in the infoview.

Core Artifact Model #

TorchLean logs are intentionally local-first. The shape mirrors the small useful subset of hosted experiment trackers such as Weights & Biases:

The data stays simple and serializes to ordinary JSON, so examples can write it without any network service or background process.

A string-valued hyperparameter/config entry attached to a run.

  • key : String

    Config key, for example "lr", "optimizer", or "dataset".

  • value : String

    Human-readable value. Keep this stringly-typed so JSON artifacts stay tool-agnostic.

Instances For

    A file artifact produced or consumed by a training run.

    • name : String

      Short artifact name, for example "checkpoint" or "predictions_csv".

    • Local path to the artifact. External trackers may use this path when uploading artifacts.

    • kind : String

      Optional kind, such as "model", "plot", "dataset", or "report".

    • description : String

      Optional human-readable description.

    Instances For

      Run metadata for a local experiment.

      This is deliberately close to the mental model of W&B (project, name, group, tags, config), but it remains a plain Lean value and does not perform network IO.

      • project : String

        Project or collection name. Examples usually use "torchlean".

      • name : String

        Display name for this run. Defaults to the resulting TrainLog.title if left empty.

      • runId : String

        Stable run identifier, useful when several artifacts belong to one run.

      • group : String

        Optional group/sweep name.

      • tags : Array String

        Tags for filtering/comparing runs.

      • String-valued hyperparameters and data/runtime choices.

      Instances For

        A named scalar metric series over training steps (e.g. loss/accuracy/LR).

        • name : String

          Display name (e.g. "loss" or "val_acc").

        • values : Array Float

          Scalar values over steps.

        • color : String

          CSS color hint used by viewers.

        Instances For

          A small multi-series training log (curves + optional notes).

          • title : String

            Title shown at the top of viewers.

          • steps : Array Nat

            Optional step indices.

            If empty, viewers typically use 0..(n-1).

          • series : Array Series

            One or more metric series (should have compatible lengths).

          • notes : Array String

            Free-form notes (hyperparameters, dataset, run id, etc.).

          Instances For
            def Runtime.Training.TrainLog.beforeAfterLoss (title : String) (steps : Nat) (loss0 loss1 : Float) (notes : Array String := #[]) (color : String := "#4e79a7") :

            Build a two-point loss log from an initial and final scalar loss.

            This is useful for any training routine that records a baseline loss at step 0 and another loss after steps updates. More detailed loops should use Curve or MetricHistory below.

            Instances For

              A single scalar curve over discrete training steps.

              Use this when a training routine records one scalar repeatedly, such as training loss, validation loss, learning rate, or reward. Convert it to TrainLog for JSON persistence and widgets.

              We intentionally store curves as Arrays (not tensors):

              • curve lengths are runtime-dependent,
              • curves are persisted as JSON for widgets, and
              • typed tensors are reserved for fixed-shape model inputs/outputs.
              Instances For
                def Runtime.Training.Curve.push (c : Curve) (step : Nat) (value : Float) :

                Append one point (step, value) to a curve.

                Instances For
                  def Runtime.Training.Curve.toTrainLog (c : Curve) (title seriesName : String) (color : String := "#4e79a7") (notes : Array String := #[]) :

                  Convert a single curve into a TrainLog with one series.

                  This matches the expectations of TorchLean's widgets (#train_log_file_view).

                  Instances For

                    Multi-metric histories #

                    Curve is perfect for a single scalar, but many training runs record several aligned metrics: train loss, validation loss, accuracy, learning rate, reward, and so on. MetricHistory stores that common table-shaped history and converts it to the stable TrainLog artifact consumed by JSON IO and widgets.

                    A multi-series scalar history aligned by step.

                    • steps : Array Nat

                      Step indices shared by every series.

                    • series : Array Series

                      Metric series. Each push appends one value to each series.

                    Instances For

                      Construct an empty history from (metric name, color) pairs.

                      Instances For

                        Append one row of metric values.

                        If values.size does not match series.size, the update is ignored. This keeps the function total and avoids manufacturing malformed logs; callers that want a hard error can check sizes before calling.

                        Instances For

                          Convert a multi-metric history into a stable TrainLog artifact.

                          Instances For

                            Confusion matrix for classification evaluation.

                            • counts : Array (Array Nat)

                              Row-major counts: counts[i][j] means “true class i predicted as j”.

                            Instances For

                              A W&B-shaped local experiment artifact.

                              ExperimentLog is the richer authoring object. Use toTrainLog before writing JSON or rendering in widgets. The conversion stores metadata as structured notes so existing TrainLog consumers keep working without a second widget protocol.

                              Instances For

                                Experiment-Tracker Helpers #

                                Render run metadata and config as stable notes inside a TrainLog.

                                Instances For

                                  Render an artifact reference as a stable note inside a TrainLog.

                                  Instances For

                                    Start a local experiment with optional metadata.

                                    Instances For
                                      def Runtime.Training.ExperimentLog.log (e : ExperimentLog) (step : Nat) (name : String) (value : Float) (color : String := "#4e79a7") :

                                      Append one scalar metric. New metric names are added lazily with a default color.

                                      Instances For

                                        Append one aligned row of metrics to an experiment.

                                        Instances For

                                          Attach a local file artifact to the run.

                                          Instances For

                                            Convert a rich experiment object into the stable widget/JSON TrainLog artifact.

                                            Instances For

                                              Logging Destinations #

                                              Examples should make logging explicit. LogDestination.disabled is the local equivalent of wandb disabled; LogDestination.json path writes the standard TorchLean JSON artifact.

                                              Where a training routine should send its log artifact.

                                              Instances For

                                                Is logging enabled for this destination?

                                                Instances For

                                                  Return the JSON path when this destination writes one.

                                                  Instances For

                                                    Return the JSON path, falling back to defaultPath when logging is disabled.

                                                    Instances For

                                                      Parse a CLI logging value.

                                                      Accepted disabled values are false, off, none, no, and disabled. Any other value is treated as a JSON path.

                                                      Instances For

                                                        Parse an optional CLI value, using the default JSON path when no value is supplied.

                                                        Instances For

                                                          JSON Codec #

                                                          Training logs are often produced by executable examples and then rendered later by widgets. The codec below is intentionally stable and human-readable:

                                                          This is not intended to be a high-throughput experiment tracker. It is a small artifact format that keeps examples, tests, widgets, and external tooling speaking the same language.

                                                          Primitive arrays #

                                                          Encode a Float as JSON: a number when finite, otherwise Lean's standard string sentinel.

                                                          Instances For

                                                            Decode a Float from JSON.

                                                            We accept JSON numbers, Lean's non-finite string sentinels, and numeric strings for backward compatibility with generated artifacts.

                                                            Instances For

                                                              Encode natural-number arrays used by logs and lightweight runtime artifacts.

                                                              Instances For

                                                                Decode natural-number arrays with field-aware errors.

                                                                Instances For

                                                                  Encode string-note arrays used by logs and lightweight runtime artifacts.

                                                                  Instances For

                                                                    Decode string-note arrays with field-aware errors.

                                                                    Instances For

                                                                      Encode a float-valued metric series.

                                                                      Instances For

                                                                        Decode a float-valued metric series with field-aware errors.

                                                                        Instances For

                                                                          JSON encoding for one named metric series.

                                                                          Instances For

                                                                            JSON decoding for one named metric series.

                                                                            Instances For

                                                                              JSON encoding for the stable TrainLog artifact.

                                                                              Instances For

                                                                                JSON decoding for the stable TrainLog artifact.

                                                                                Instances For

                                                                                  Write a TrainLog as JSON to disk, creating parent directories if needed.

                                                                                  Instances For

                                                                                    Read a TrainLog from a JSON file.

                                                                                    Instances For

                                                                                      JSON encoding for a classification confusion matrix.

                                                                                      Instances For

                                                                                        JSON decoding for a classification confusion matrix.

                                                                                        Instances For

                                                                                          Write a confusion matrix as JSON to disk, creating parent directories if needed.

                                                                                          Instances For

                                                                                            Read a confusion matrix from a JSON file.

                                                                                            Instances For

                                                                                              Write a TrainLog to this destination. Disabled destinations are a no-op.

                                                                                              Instances For

                                                                                                Write an ExperimentLog to this destination. Disabled destinations are a no-op.

                                                                                                Instances For