Training Logs #
Pure training-log data plus the stable JSON codec used by examples, widgets, and CLI artifacts.
The data structures in this file are deliberately small:
Seriesis one named scalar metric, such as loss or accuracy;Curveis the convenient builder for one metric over time;MetricHistoryis the convenient builder for several aligned metrics; andTrainLogis the persisted/viewed artifact shared by JSON IO and widgets.
For W&B-style workflows, ExperimentLog adds run metadata, config entries, tags, and artifact
references while still lowering to the same stable TrainLog JSON that TorchLean widgets already
know how to render. This gives examples one clean local artifact format and a straightforward bridge
to hosted trackers.
We keep JSON support in the same module as the data model so there is one canonical logging API:
Runtime.Training.TrainLog.writeJson / readJson. Widgets live in
NN.Widgets.Runtime.Training and render these logs in the infoview.
Core Artifact Model #
TorchLean logs are intentionally local-first. The shape mirrors the small useful subset of hosted experiment trackers such as Weights & Biases:
- a run has project/name/id metadata,
- config records hyperparameters and data choices,
- metric history records scalar values over steps,
- artifacts point to files produced by the run.
The data stays simple and serializes to ordinary JSON, so examples can write it without any network service or background process.
A file artifact produced or consumed by a training run.
- name : String
Short artifact name, for example
"checkpoint"or"predictions_csv". - path : System.FilePath
Local path to the artifact. External trackers may use this path when uploading artifacts.
- kind : String
Optional kind, such as
"model","plot","dataset", or"report". - description : String
Optional human-readable description.
Instances For
Run metadata for a local experiment.
This is deliberately close to the mental model of W&B (project, name, group, tags, config),
but it remains a plain Lean value and does not perform network IO.
- project : String
Project or collection name. Examples usually use
"torchlean". - name : String
Display name for this run. Defaults to the resulting
TrainLog.titleif left empty. - runId : String
Stable run identifier, useful when several artifacts belong to one run.
- group : String
Optional group/sweep name.
- config : Array ConfigEntry
String-valued hyperparameters and data/runtime choices.
Instances For
A small multi-series training log (curves + optional notes).
- title : String
Title shown at the top of viewers.
Optional step indices.
If empty, viewers typically use
0..(n-1).One or more metric series (should have compatible lengths).
Free-form notes (hyperparameters, dataset, run id, etc.).
Instances For
Build a two-point loss log from an initial and final scalar loss.
This is useful for any training routine that records a baseline loss at step 0 and another
loss after steps updates. More detailed loops should use Curve or MetricHistory below.
Instances For
A single scalar curve over discrete training steps.
Use this when a training routine records one scalar repeatedly, such as training loss, validation
loss, learning rate, or reward. Convert it to TrainLog for JSON persistence and widgets.
We intentionally store curves as Arrays (not tensors):
- curve lengths are runtime-dependent,
- curves are persisted as JSON for widgets, and
- typed tensors are reserved for fixed-shape model inputs/outputs.
Step indices (e.g. update number).
Scalar metric values aligned with
steps.
Instances For
Multi-metric histories #
Curve is perfect for a single scalar, but many training runs record several aligned metrics:
train loss, validation loss, accuracy, learning rate, reward, and so on. MetricHistory stores
that common table-shaped history and converts it to the stable TrainLog artifact consumed by
JSON IO and widgets.
Construct an empty history from (metric name, color) pairs.
Instances For
Append one row of metric values.
If values.size does not match series.size, the update is ignored. This keeps the function
total and avoids manufacturing malformed logs; callers that want a hard error can check sizes
before calling.
Instances For
Convert a multi-metric history into a stable TrainLog artifact.
Instances For
A W&B-shaped local experiment artifact.
ExperimentLog is the richer authoring object. Use toTrainLog before writing JSON or rendering in
widgets. The conversion stores metadata as structured notes so existing TrainLog consumers keep
working without a second widget protocol.
- run : RunInfo
Run metadata and config.
- history : MetricHistory
Aligned scalar metrics over time.
Files produced or consumed by this run.
Additional free-form notes.
Instances For
Experiment-Tracker Helpers #
Render an artifact reference as a stable note inside a TrainLog.
Instances For
Append one scalar metric. New metric names are added lazily with a default color.
Instances For
Append one aligned row of metrics to an experiment.
Instances For
Attach a local file artifact to the run.
Instances For
Convert a rich experiment object into the stable widget/JSON TrainLog artifact.
Instances For
Logging Destinations #
Examples should make logging explicit. LogDestination.disabled is the local equivalent of
wandb disabled; LogDestination.json path writes the standard TorchLean JSON artifact.
Where a training routine should send its log artifact.
- disabled : LogDestination
Do not write a log artifact. Useful for lightweight tests or CI runs.
- json
(path : System.FilePath)
: LogDestination
Write a local JSON artifact to the given path.
Instances For
Instances For
Is logging enabled for this destination?
Instances For
Return the JSON path when this destination writes one.
Instances For
Return the JSON path, falling back to defaultPath when logging is disabled.
Instances For
Parse a CLI logging value.
Accepted disabled values are false, off, none, no, and disabled. Any other value is
treated as a JSON path.
Instances For
Parse an optional CLI value, using the default JSON path when no value is supplied.
Instances For
JSON Codec #
Training logs are often produced by executable examples and then rendered later by widgets. The codec below is intentionally stable and human-readable:
- finite floats are JSON numbers;
- non-finite floats are string sentinels (
"NaN","Infinity","-Infinity"); - missing optional fields fall back to the same defaults as the Lean structures.
This is not intended to be a high-throughput experiment tracker. It is a small artifact format that keeps examples, tests, widgets, and external tooling speaking the same language.
Primitive arrays #
Encode a Float as JSON: a number when finite, otherwise Lean's standard string sentinel.
Instances For
Encode natural-number arrays used by logs and lightweight runtime artifacts.
Instances For
Encode string-note arrays used by logs and lightweight runtime artifacts.
Instances For
Encode a float-valued metric series.
Instances For
JSON encoding for one named metric series.
Instances For
JSON encoding for the stable TrainLog artifact.
Instances For
Write a TrainLog as JSON to disk, creating parent directories if needed.
Instances For
Read a TrainLog from a JSON file.
Instances For
JSON encoding for a classification confusion matrix.
Instances For
JSON decoding for a classification confusion matrix.
Instances For
Write a confusion matrix as JSON to disk, creating parent directories if needed.
Instances For
Read a confusion matrix from a JSON file.
Instances For
Write a TrainLog to this destination. Disabled destinations are a no-op.
Instances For
Write an ExperimentLog to this destination. Disabled destinations are a no-op.