TorchLean API

NN.Runtime.Autograd.Train.Eval

Evaluation helpers #

These utilities aggregate per-sample or per-batch StepReports into a single mean report. Metrics are matched by name and position.

Metric aggregation #

Add two metric lists pointwise.

We require that names match (same metric in the same position). This keeps aggregation honest and avoids silently averaging unrelated quantities.
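This name-checked pointwise addition can be sketched in Lean as follows; the `Metric` structure shown here and the name `addMetrics` are illustrative assumptions, not the library's actual definitions, and `Option` stands in for the library's failure mode:

```lean
-- Illustrative model: a metric is a named value.
structure Metric (a : Type) where
  name  : String
  value : a

-- Add two metric lists pointwise, refusing to combine mismatched names.
def addMetrics {a : Type} [Add a] :
    List (Metric a) → List (Metric a) → Option (List (Metric a))
  | [], [] => some []
  | m :: ms, n :: ns =>
    if m.name == n.name then
      (addMetrics ms ns).map fun rest => ⟨m.name, m.value + n.value⟩ :: rest
    else
      none  -- same position but different names: refuse to average
  | _, _ => none  -- different lengths: refuse as well
```

Failing loudly on a mismatch is what keeps the later averaging honest: two reports can only be summed when they describe the same metrics in the same order.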

def Runtime.Autograd.Train.Eval.scaleMetrics {a : Type} [Mul a] [Coe ℕ a] (count : ℕ) (metrics : List (Metric a)) : List (Metric a)

Multiply every metric value by a scalar (used for weighted batch averaging).
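A minimal sketch of this scaling, again assuming an illustrative name/value `Metric` structure (core Lean's `Nat` is used in place of ℕ):

```lean
-- Illustrative model of the metric type (not the library's definition).
structure Metric (a : Type) where
  name  : String
  value : a

-- Scale every metric value by a sample count, coercing the count into `a`.
def scaleMetrics {a : Type} [Mul a] [Coe Nat a] (count : Nat)
    (metrics : List (Metric a)) : List (Metric a) :=
  metrics.map fun m => { m with value := (count : a) * m.value }
```

Scaling a batch-mean metric by the batch size turns a per-batch mean into a batch-weighted sum, which the accumulator described below can later divide by the total sample count.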

Report sums (for weighted aggregation) #

An accumulator for averaging StepReports.

Instead of keeping a list of all reports and reducing at the end, we maintain:

• count: how many samples contributed,
• lossSum: the sum of losses (optionally weighted by batch size),
• metricsSum: a pointwise sum of named metrics.

This is the same idea as computing streaming averages in a typical PyTorch evaluation loop.

• count : ℕ

  Number of samples represented by this accumulator.

• lossSum : a

  Sum of losses, already weighted by sample count for batch reports.

• metricsSum : List (Metric a)

  Pointwise sum of metrics; names must stay aligned across additions.

Start an accumulator from a single-sample report.

Start an accumulator from a batch report, weighted by the number of samples in the batch.

This is the appropriate constructor when evalBatch returns means over the batch, but we want the final mean to weight by the number of items in each batch.

Combine two accumulators (failing if metric names/lengths mismatch).

Convert an accumulator to a mean StepReport.
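Taken together, the accumulator and these two operations might be sketched as below; every name here (`ReportSums`, `combine`, `toReport`) and the exact shape of `StepReport` are illustrative assumptions, and `Option` stands in for the library's failure mode:

```lean
-- Illustrative stand-ins for the library's types.
structure Metric (a : Type) where
  name  : String
  value : a

structure StepReport (a : Type) where
  loss    : a
  metrics : List (Metric a)

-- The accumulator: a running sample count plus running sums.
structure ReportSums (a : Type) where
  count      : Nat
  lossSum    : a
  metricsSum : List (Metric a)

-- Combine two accumulators; fails (`none`) if metric names or lengths differ.
def combine {a : Type} [Add a] (x y : ReportSums a) : Option (ReportSums a) :=
  if x.metricsSum.map (·.name) == y.metricsSum.map (·.name) then
    some { count      := x.count + y.count
         , lossSum    := x.lossSum + y.lossSum
         , metricsSum := x.metricsSum.zipWith
             (fun m n => { m with value := m.value + n.value }) y.metricsSum }
  else
    none

-- Recover a mean report by dividing each sum by the total sample count.
def toReport {a : Type} [Div a] [Coe Nat a] (s : ReportSums a) : StepReport a :=
  { loss    := s.lossSum / (s.count : a)
  , metrics := s.metricsSum.map fun m => { m with value := m.value / (s.count : a) } }
```

Because each accumulator already carries its own sample count, `combine` is associative in the sense that matters here: any grouping of batches produces the same final mean from `toReport`.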

Dataset evaluation #

def Runtime.Autograd.Train.Eval.evalList {sample a : Type} [Add a] [Div a] [Coe ℕ a] (tag : String) (xs : List sample) (evalSample : sample → Result (StepReport a)) : Result (StepReport a)

Evaluate a list of samples and average their reports.

This is the “for sample in dataset: compute report; take mean” pattern.
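That pattern can be sketched as a simple traversal; this toy version uses plain `Float` losses and `Option` in place of the library's report and `Result` types:

```lean
-- Mean of per-sample losses: evaluate each sample, sum, divide by the count.
def evalListSketch (xs : List Float) (evalSample : Float → Option Float) :
    Option Float := do
  let losses ← xs.mapM evalSample  -- one report per sample; fail if any fails
  if losses.isEmpty then none      -- no samples: no meaningful mean
  else some (losses.foldl (· + ·) 0.0 / Float.ofNat losses.length)

-- evalListSketch [1.0, 2.0, 3.0] some : (1 + 2 + 3) / 3 = 2.0
```

The real evalList additionally carries a `tag` for error reporting and averages whole StepReports (loss and metrics together) rather than a single number.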

def Runtime.Autograd.Train.Eval.evalDataset {sample a : Type} [Add a] [Div a] [Coe ℕ a] (tag : String) (ds : Dataset sample) (evalSample : sample → Result (StepReport a)) : Result (StepReport a)

Evaluate a Dataset by converting to a list and calling evalList.

def Runtime.Autograd.Train.Eval.evalBatches {sample a : Type} [Add a] [Mul a] [Div a] [Coe ℕ a] (tag : String) (batches : List (List sample)) (evalBatch : List sample → Result (StepReport a)) : Result (StepReport a)

Evaluate a list of non-empty batches and compute a weighted mean report.

Each batch contributes proportionally to its length, so a small final batch does not distort the average.
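A concrete (illustrative) instance of this weighting: a batch of 32 samples with mean loss 0.5 followed by a final batch of 8 samples with mean loss 0.9.

```lean
-- Weight each batch mean by its length, then divide by the total count:
--   (32 * 0.5 + 8 * 0.9) / (32 + 8) = 23.2 / 40 = 0.58
-- An unweighted mean of the two batch means would give 0.7 instead,
-- letting the 8-sample batch count as much as the 32-sample one.
#eval (32 * 0.5 + 8 * 0.9) / (32 + 8 : Float)
```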

def Runtime.Autograd.Train.Eval.evalDatasetBatches {sample a : Type} [Add a] [Mul a] [Div a] [Coe ℕ a] (tag : String) (batchSize : ℕ) (ds : Dataset sample) (evalBatch : List sample → Result (StepReport a)) : Result (StepReport a)

Batch a dataset and then call evalBatches.
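The batching step itself can be sketched with a small chunking helper; this helper (and its name `toBatches`) is an illustrative assumption, not how the library necessarily splits the dataset:

```lean
-- Illustrative chunking helper: split a list into batches of at most `n`
-- elements (marked `partial` to keep the sketch short; `n = 0` yields []).
partial def toBatches {α : Type} (n : Nat) (xs : List α) : List (List α) :=
  if xs.isEmpty || n == 0 then []
  else
    let (batch, rest) := xs.splitAt n
    batch :: toBatches n rest

-- A dataset of 5 samples with batchSize := 2 gives batches of sizes 2, 2, 1;
-- evalBatches then weights them 2/5, 2/5, and 1/5 of the final mean.
#eval toBatches 2 [1, 2, 3, 4, 5]  -- [[1, 2], [3, 4], [5]]
```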
