TorchLean API

NN.Spec.Models.RandomForest

Random Forest #

This file defines a lightweight random-forest wrapper around a list of decision trees from NN.Spec.Module.DecisionTree, plus a couple of standard aggregation strategies:

This is a small reference implementation intended to be:

Ecosystem analogies:

Random-forest container: a list of trees.

Aggregation (vote/average/etc.) is handled by predict so the container stays generic.

Instances For
    def random_forest.predict {a : Type} (forest : RandomForest a) (decisionFn : StringBool) (aggregateFn : List aa) :
    a

    Function to predict using all trees in the forest and combine results with a given aggregation function. For classification, this would typically be a majority vote. For regression, this would be an average.

    Instances For
      def random_forest.majorityVote {a : Type} [Ord a] [Inhabited a] (predictions : List a) :

      For classification tasks, majority vote aggregation function. Returns the most frequent element in the list of predictions. Works with any type that supports ordering.

      Instances For
        def random_forest.average {α : Type} [Zero α] [Add α] [Div α] [Coe α] (predictions : List α) :
        α

        For regression tasks, average aggregation function.

        Instances For

          Numeric random forest (spec baseline) #

          The RandomForest above wraps the symbolic DecisionTree from NN.Spec.Module.DecisionTree, where splits are keyed by String feature names and an external decisionFn : StringBool decides the branch. That is handy for demos, but it is not something we can “train” without providing feature-value semantics.

          For a more classical baseline, we also provide a numeric random forest built on the Spec.DecisionTreeSpec representation used by gradient_boosted_trees.lean:

          This is still a spec/reference implementation: correctness and readability come first.

          structure random_forest.Numeric.RegressionForestSpec (α : Type) (nTrees maxDepth nFeatures : ) :

          A regression random forest: an ensemble of regression trees averaged at inference time.

          Instances For
            def random_forest.Numeric.regressionForestForwardSpec {α : Type} [Context α] {nTrees maxDepth nFeatures : } (model : RegressionForestSpec α nTrees maxDepth nFeatures) (x : Spec.Tensor α (Spec.Shape.dim nFeatures Spec.Shape.scalar)) :

            Forward pass: average tree predictions.

            This corresponds to RandomForestRegressor.predict (mean over tree outputs).

            Instances For
              def random_forest.Numeric.regressionForestFitRegressionMseSpec {α : Type} [Context α] {batch nTrees maxDepth nFeatures : } (x : Spec.Tensor α (Spec.Shape.dim batch (Spec.Shape.dim nFeatures Spec.Shape.scalar))) (y : Spec.Tensor α (Spec.Shape.dim batch Spec.Shape.scalar)) (hBatch : batch 0) :
              RegressionForestSpec α nTrees maxDepth nFeatures

              Fit a regression random forest by training each tree on a deterministic “bootstrap” resample.

              Why rotate instead of RNG-based bootstrapping?

              • it keeps the whole spec deterministic and pure (important for proofs/verification)
              • it still exercises the same API shape as bagging: each tree sees a different multiset

              If you want true randomness, treat this as the spec and implement an executable wrapper that supplies a randomized index mapping.

              Instances For

                Classification forest (Gini) #

                This mirrors the regression forest, but uses the classifier-tree type from NN/Spec/Models/GradientBoostedTrees.lean so leaf values can be arbitrary labels (β).

                structure random_forest.Numeric.ClassificationForestSpec (α β : Type) (nTrees maxDepth nFeatures : ) :

                A classification random forest: an ensemble of classifier trees (majority vote).

                Instances For
                  def random_forest.Numeric.classificationForestPredictSpec {α : Type} [Context α] {β : Type} [DecidableEq β] [Inhabited β] {nTrees maxDepth nFeatures : } (model : ClassificationForestSpec α β nTrees maxDepth nFeatures) (x : Spec.Tensor α (Spec.Shape.dim nFeatures Spec.Shape.scalar)) :
                  β

                  Predict by majority vote across trees.

                  Instances For
                    def random_forest.Numeric.classificationForestFitClassificationGiniSpec {α : Type} [Context α] {β : Type} [DecidableEq β] [Inhabited β] {batch nTrees maxDepth nFeatures : } (x : Spec.Tensor α (Spec.Shape.dim batch (Spec.Shape.dim nFeatures Spec.Shape.scalar))) (y : List β) (hBatch : batch 0) :
                    ClassificationForestSpec α β nTrees maxDepth nFeatures

                    Fit a classification forest using deterministic “rotation bootstraps” and Gini-based CART trees.

                    Instances For