Gradient boosted trees (spec model) #
This is a math/reference specification of gradient boosting using decision trees.
Important caveat:
- Many computations here are written in a straightforward, proof-friendly style rather than as a tuned implementation.
References (classical):
- CART: Breiman, Friedman, Olshen, Stone, "Classification and Regression Trees", 1984.
- Gradient boosting: Friedman, "Greedy Function Approximation: A Gradient Boosting Machine", 2001.
- XGBoost: Chen and Guestrin, "XGBoost: A Scalable Tree Boosting System", 2016.
- LightGBM: Ke et al., "LightGBM: A Highly Efficient Gradient Boosting Decision Tree", 2017.
Tree representation #
We represent a decision tree as a small inductive datatype:
- `leaf value` stores the prediction for that leaf
- `split feature threshold left right` branches on a single feature
This is deliberately compact: it is easy to interpret (forward pass) and easy to fit with a simple greedy CART-style algorithm (implemented below).
Note on comparisons:
The spec layer’s scalar interface (`Context α`) gives us a decidable `>` (via `Context.decidable_gt`) but does not promise a decidable `<` for every backend. To stay portable, the tree uses the rule
`goRight := (x_feature > threshold)`
and goes left otherwise (which matches “≤ threshold” for the usual numeric orders).
A regression-tree node for the typed GBDT specification.
- `leaf value` stores the prediction for that leaf.
- `split feature threshold left right` branches on a single feature using the rule `goRight := (x_feature > threshold)`.
This keeps the representation small and easy to interpret.
- leaf {α : Type} (value : α) : TreeNode α
- split {α : Type} (feature : ℕ) (threshold : α) (left right : TreeNode α) : TreeNode α
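As a minimal Lean sketch, the datatype described above is just the following (mirroring the constructor signatures listed here, with `Nat` written for ℕ):

```lean
-- Two-constructor regression tree: a leaf stores a prediction; a split
-- branches on a single feature against a threshold.
inductive TreeNode (α : Type) : Type where
  | leaf (value : α) : TreeNode α
  | split (feature : Nat) (threshold : α) (left right : TreeNode α) : TreeNode α
```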
Decision-tree spec wrapper with an explicit max-depth budget.
The max_depth field is redundant (it matches the type index) but is convenient in downstream
code that wants a value-level knob.
Gradient boosted tree ensemble (regression-style) specification.
We keep the model as an explicit tensor of trees plus a shrinkage parameter learning_rate and an
initial_prediction bias term.
- trees : Tensor (DecisionTreeSpec α maxDepth) (Shape.dim nTrees Shape.scalar)
The ensemble's trees, one per boosting round.
- learning_rate : α
Shrinkage factor applied to the trees' summed contribution.
- initial_prediction : α
Bias term added before any tree contributions.
Forward pass #
All forward passes in this file are explicit about the feature dimension nFeatures. Tree depth
(maxDepth) limits how many splits a tree may contain; it has nothing to do with how many input
features exist.
Forward pass for a single decision tree on an input vector of nFeatures features.
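As an illustration, here is a minimal evaluation sketch over the `TreeNode` sketch above, specialized to `Float` and with a bare `Nat → Float` function standing in for the spec's sized feature tensor (both simplifications are ours):

```lean
-- Walk the tree with the rule goRight := (x_feature > threshold).
def TreeNode.eval (x : Nat → Float) : TreeNode Float → Float
  | .leaf v => v
  | .split f t l r => if x f > t then r.eval x else l.eval x
```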
Batched forward pass for a single decision tree.
Forward pass for a gradient boosted ensemble on a single input.
This simply accumulates initial_prediction + learning_rate * sum(tree_i(x)).
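A corresponding ensemble sketch, with a plain list standing in for the tensor of trees (the structure below is illustrative, not the spec's definition):

```lean
-- List-of-trees simplification of the ensemble; `forward` computes
-- initial_prediction + learning_rate * sum(tree_i(x)).
structure GBDTSketch where
  trees : List (TreeNode Float)
  learning_rate : Float
  initial_prediction : Float

def GBDTSketch.forward (m : GBDTSketch) (x : Nat → Float) : Float :=
  m.initial_prediction
    + m.learning_rate * m.trees.foldl (fun acc t => acc + t.eval x) 0
```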
Batched forward pass for a gradient boosted ensemble.
"Gradient" w.r.t. a tree's prediction.
Decision trees are piecewise-constant in the inputs, so we do not attempt to define meaningful derivatives through their internal decisions here. For boosting, this convention makes the intended dataflow explicit: gradient information is used to fit subsequent trees, not to differentiate through split predicates.
Approximate gradient w.r.t. input features for a tree.
In this spec we return 0 gradients (trees are treated as non-differentiable).
Zero input-gradient convention for the ensemble.
This file treats boosted trees as a classical model: we do not backpropagate through tree
structure. Instead, residuals/gradients are used to fit new trees. This helper is intentionally
not wired into an OpSpec; callers should not mistake it for a differentiable surrogate.
Classical training: CART-style regression trees (MSE) #
Tree-based models here are intended as baselines and reference points. For neural models we implement reverse-mode explicitly; for trees we instead provide a classical (non-gradient) training routine.
This section implements a small greedy CART-like procedure for regression:
- choose the split `(feature, threshold)` that minimizes `SSE(left) + SSE(right)` (sum of squared errors around each side’s mean)
- recurse until depth runs out or the split becomes degenerate
This is deterministic by construction:
- thresholds are chosen from the observed feature values
- ties are broken by the “first best” encountered during folding
This implementation prioritizes clarity and determinism over performance.
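A self-contained sketch of the scoring step over plain `Float` lists (the sample representation is illustrative; the spec's helpers defined below are the authoritative versions):

```lean
-- Mean with a 0 default for the empty list, and SSE around the mean.
def mean (ys : List Float) : Float :=
  if ys.isEmpty then 0 else ys.foldl (· + ·) 0 / ys.length.toFloat

def sse (ys : List Float) : Float :=
  let m := mean ys
  ys.foldl (fun acc y => acc + (y - m) * (y - m)) 0

-- Score a candidate (feature, threshold) split as SSE(left) + SSE(right);
-- `none` marks a degenerate split where one side is empty. A sample is a
-- (features, target) pair; out-of-range feature indices default to 0.
def scoreSplit (samples : List (List Float × Float)) (f : Nat) (t : Float) :
    Option Float :=
  let (r, l) := samples.partition fun s => s.1.getD f 0 > t
  if l.isEmpty || r.isEmpty then none
  else some (sse (l.map (·.2)) + sse (r.map (·.2)))
```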
A single regression training example: feature vector x and scalar target y.
- x : Tensor α (Shape.dim nFeatures Shape.scalar)
The feature vector.
- y : α
The regression target.
Sum all elements of a list.
Mean of a list, with 0 as a convenient default for the empty list.
Sum of squared deviations from the mean (SSE).
Decide whether a sample goes to the right branch for a (feature, threshold) split.
Partition samples into (left, right) for a given split.
Extract the regression targets from a list of examples.
Score a candidate split by sum of squared errors (SSE).
Returns none for degenerate splits (all samples go to one side).
Find the best split (feature, threshold) by exhaustive search over observed thresholds.
Leaf prediction value for regression: the mean target.
Fit a regression tree by greedy CART-style splitting (MSE/SSE), with a depth budget.
depthLeft counts how many splits we are still allowed to make.
Fit a regression decision tree from a batched dataset.
Classical training: CART-style classification trees (Gini impurity) #
For classification we often want the leaf prediction to be a label (e.g. String or Nat),
while split thresholds remain numeric. To avoid forcing labels into the numeric scalar type α,
we define a separate classifier tree type parameterized by the label type β.
The training algorithm mirrors the regression case:
- enumerate candidate thresholds from observed feature values
- pick the split that minimizes weighted Gini impurity
- recurse until depth runs out or no improvement is possible
- leaf prediction is the majority class (deterministic tie-breaking via first occurrence)
Why β is separate from α:
- Splits compare numeric features (`α`), so they need ordering/decidable comparison.
- Leaf predictions are usually discrete (`β`), and we do not want to pretend that labels form a numeric scalar domain.
PyTorch / sklearn analogies:
- This is closest in spirit to `sklearn.tree.DecisionTreeClassifier` with `criterion="gini"`, expressed as a small pure spec.
- The boosting semantics (adding many trees sequentially) matches the high-level idea of `sklearn.ensemble.GradientBoostingClassifier`, but TorchLean does not try to reproduce all of sklearn’s engineering details (regularization knobs, histogram binning, etc.).
A classifier tree node: numeric splits, label-valued leaves.
- leaf {α β : Type} (label : β) : ClassifierTreeNode α β
- split {α β : Type} (feature : ℕ) (threshold : α) (left right : ClassifierTreeNode α β) : ClassifierTreeNode α β
Specification wrapper for a classification decision tree (numeric splits, label-valued leaves).
- root : ClassifierTreeNode α β
The root node of the tree.
- max_depth : ℕ
The maximum depth of the tree.
Forward pass for a classifier decision tree on an input vector of nFeatures features.
Branching convention:
- go right iff `x[feature] > threshold`,
- otherwise go left.
This mirrors a common "≤ goes left / > goes right" convention, but avoids needing a decidable `<` for every `Context α` backend.
Count how many times lbl appears in ys.
Majority label with deterministic tie-breaking.
If there is a tie, we keep the earlier winner from the fold. This is intentional: it avoids non-determinism and keeps the spec stable across backends.
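A list-based sketch of that rule (assuming only `BEq` on the label type): the fold replaces the running winner only on a strictly larger count, so the earlier label keeps ties:

```lean
-- Deterministic majority vote: earlier occurrences win ties.
def majority {β : Type} [BEq β] : List β → Option β
  | [] => none
  | y :: ys =>
    let all := y :: ys
    some <| all.foldl
      (fun best c => if all.count c > all.count best then c else best) y
```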
Gini impurity of a multiset of labels.
gini(ys) = 1 - Σ_c p(c)^2 where p(c) is the empirical class frequency.
This is the standard CART impurity used by many tree classifiers.
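A direct `Float` sketch of the formula (assuming `BEq` labels; `eraseDups` supplies the distinct classes):

```lean
-- gini(ys) = 1 - Σ_c p(c)^2, with 0 for the empty list.
def gini {β : Type} [BEq β] (ys : List β) : Float :=
  if ys.isEmpty then 0
  else
    let n := ys.length.toFloat
    1 - ys.eraseDups.foldl (fun acc c =>
      let p := (ys.count c).toFloat / n
      acc + p * p) 0
```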
Weighted Gini impurity: |ys| * gini(ys).
Extract the labels from a list of classification examples.
Decide whether a classification sample goes right for a (feature, threshold) split.
Partition classification samples into (left, right) for a candidate split.
Score a candidate classification split (feature, threshold) by weighted Gini impurity.
Returns none when the split is degenerate (one side is empty); otherwise returns the score and
the (left, right) partitions.
Find the best classification split (feature, threshold) by exhaustive search.
Thresholds are drawn from the observed feature values in the dataset.
Fit a classification tree node by greedy CART-style splitting (Gini impurity).
depthLeft counts how many more splits we are allowed to make.
Fit a classification decision tree (CART-style) using Gini impurity.
Labels are supplied as a list of length `batch`.
- If `y` is shorter than `batch`, missing entries use `default` (so the function remains total).
- If `y` is longer than `batch`, extra labels are ignored.
This “list-based labels” API is meant for demos and small experiments. If you have labels already as a tensor or an index type, it is usually better to convert them to a list explicitly at the boundary of your program so the conversion is visible.
Mean squared error (MSE) loss for regression, reduced to a scalar by averaging over the batch.
Binary cross-entropy loss for classification (with a sigmoid), reduced to a scalar by averaging.
This is a direct probability-space loss helper. For numerically sensitive classification pipelines, prefer a stable "BCE with logits" implementation in the runtime/training layer.
Gradient of MSE loss w.r.t. predictions (elementwise).
Gradient of sigmoid binary cross-entropy loss w.r.t. predictions (elementwise).
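For reference, the standard identity behind this gradient, sketched elementwise over `Float` (assuming the predictions are pre-sigmoid logits):

```lean
-- With p = sigmoid z, the derivative of BCE(p, y) w.r.t. z is p - y.
def sigmoid (z : Float) : Float := 1 / (1 + Float.exp (-z))

def bceGradAt (z y : Float) : Float := sigmoid z - y
```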
Residual computation for gradient boosting.
For squared-error regression, the residual is target - prediction.
One gradient-boosting "add a tree" step, given a pre-fit new_tree.
This returns the current loss and the updated model with new_tree appended.
The residuals computed here are illustrative; the "fit a tree to residuals" variant below is
usually the more self-contained baseline.
Gradient boosting: a "fit-one-more-tree" step #
The original gradient_boosted_trees_train_step_spec expects a pre-fit new_tree. For a more
complete baseline, we also provide a deterministic step that fits that tree to the residuals.
Fit a new tree to residuals and append it to the ensemble.
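A sketch of that step in terms of the `GBDTSketch` above, with the tree fitter passed in as a function (a stand-in for the CART routine in this file):

```lean
-- Fit a tree to the residuals y - F(x), then append it to the ensemble.
def boostStep (m : GBDTSketch) (xs : List (Nat → Float)) (ys : List Float)
    (fitTree : List ((Nat → Float) × Float) → TreeNode Float) : GBDTSketch :=
  let residuals := (xs.zip ys).map fun (x, y) => y - m.forward x
  { m with trees := m.trees ++ [fitTree (xs.zip residuals)] }
```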
Increment a single feature counter by 1 inside a length-nFeatures vector.
This is used by the split-count feature-importance computation below.
Count how many times each feature index appears in split nodes of a tree.
This mirrors a very common "split count" importance heuristic.
Simple split-count feature importance for an ensemble.
This mirrors the common "how often was a feature used in a split?" heuristic. It is not the same as gain-based importance in XGBoost/LightGBM, but it is deterministic and easy to interpret.
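A sketch of the heuristic over the `TreeNode` sketch above: collect the feature index of every split node, then count per feature:

```lean
-- Feature indices used by split nodes, in traversal order.
def TreeNode.splitFeatures {α : Type} : TreeNode α → List Nat
  | .leaf _ => []
  | .split f _ l r => f :: (l.splitFeatures ++ r.splitFeatures)

-- Per-feature split counts for feature indices 0, …, nFeatures - 1.
def splitCountImportance {α : Type} (nFeatures : Nat) (t : TreeNode α) :
    List Nat :=
  (List.range nFeatures).map fun f => t.splitFeatures.count f
```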
Coefficient of determination (R^2) for regression.
This uses the standard formula 1 - ss_res / ss_tot, written as (ss_tot - ss_res) / ss_tot
to avoid an explicit 1 - ... when working in an abstract scalar context.
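Sketched over `Float` lists, reusing the `mean` helper from the regression sketch above:

```lean
-- R^2 = (ss_tot - ss_res) / ss_tot, the form described above.
def r2 (preds targets : List Float) : Float :=
  let m := mean targets
  let ssRes := (targets.zip preds).foldl
    (fun acc (y, p) => acc + (y - p) * (y - p)) 0
  let ssTot := targets.foldl (fun acc y => acc + (y - m) * (y - m)) 0
  (ssTot - ssRes) / ssTot
```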
Mean absolute error (MAE) for regression.
Root mean squared error (RMSE) for regression.
Loss-margin early-stopping predicate for gradient boosting.
This compares a training loss and validation loss with a margin min_delta.
The caller is responsible for tracking the patience counter; this predicate only checks one
train/validation loss pair.
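One plausible reading of such a predicate, as a `Float` sketch; the comparison direction here is our assumption, not the spec's:

```lean
-- Hypothetical: stop when validation loss exceeds training loss by more
-- than min_delta.
def earlyStopSignal (trainLoss valLoss minDelta : Float) : Bool :=
  valLoss > trainLoss + minDelta
```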
Adjust the ensemble learning rate (shrinkage) while keeping the same trees.
Deterministic prefix selection used as a proof-friendly stand-in for stochastic subsampling.
Real stochastic GBDT implementations sample rows using randomness. This helper instead takes the
first newBatch rows and uses h_new_batch to make that access total, so it is deterministic and
does not silently pad with zeros.
XGBoost-style squared-error proxy with an L2-shaped scalar penalty.
This objective is a typed loss for an already-materialized ensemble. It is not a full XGBoost split-gain objective; tree-builder policies such as histogram binning and split search are represented elsewhere by the tree-fitting routines.
LightGBM-style squared-error proxy with L1/L2-shaped scalar penalties.
This objective is deterministic because it operates on a fixed ensemble and batch. It records the loss shape used by examples rather than the full LightGBM histogram/split objective.