TorchLean API

NN.Spec.Models.LinearRegression

Linear regression (spec model)

Defines linear regression as a dot product plus bias (one output):

y = wᵀ x + b

We aim to stay close to PyTorch's mental model.

This file is a spec: it states the math (forward + VJPs) with shapes tracked by the type system. It prioritizes clarity and explicit derivatives over performance, and it does not include the closed-form normal-equations solution.

structure Spec.LinearRegressionSpec (α : Type) (inDim : ℕ) : Type

Parameters for a single-output linear regression model.

PyTorch analogy: the weights and bias fields correspond to nn.Linear(inDim, 1).weight and nn.Linear(inDim, 1).bias, but with shapes tracked in the tensor type.
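
As a sketch (field names and shapes inferred from the analogy above; the verbatim source may differ slightly):

  structure LinearRegressionSpec (α : Type) (inDim : ℕ) where
    weights : Tensor α (Shape.dim inDim Shape.scalar)  -- cf. nn.Linear(inDim, 1).weight
    bias    : Tensor α Shape.scalar                    -- cf. nn.Linear(inDim, 1).bias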

def Spec.linearRegressionForwardSpec {α : Type} [Context α] {inDim : ℕ} (model : LinearRegressionSpec α inDim) (input : Tensor α (Shape.dim inDim Shape.scalar)) :
Tensor α Shape.scalar

Forward pass for linear regression: y = wᵀ x + b.
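
A type-level usage sketch (the return type is the scalar prediction, as in the signature above):

  example {α : Type} [Context α]
      (model : Spec.LinearRegressionSpec α 3)
      (x : Tensor α (Shape.dim 3 Shape.scalar)) :
      Tensor α Shape.scalar :=
    Spec.linearRegressionForwardSpec model x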

def Spec.linearRegressionBatchedForwardSpec {α : Type} [Context α] {batch inDim : ℕ} (model : LinearRegressionSpec α inDim) (input : Tensor α (Shape.dim batch (Shape.dim inDim Shape.scalar))) :
Tensor α (Shape.dim batch Shape.scalar)

Batched forward pass, applied independently to each input row.
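
For example, a batch of 8 three-dimensional inputs yields 8 scalar predictions (a sketch under the same assumptions as above):

  example {α : Type} [Context α]
      (model : Spec.LinearRegressionSpec α 3)
      (xs : Tensor α (Shape.dim 8 (Shape.dim 3 Shape.scalar))) :
      Tensor α (Shape.dim 8 Shape.scalar) :=
    Spec.linearRegressionBatchedForwardSpec model xs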

def Spec.linearRegressionWeightsDerivSpec {α : Type} [Context α] {inDim : ℕ} (input : Tensor α (Shape.dim inDim Shape.scalar)) (grad_output : Tensor α Shape.scalar) :
Tensor α (Shape.dim inDim Shape.scalar)

VJP contribution for weights: dL/dw = x * (dL/dy) (scalar-times-vector scaling).

def Spec.linearRegressionBiasDerivSpec {α : Type} {inDim : ℕ} (_weights : Tensor α (Shape.dim inDim Shape.scalar)) (grad_output : Tensor α Shape.scalar) (_input : Tensor α (Shape.dim inDim Shape.scalar)) :
Tensor α Shape.scalar

VJP contribution for bias: dL/db = dL/dy.

def Spec.linearRegressionInputDerivSpec {α : Type} [Context α] {inDim : ℕ} (weights : Tensor α (Shape.dim inDim Shape.scalar)) (grad_output : Tensor α Shape.scalar) :
Tensor α (Shape.dim inDim Shape.scalar)

VJP contribution for input: dL/dx = w * (dL/dy).
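
All three VJP contributions follow from differentiating y = Σᵢ wᵢ xᵢ + b:

  ∂y/∂wᵢ = xᵢ, so dL/dwᵢ = xᵢ · (dL/dy)
  ∂y/∂b = 1, so dL/db = dL/dy
  ∂y/∂xᵢ = wᵢ, so dL/dxᵢ = wᵢ · (dL/dy)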

def Spec.linearRegressionBackwardSpec {α : Type} [Context α] {inDim : ℕ} (model : LinearRegressionSpec α inDim) (input : Tensor α (Shape.dim inDim Shape.scalar)) (grad_output : Tensor α Shape.scalar) :
Tensor α (Shape.dim inDim Shape.scalar) × Tensor α Shape.scalar × Tensor α (Shape.dim inDim Shape.scalar)

Full backward pass returning (dW, db, dX) in that order.

def Spec.linearRegressionBatchedBackwardSpec {α : Type} [Context α] {batch inDim : ℕ} (model : LinearRegressionSpec α inDim) (input : Tensor α (Shape.dim batch (Shape.dim inDim Shape.scalar))) (grad_output : Tensor α (Shape.dim batch Shape.scalar)) (h : batch ≠ 0) :
Tensor α (Shape.dim inDim Shape.scalar) × Tensor α Shape.scalar × Tensor α (Shape.dim batch (Shape.dim inDim Shape.scalar))

Batched backward pass.

This aggregates parameter gradients across the batch (a sum over the batch dimension). When grad_output carries the appropriate scaling (e.g. the 1/batch factor of a "mean" reduction), this matches PyTorch's default gradient accumulation.

def Spec.mseLossSpec {α : Type} [Context α] {batch inDim : ℕ} (model : LinearRegressionSpec α inDim) (input : Tensor α (Shape.dim batch (Shape.dim inDim Shape.scalar))) (target : Tensor α (Shape.dim batch Shape.scalar)) (h : batch ≠ 0) :
Tensor α Shape.scalar

Mean Squared Error loss (MSE).

PyTorch analogy: F.mse_loss(predictions, target, reduction="mean").

Note: the batch ≠ 0 hypothesis avoids dividing by zero.
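
For instance, with predictions y = [1, 2] and targets t = [0, 4]:

  MSE = ((1 - 0)² + (2 - 4)²) / 2 = (1 + 4) / 2 = 2.5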

def Spec.mseLossGradSpec {α : Type} [Context α] {batch : ℕ} (predictions target : Tensor α (Shape.dim batch Shape.scalar)) :
Tensor α (Shape.dim batch Shape.scalar)

Gradient of MSE w.r.t. predictions: d/dy (mean (y - t)^2) = (2/batch) * (y - t).

This is only meaningful when batch > 0 (callers typically already carry batch ≠ 0).
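
The derivation is one line: with L = (1/batch) · Σⱼ (yⱼ - tⱼ)², only the j = i term depends on yᵢ, so ∂L/∂yᵢ = (2/batch) · (yᵢ - tᵢ).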

def Spec.linearRegressionTrainStepSpec {α : Type} [Context α] {batch inDim : ℕ} (model : LinearRegressionSpec α inDim) (input : Tensor α (Shape.dim batch (Shape.dim inDim Shape.scalar))) (target : Tensor α (Shape.dim batch Shape.scalar)) (learning_rate : α) (h : batch ≠ 0) :
LinearRegressionSpec α inDim

One gradient-descent training step for linear regression.
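
Presumably the standard update, using the batched backward gradients and the given learning_rate:

  w ← w - learning_rate · dW
  b ← b - learning_rate · db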


OpSpec wrapper for linear regression.

This is useful when composing the op in a spec-level AD development.

def Spec.rSquaredSpec {α : Type} [Context α] {batch inDim : ℕ} (model : LinearRegressionSpec α inDim) (input : Tensor α (Shape.dim batch (Shape.dim inDim Shape.scalar))) (target : Tensor α (Shape.dim batch Shape.scalar)) (h : batch ≠ 0) :
Tensor α Shape.scalar

R-squared (coefficient of determination) for model evaluation.

PyTorch analogy: there is no single built-in for R² in core PyTorch; this matches the standard definition 1 - SS_res / SS_tot.

Note: if SS_tot = 0 (targets are constant), this divides by zero. Many libraries treat that as a special case; this spec keeps the plain formula.
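
Worked example: with target = [1, 2, 3] (mean 2) and predictions = [1, 2, 2]:

  SS_res = 0² + 0² + 1² = 1
  SS_tot = (1 - 2)² + (2 - 2)² + (3 - 2)² = 2
  R² = 1 - 1/2 = 0.5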

def Spec.ridgeRegressionForwardSpec {α : Type} [Context α] {inDim : ℕ} (model : LinearRegressionSpec α inDim) (input : Tensor α (Shape.dim inDim Shape.scalar)) (_lambda : α) :
Tensor α Shape.scalar

Ridge regression forward pass.

Regularization changes the objective, not the raw prediction function, so the forward pass is identical to ordinary linear regression.

Reference: Hoerl and Kennard, "Ridge Regression: Biased Estimation for Nonorthogonal Problems" (1970). https://doi.org/10.1080/00401706.1970.10488634

def Spec.ridgeLossSpec {α : Type} [Context α] {batch inDim : ℕ} (model : LinearRegressionSpec α inDim) (input : Tensor α (Shape.dim batch (Shape.dim inDim Shape.scalar))) (target : Tensor α (Shape.dim batch Shape.scalar)) (lambda : α) (h : batch ≠ 0) :
Tensor α Shape.scalar

Ridge loss: MSE plus lambda * ||w||_2^2.

def Spec.ridgeWeightsDerivSpec {α : Type} [Context α] {batch inDim : ℕ} (model : LinearRegressionSpec α inDim) (input : Tensor α (Shape.dim batch (Shape.dim inDim Shape.scalar))) (grad_output : Tensor α (Shape.dim batch Shape.scalar)) (lambda : α) (h : batch ≠ 0) :
Tensor α (Shape.dim inDim Shape.scalar)

Ridge gradient w.r.t. weights.

This is the usual batched gradient plus the derivative of lambda * ||w||_2^2, which contributes 2 * lambda * w.

def Spec.lassoSoftThresholdSpec {α : Type} [Context α] {inDim : ℕ} (weights : Tensor α (Shape.dim inDim Shape.scalar)) (threshold : α) :
Tensor α (Shape.dim inDim Shape.scalar)

Soft-thresholding operator (often written S_λ), used in proximal-gradient updates for L1.

Reference: Tibshirani, "Regression Shrinkage and Selection via the Lasso" (1996). https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
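
The standard coordinate-wise definition (which this spec presumably follows) is

  S_λ(w)ᵢ = sign(wᵢ) · max(|wᵢ| - λ, 0)

so with threshold λ = 1, the weights [2, -0.5, -3] map to [1, 0, -2].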

def Spec.lassoRegressionForwardSpec {α : Type} [Context α] {inDim : ℕ} (model : LinearRegressionSpec α inDim) (input : Tensor α (Shape.dim inDim Shape.scalar)) (_lambda : α) :
Tensor α Shape.scalar

Lasso forward pass (same raw prediction function as ordinary linear regression).

def Spec.lassoLossSpec {α : Type} [Context α] {batch inDim : ℕ} (model : LinearRegressionSpec α inDim) (input : Tensor α (Shape.dim batch (Shape.dim inDim Shape.scalar))) (target : Tensor α (Shape.dim batch Shape.scalar)) (lambda : α) (h : batch ≠ 0) :
Tensor α Shape.scalar

Lasso loss: MSE plus lambda * ||w||_1.

def Spec.elasticNetLossSpec {α : Type} [Context α] {batch inDim : ℕ} (model : LinearRegressionSpec α inDim) (input : Tensor α (Shape.dim batch (Shape.dim inDim Shape.scalar))) (target : Tensor α (Shape.dim batch Shape.scalar)) (l1_ratio alpha : α) (h : batch ≠ 0) :
Tensor α Shape.scalar

Elastic net loss: a convex combination of L1 and L2 penalties.

Reference: Zou and Hastie, "Regularization and Variable Selection via the Elastic Net" (2005). https://doi.org/10.1111/j.1467-9868.2005.00503.x
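
One common parameterization of the penalty with these two knobs (scikit-learn uses a similar scheme; this spec's exact scaling may differ) is

  alpha · (l1_ratio · ||w||_1 + (1 - l1_ratio) · ||w||_2^2)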


Polynomial features

Polynomial regression can be expressed as linear regression on a fixed feature expansion φ(x) = [x, x^2, ..., x^degree] (per input coordinate). We keep this as a lightweight helper, then reuse linearRegressionForwardSpec on the expanded input.

def Spec.polynomialFeaturesSpec {α : Type} [Context α] {inDim : ℕ} (degree : ℕ) (input : Tensor α (Shape.dim inDim Shape.scalar)) :
Tensor α (Shape.dim (inDim * degree) Shape.scalar)

Expand a length-inDim input vector into polynomial features up to degree.

This expansion does not include a constant feature (the model bias already plays that role).
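
For example, with inDim = 2 and degree = 3 (assuming features are grouped per coordinate):

  φ([x₁, x₂]) = [x₁, x₁², x₁³, x₂, x₂², x₂³]

which has length inDim * degree = 6, matching the output shape.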

def Spec.polynomialFeaturesSpec.expand {α : Type} [Context α] {inDim : ℕ} (degree : ℕ) (values : Fin inDim → Tensor α Shape.scalar) (remaining : ℕ) (acc : List α) :
List α

Recursion helper for polynomialFeaturesSpec (accumulator-style).
def Spec.polynomialRegressionForwardSpec {α : Type} [Context α] {inDim degree : ℕ} (model : LinearRegressionSpec α (inDim * degree)) (input : Tensor α (Shape.dim inDim Shape.scalar)) :
Tensor α Shape.scalar

Forward pass for polynomial regression: expand features, then run linear regression.
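
A type-level sketch under the same assumptions as the earlier examples: a degree-3 expansion of a 2-dimensional input feeds a linear model over 2 * 3 = 6 features.

  example {α : Type} [Context α]
      (model : Spec.LinearRegressionSpec α (2 * 3))
      (x : Tensor α (Shape.dim 2 Shape.scalar)) :
      Tensor α Shape.scalar :=
    Spec.polynomialRegressionForwardSpec model x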
