TorchLean API

NN.Spec.Models.Knn

k‑Nearest Neighbors (kNN) (spec model) #

This file provides a small kNN classifier/regressor baseline.

We include deterministic tie‑breaking so results are stable (useful for regression tests and formal reasoning).

Model container #

structure Spec.KNN (α β : Type) (n : ℕ) :

A small kNN model container (parameters + stored dataset).

This is a lazy model: inference consults the stored dataset at query time, rather than learning weights.


Neighbor selection #

The key technical detail here is deterministic tie-breaking: when two points are at exactly the same distance, we prefer the earlier point in the dataset. This makes evaluation stable and keeps formal reasoning about the classifier simpler.
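As a minimal, self-contained sketch of this selection rule (using plain `Float` distances and list indices rather than the library's `Tensor` type, with hypothetical helper names), candidates can be ordered lexicographically by `(distance, dataset index)`, so an exact distance tie always resolves to the earlier index:

```lean
-- Order candidates by distance first, dataset index second.
def leCand (a b : Float × Nat) : Bool :=
  if a.1 < b.1 then true
  else if b.1 < a.1 then false
  else a.2 ≤ b.2

def insertSorted (x : Float × Nat) : List (Float × Nat) → List (Float × Nat)
  | [] => [x]
  | y :: ys => if leCand x y then x :: y :: ys else y :: insertSorted x ys

/-- Indices of the `k` smallest distances, ties broken by dataset order. -/
def kNearestIdx (k : Nat) (dists : List Float) : List Nat :=
  let sorted := dists.enum.foldl (fun acc (i, d) => insertSorted (d, i) acc) []
  (sorted.take k).map (·.2)

#eval kNearestIdx 2 [1.0, 0.5, 1.0]  -- [1, 0]: index 0 beats index 2 on the tie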

def Spec.findKNearest (α β : Type) (n : ℕ) [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] (knn : KNN α β n) (input : Tensor α (Shape.dim n Shape.scalar)) :

Find the k nearest neighbors under Euclidean distance.

PyTorch/sklearn analogy: Euclidean L2 distance is the default for many baseline kNN examples.

def Spec.findKNearestWithDistance (α β : Type) (n : ℕ) [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] (distanceFn : Tensor α (Shape.dim n Shape.scalar) → Tensor α (Shape.dim n Shape.scalar) → α) (knn : KNN α β n) (input : Tensor α (Shape.dim n Shape.scalar)) :

Find the k nearest neighbors under a user-provided distance function.

Notes:

• The distance value is only used for ranking neighbors. It does not need to satisfy metric axioms, but it should be consistent with "smaller means closer".
• We keep deterministic tie-breaking via dataset order.
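For illustration, here is what such distance functions might look like in a simplified setting where points are plain `List Float` (both function names are hypothetical, not part of this API):

```lean
-- Manhattan (L1) distance: a valid non-Euclidean choice for ranking.
def l1Dist (x y : List Float) : Float :=
  (x.zip y).foldl (fun acc (a, b) => acc + (a - b).abs) 0.0

-- Squared Euclidean distance ranks identically to Euclidean distance,
-- so it is a valid (and cheaper) ranking function even though it is not a metric.
def l2DistSq (x y : List Float) : Float :=
  (x.zip y).foldl (fun acc (a, b) => acc + (a - b) * (a - b)) 0.0

#eval l1Dist [0.0, 0.0] [3.0, 4.0]    -- 7.000000
#eval l2DistSq [0.0, 0.0] [3.0, 4.0]  -- 25.000000
```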

Classification #

def Spec.classify (α β : Type) (n : ℕ) [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] [BEq β] [Hashable β] [Inhabited β] (knn : KNN α β n) (input : Tensor α (Shape.dim n Shape.scalar)) : β

Majority vote among the neighbors.

Tie-breaking: if multiple labels have the same maximum count, we prefer the label of the closest neighbor. This is deterministic and matches common "stable mode" implementations.
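A simplified sketch of this voting rule (hypothetical helper names; labels are assumed to arrive sorted closest-first, so requiring a strictly greater count to displace the current winner makes the closest tied label win):

```lean
def countLabel [BEq β] (l : β) (labels : List β) : Nat :=
  labels.foldl (fun n x => if x == l then n + 1 else n) 0

def majorityVote [BEq β] [Inhabited β] (labels : List β) : β :=
  -- Scan closest-first; a later label only wins with strictly more votes.
  let best := labels.foldl
    (fun (best : Option (β × Nat)) l =>
      let c := countLabel l labels
      match best with
      | none => some (l, c)
      | some (_, bc) => if c > bc then some (l, c) else best)
    none
  match best with
  | some (l, _) => l
  | none => default

#eval majorityVote ["cat", "dog", "dog", "cat"]  -- "cat": a 2-2 tie, closest wins
```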

def Spec.classifyRBMap (α β : Type) (n : ℕ) [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] [Ord β] [Inhabited β] (knn : KNN α β n) (input : Tensor α (Shape.dim n Shape.scalar)) :

Classification using an RBMap for label counts.

This variant avoids hashing, at the cost of requiring an Ord β instance.
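A sketch of the counting step using `Lean.RBMap` from core Lean (the `countVotes` helper is hypothetical); note that only an ordering on the labels is needed, not `Hashable`:

```lean
import Lean.Data.RBMap
open Lean

-- Tally labels in an order-based map keyed by `compare`.
def countVotes (labels : List String) : RBMap String Nat compare :=
  labels.foldl (fun m l => m.insert l (m.findD l 0 + 1))
    (RBMap.empty : RBMap String Nat compare)

#eval (countVotes ["a", "b", "a"]).toList  -- [("a", 2), ("b", 1)]
```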


Regression #

def Spec.predict (α : Type) (n : ℕ) [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] (knn : KNN α α n) (input : Tensor α (Shape.dim n Shape.scalar)) : α

Unweighted kNN regression: average of the neighbor targets.
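The averaging step, sketched over plain `Float` targets (the `mean` helper is hypothetical):

```lean
-- Unweighted prediction is just the mean of the k neighbor targets.
def mean (ys : List Float) : Float :=
  if ys.isEmpty then 0.0 else ys.foldl (· + ·) 0.0 / ys.length.toFloat

#eval mean [1.0, 2.0, 6.0]  -- 3.000000
```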

def Spec.predictWeighted (α : Type) (n : ℕ) [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] (knn : KNN α α n) (input : Tensor α (Shape.dim n Shape.scalar)) : α

Weighted kNN regression using inverse-distance weights.

We use weights w_i = 1 / d(x, x_i). If a neighbor is exactly at distance 0, we give it a large weight so it dominates the average.

This is a common heuristic and matches what many "classic ML" baselines do in practice.
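A sketch of this weighting scheme over `(distance, target)` pairs; `bigW` is a hypothetical stand-in for whatever large weight the library assigns at distance 0:

```lean
def predictWeightedSketch (pairs : List (Float × Float)) : Float :=
  let bigW : Float := 1e12
  let w := fun d => if d == 0.0 then bigW else 1.0 / d
  let num := pairs.foldl (fun s (d, y) => s + w d * y) 0.0
  let den := pairs.foldl (fun s (d, _) => s + w d) 0.0
  if den == 0.0 then 0.0 else num / den

-- A neighbor at distance 1 with target 10 and one at distance 2 with target 4:
-- weights 1 and 0.5, so the prediction is (10 + 2) / 1.5 = 8.
#eval predictWeightedSketch [(1.0, 10.0), (2.0, 4.0)]  -- 8.000000
```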


Helpers #

def Spec.KNN.fromData (α β : Type) (n k : ℕ) (data : List (Tensor α (Shape.dim n Shape.scalar) × β)) : KNN α β n

Constructor helper (explicit arguments keep elaboration simple in demos).

def Spec.batchPredict (α : Type) (n : ℕ) [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] (knn : KNN α α n) (inputs : List (Tensor α (Shape.dim n Shape.scalar))) : List α

Batch regression: map predict over a list of inputs.

def Spec.batchClassify (α β : Type) (n : ℕ) [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] [Hashable β] [Inhabited β] [BEq β] (knn : KNN α β n) (inputs : List (Tensor α (Shape.dim n Shape.scalar))) : List β

Batch classification: map classify over a list of inputs.

def Spec.classifyWithDistance (α β : Type) (n : ℕ) [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] [BEq β] [Hashable β] [Inhabited β] (distanceFn : Tensor α (Shape.dim n Shape.scalar) → Tensor α (Shape.dim n Shape.scalar) → α) (knn : KNN α β n) (input : Tensor α (Shape.dim n Shape.scalar)) : β

kNN classification using an explicit distance function (for non-Euclidean metrics).

def Spec.classifyWithConfidence (α β : Type) (n : ℕ) [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] [BEq β] [Hashable β] [Inhabited β] (knn : KNN α β n) (input : Tensor α (Shape.dim n Shape.scalar)) : β × α

kNN classification plus a simple confidence score (maxVotes / k).

This returns the winning label together with a scalar in [0, 1] that measures what fraction of the neighbors agreed with the winner. This is a heuristic, but it is useful for demos and baselines.
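The confidence computation itself is just a ratio, sketched here with a hypothetical helper:

```lean
-- Fraction of the k neighbors that voted for the winning label.
def confidence (winnerVotes k : Nat) : Float :=
  if k == 0 then 0.0 else winnerVotes.toFloat / k.toFloat

#eval confidence 4 5  -- 0.800000
```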
