TorchLean API

NN.Runtime.Autograd.Compiled.GraphM.Elementwise

GraphM Elementwise And Scalar Ops #

Arithmetic, activations, scalar reductions, and MSE loss builders for proof-compiled graphs.

JVP vs VJP in this module

Each compiled node stores both:

The .compiled runtime path is primarily exercised via reverse-mode (VJP) and compilation to the eager tape. Basic elementwise/bilinear ops provide real JVP rules, shape-structural ops (for example slice/concat) apply the same transformation to the tangent, and heavier ops should expose named spec-layer JVP helpers before being wired here. Reverse-only ops it must be listed in reverseOnlyJvpOps and call unsupportedJvp rather than returning a silent zero tangent.

Forward-mode coverage is expanded by adding concrete jvp rules next to the corresponding forward and vjp definitions.

def Runtime.Autograd.Compiled.GraphM.add {α Δ : Type} [Add α] [Zero α] [DecidableEq Spec.Shape] {Γ : List Spec.Shape} {s : Spec.Shape} (a b : Var s) :
MWith α Δ Γ (Var s)

Elementwise addition node (y = a + b).

PyTorch comparison: torch.add(a, b).

Instances For
    def Runtime.Autograd.Compiled.GraphM.sub {α Δ : Type} [Sub α] [Add α] [Zero α] [DecidableEq Spec.Shape] {Γ : List Spec.Shape} {s : Spec.Shape} (a b : Var s) :
    MWith α Δ Γ (Var s)

    Elementwise subtraction node (y = a - b).

    PyTorch comparison: torch.sub(a, b).

    Instances For
      def Runtime.Autograd.Compiled.GraphM.mul {α Δ : Type} [Mul α] [Add α] [Zero α] [DecidableEq Spec.Shape] {Γ : List Spec.Shape} {s : Spec.Shape} (a b : Var s) :
      MWith α Δ Γ (Var s)

      Elementwise multiplication node (y = a ⊙ b).

      PyTorch comparison: torch.mul(a, b).

      Instances For
        def Runtime.Autograd.Compiled.GraphM.square {α Δ : Type} [Mul α] [Add α] [Zero α] [DecidableEq Spec.Shape] {Γ : List Spec.Shape} {s : Spec.Shape} (x : Var s) :
        MWith α Δ Γ (Var s)

        Square x ↦ x ⊙ x.

        Instances For
          def Runtime.Autograd.Compiled.GraphM.scale {α Δ : Type} [Mul α] [Add α] [Zero α] [DecidableEq Spec.Shape] {Γ : List Spec.Shape} {s : Spec.Shape} (x : Var s) (c : α) :
          MWith α Δ Γ (Var s)

          Scale a tensor by a scalar constant c (y = c * x).

          PyTorch comparison: c * x / torch.mul(x, c).

          Instances For
            def Runtime.Autograd.Compiled.GraphM.abs {α : Type} [Context α] [DecidableEq Spec.Shape] [DecidableRel fun (x1 x2 : α) => x1 > x2] {Δ : Type} {Γ : List Spec.Shape} {s : Spec.Shape} (x : Var s) :
            MWith α Δ Γ (Var s)

            Elementwise absolute value.

            PyTorch comparison: torch.abs(x).

            Instances For
              def Runtime.Autograd.Compiled.GraphM.sqrt {α : Type} [Context α] [DecidableEq Spec.Shape] [DecidableRel fun (x1 x2 : α) => x1 > x2] {Δ : Type} {Γ : List Spec.Shape} {s : Spec.Shape} (x : Var s) :
              MWith α Δ Γ (Var s)

              Elementwise square root.

              PyTorch comparison: torch.sqrt(x).

              Instances For
                def Runtime.Autograd.Compiled.GraphM.clamp {α : Type} [Context α] [DecidableEq Spec.Shape] [DecidableRel fun (x1 x2 : α) => x1 > x2] {Δ : Type} {Γ : List Spec.Shape} {s : Spec.Shape} (x : Var s) (minVal maxVal : α) :
                MWith α Δ Γ (Var s)

                Elementwise clamp to [minVal, maxVal].

                PyTorch comparison: torch.clamp(x, min=minVal, max=maxVal).

                Instances For
                  def Runtime.Autograd.Compiled.GraphM.max {α : Type} [Context α] [DecidableEq Spec.Shape] [DecidableRel fun (x1 x2 : α) => x1 > x2] {Δ : Type} {Γ : List Spec.Shape} {s : Spec.Shape} (a b : Var s) :
                  MWith α Δ Γ (Var s)

                  Elementwise maximum.

                  At ties we split the gradient equally (0.5 / 0.5), matching the tie-handling documented in the eager tape (NN.Runtime.Autograd.Engine.Core).

                  PyTorch comparison: torch.maximum(a, b).

                  Instances For
                    def Runtime.Autograd.Compiled.GraphM.min {α : Type} [Context α] [DecidableEq Spec.Shape] [DecidableRel fun (x1 x2 : α) => x1 > x2] {Δ : Type} {Γ : List Spec.Shape} {s : Spec.Shape} (a b : Var s) :
                    MWith α Δ Γ (Var s)

                    Elementwise minimum.

                    At ties we split the gradient equally (0.5 / 0.5).

                    PyTorch comparison: torch.minimum(a, b).

                    Instances For
                      def Runtime.Autograd.Compiled.GraphM.relu {α : Type} [Mul α] [Add α] [Zero α] [Max α] [One α] [LT α] [DecidableRel fun (x1 x2 : α) => x1 > x2] [DecidableEq Spec.Shape] {Δ : Type} {Γ : List Spec.Shape} {s : Spec.Shape} (x : Var s) :
                      MWith α Δ Γ (Var s)

                      Elementwise ReLU.

                      PyTorch comparison: torch.nn.functional.relu(x).

                      Instances For

                        Elementwise sigmoid. PyTorch comparison: torch.sigmoid(x).

                        Instances For

                          Elementwise tanh. PyTorch comparison: torch.tanh(x).

                          Instances For

                            Softmax along the last axis (recursing over outer dimensions).

                            PyTorch comparison: torch.softmax(x, dim=-1).

                            Instances For

                              Stable log-softmax along the last axis.

                              This is a primitive in the compiled graph, not the composition logsoftmax, so proof/IR execution and eager CUDA share the same PyTorch-style numerical contract.

                              Instances For

                                Elementwise softplus. PyTorch comparison: torch.nn.functional.softplus(x).

                                Instances For

                                  Elementwise exponential. PyTorch comparison: torch.exp(x).

                                  Instances For

                                    Elementwise natural logarithm. PyTorch comparison: torch.log(x).

                                    Instances For

                                      Elementwise reciprocal x ↦ 1/x. PyTorch comparison: torch.reciprocal(x).

                                      Instances For
                                        def Runtime.Autograd.Compiled.GraphM.safeLog {α : Type} [Context α] [DecidableEq Spec.Shape] {Δ : Type} {Γ : List Spec.Shape} {s : Spec.Shape} (x : Var s) (ε : α := Numbers.epsilon) :
                                        MWith α Δ Γ (Var s)

                                        Elementwise numerically-stable log (uses an internal ε).

                                        PyTorch comparison: commonly written torch.log(x + eps).

                                        Instances For

                                          Reduce-sum over all entries, producing a scalar.

                                          PyTorch comparison: torch.sum(x).

                                          Instances For
                                            def Runtime.Autograd.Compiled.GraphM.mseLoss {α : Type} [Add α] [Sub α] [Mul α] [Div α] [Zero α] [One α] [Coe α] [DecidableEq Spec.Shape] {Δ : Type} {Γ : List Spec.Shape} {s : Spec.Shape} (yhat target : Var s) :

                                            Mean-squared error loss with "mean" reduction, producing a scalar.

                                            PyTorch comparison: torch.nn.functional.mse_loss(yhat, target, reduction=\"mean\").

                                            Instances For

                                              Affine layer y = W x + b in the compiled graph.

                                              PyTorch comparison: torch.nn.functional.linear / torch.nn.Linear.

                                              The JVP is the usual product rule: d(Wx+b) = dW*x + W*dx + db.

                                              Instances For

                                                Matrix multiplication ((m×n) @ (n×p) → (m×p)).

                                                PyTorch comparison: torch.matmul.

                                                The JVP is the bilinear product rule d(A @ B) = dA @ B + A @ dB.

                                                Instances For

                                                  Batched matrix multiplication (batch×m×n with batch×n×p).

                                                  PyTorch comparison: torch.bmm.

                                                  The JVP is the batched bilinear product rule d(A @ B) = dA @ B + A @ dB.

                                                  Instances For

                                                    Concatenate two vectors (dim-0 concat).

                                                    PyTorch comparison: torch.cat([a, b], dim=0) for 1D tensors.

                                                    Instances For
                                                      def Runtime.Autograd.Compiled.GraphM.concatDim0 {α Δ : Type} [Add α] [Zero α] [DecidableEq Spec.Shape] {Γ : List Spec.Shape} {n m : } {s : Spec.Shape} (a : Var (Spec.Shape.dim n s)) (b : Var (Spec.Shape.dim m s)) :
                                                      MWith α Δ Γ (Var (Spec.Shape.dim (n + m) s))

                                                      Concatenate along the leading dimension (dim=0) for tensors of shape .dim n s.

                                                      PyTorch comparison: torch.cat([a, b], dim=0).

                                                      Instances For
                                                        def Runtime.Autograd.Compiled.GraphM.sliceRange0 {α Δ : Type} [Zero α] [DecidableEq Spec.Shape] {Γ : List Spec.Shape} {n : } {s : Spec.Shape} (x : Var (Spec.Shape.dim n s)) (start len : ) (h : len + start n) :
                                                        MWith α Δ Γ (Var (Spec.Shape.dim len s))

                                                        Slice a contiguous range along dim=0.

                                                        PyTorch comparison: x[start : start+len] for tensors where the leading dimension is indexed.

                                                        Instances For