Loss functions (spec layer)
This file defines a small collection of common losses (and their gradients) in a way that is:
- shape-generic: a loss takes a Tensor α s and reduces it to a scalar α,
- explicit about reduction: most losses here are "mean over all elements",
- easy to line up with PyTorch terminology when you read training code.
In PyTorch you'll often see two layers:
- a low-level, elementwise loss (e.g. smooth_l1_loss / "Huber"),
- plus a reduction (mean or sum).
TorchLean's spec layer mirrors that idea: most definitions are written as an elementwise formula followed by a global mean over the shape.
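To make that pattern concrete, here is a minimal Lean sketch. It is illustrative only: a flat List Float stands in for Tensor α s, and the names (meanList, maeSketch) are made up rather than taken from TorchLean.

```lean
-- Illustration only: List Float stands in for Tensor α s.
def meanList (xs : List Float) : Float :=
  (xs.foldl (· + ·) 0.0) / Float.ofNat xs.length

-- "Elementwise formula followed by a global mean over the shape",
-- shown for mean absolute error.
def maeSketch (pred target : List Float) : Float :=
  meanList (List.zipWith (fun p t => Float.abs (p - t)) pred target)

#eval maeSketch [1.0, 2.0] [0.0, 4.0]  -- 1.5
```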
Cross-entropy loss configuration.
Poisson loss configuration.
Cosine similarity loss configuration.
Log-cosh loss configuration.
Mean of a scalar that conceptually came from a tensor with shape s.
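A hedged sketch of what such a helper could look like, assuming its job is simply to divide an accumulated scalar by the number of elements of s (numelSketch and meanOfShapeSketch are illustrative names, not the TorchLean definitions):

```lean
def numelSketch (s : List Nat) : Nat :=
  s.foldl (· * ·) 1

-- Assumed semantics: scale a summed scalar by 1 / numel(s).
def meanOfShapeSketch (s : List Nat) (total : Float) : Float :=
  total / Float.ofNat (numelSketch s)
```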
Derivative of mae_spec w.r.t. predicted (subgradient via sign).
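As a sketch (again over List Float, with made-up names), the subgradient is sign(pred - target) scaled by 1/n to account for the mean reduction:

```lean
def signF (x : Float) : Float :=
  if x > 0.0 then 1.0 else if x < 0.0 then -1.0 else 0.0

-- Subgradient of mean |pred - target| w.r.t. pred; the value at 0 is taken as 0.
def maeGradSketch (pred target : List Float) : List Float :=
  let n := Float.ofNat pred.length
  List.zipWith (fun p t => signF (p - t) / n) pred target
```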
Huber / SmoothL1 loss (PyTorch's smooth_l1_loss) with parameter delta.
Elementwise, for residual d = pred - target:
- if |d| < delta: 0.5 * d^2 / delta
- else: |d| - 0.5 * delta
Then we take a mean over all elements.
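A direct transcription of that piecewise formula, as an illustrative Lean sketch (huberSketch is not the actual definition):

```lean
def huberSketch (delta : Float) (pred target : List Float) : Float :=
  let perElem := List.zipWith
    (fun p t =>
      let d := p - t
      if Float.abs d < delta then 0.5 * d * d / delta
      else Float.abs d - 0.5 * delta)
    pred target
  (perElem.foldl (· + ·) 0.0) / Float.ofNat perElem.length
```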
Derivative of huber_spec w.r.t. predicted.
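Differentiating the piecewise formula elementwise gives d/delta in the quadratic region and sign(d) in the linear region, with an extra 1/n from the mean; a hedged sketch:

```lean
def huberGradSketch (delta : Float) (pred target : List Float) : List Float :=
  let n := Float.ofNat pred.length
  List.zipWith
    (fun p t =>
      let d := p - t
      let g :=
        if Float.abs d < delta then d / delta
        else if d > 0.0 then 1.0 else -1.0
      g / n)
    pred target
```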
Cross-entropy between distributions (probabilities).
This is closest to PyTorch when you already have probabilities q (e.g. after a softmax) and a
probability target p (e.g. one-hot or label-smoothed), and you want:
CE(p, q) = -mean_i p_i * log(q_i).
PyTorch's F.cross_entropy typically takes logits and does log_softmax + NLLLoss; that is a
different API surface than this "probabilities in, scalar out" spec.
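Sketch of the "probabilities in, scalar out" formula (illustrative names; the flat list plays the role of the whole tensor):

```lean
-- CE(p, q) = -mean_i p_i * log(q_i), with p = target, q = predicted.
def crossEntropySketch (target predicted : List Float) : Float :=
  let perElem := List.zipWith (fun p q => p * Float.log q) target predicted
  -((perElem.foldl (· + ·) 0.0) / Float.ofNat perElem.length)
```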
Derivative of cross_entropy_spec w.r.t. predicted.
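Elementwise, the derivative of -mean_i p_i * log(q_i) with respect to q_i is -p_i / (q_i * n); a sketch (no epsilon guard shown):

```lean
def crossEntropyGradSketch (target predicted : List Float) : List Float :=
  let n := Float.ofNat predicted.length
  List.zipWith (fun p q => -(p / (q * n))) target predicted
```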
Cross-entropy on logits (stable log-softmax form).
This matches the common PyTorch decomposition:
cross_entropy(logits, target) = -mean_i target_i * log_softmax(logits)_i.
Unlike crossEntropySpec, this takes logits and uses Activation.logSoftmaxSpec for
numerical stability.
Note: this spec assumes each last-axis target slice is a probability distribution (sums to 1),
as in one-hot or label-smoothed targets.
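A sketch of the stable decomposition (illustrative: it treats the flat list as a single last-axis slice, and logSoftmaxSketch is a stand-in for Activation.logSoftmaxSpec, not that definition itself):

```lean
-- Stable log-softmax: x_i - m - log (Σ_j exp (x_j - m)), with m = max_j x_j.
def logSoftmaxSketch (xs : List Float) : List Float :=
  let m := xs.foldl (fun a b => max a b) (xs.headD 0.0)
  let z := (xs.map (fun x => Float.exp (x - m))).foldl (· + ·) 0.0
  xs.map (fun x => x - m - Float.log z)

def crossEntropyLogitsSketch (target logits : List Float) : Float :=
  let ls := logSoftmaxSketch logits
  let perElem := List.zipWith (fun t l => t * l) target ls
  -((perElem.foldl (· + ·) 0.0) / Float.ofNat perElem.length)
```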
Hinge loss (binary margin loss), elementwise then mean-reduced:
hinge(x, y) = mean_i max(0, 1 - y_i * x_i).
This matches the usual SVM-style hinge loss. (PyTorch exposes similar behavior via margin-style
losses such as HingeEmbeddingLoss / MultiMarginLoss, but the exact signature differs.)
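The formula translates directly; an illustrative sketch over flat lists:

```lean
def hingeSketch (pred target : List Float) : Float :=
  let perElem := List.zipWith (fun x y => max 0.0 (1.0 - y * x)) pred target
  (perElem.foldl (· + ·) 0.0) / Float.ofNat perElem.length
```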
Derivative/subgradient of hinge_spec w.r.t. predicted.
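Elementwise, the subgradient is -y_i when the margin is violated (1 - y_i * x_i > 0) and 0 otherwise, again scaled by 1/n; sketch:

```lean
def hingeGradSketch (pred target : List Float) : List Float :=
  let n := Float.ofNat pred.length
  List.zipWith (fun x y => if 1.0 - y * x > 0.0 then -y / n else 0.0) pred target
```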
Poisson negative log-likelihood (log-input form), elementwise then mean-reduced:
If predicted represents log(rate) and target is a nonnegative count,
then (up to an additive constant that does not affect gradients):
loss_i = exp(pred_i) - target_i * pred_i.
This corresponds to PyTorch's PoissonNLLLoss(log_input=true, full=false) at the math level.
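Sketch of the log-input form (illustrative names; constant terms such as log(target!) are dropped, matching full=false):

```lean
-- loss_i = exp(pred_i) - target_i * pred_i, then mean over elements.
def poissonNllSketch (pred target : List Float) : Float :=
  let perElem := List.zipWith (fun p t => Float.exp p - t * p) pred target
  (perElem.foldl (· + ·) 0.0) / Float.ofNat perElem.length
```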
Cosine similarity loss: 1 - cos(predicted, target) (reduced-to-scalar).
Derivative of cosine_similarity_spec w.r.t. predicted.
If cos = (p·t)/(|p||t|) and loss = 1 - cos, then (for nonzero norms):
∂loss/∂p = (p·t) / (|p|^3 |t|) * p - 1/(|p||t|) * t.
We use epsilon to avoid division by zero (similar to common "eps" handling in PyTorch code).
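A sketch of that gradient over flat lists. The exact epsilon placement in the real spec is not shown here; this version simply adds eps to each norm before dividing, and all names are illustrative:

```lean
def dotSketch (a b : List Float) : Float :=
  (List.zipWith (· * ·) a b).foldl (· + ·) 0.0

-- ∂(1 - cos)/∂p, with an eps guard on the norms.
def cosineGradSketch (eps : Float) (p t : List Float) : List Float :=
  let np := Float.sqrt (dotSketch p p) + eps
  let nt := Float.sqrt (dotSketch t t) + eps
  let pt := dotSketch p t
  List.zipWith (fun pc tc => pt / (np * np * np * nt) * pc - tc / (np * nt)) p t
```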
Binary cross-entropy on scalars (probabilities), with clipping to avoid log(0).
This matches the core formula behind PyTorch's BCELoss when predicted is already a probability
(not a logit):
BCE(p, y) = - ( y*log(p) + (1-y)*log(1-p) ).
Assumption: target is in [0, 1]. We do not clip the target; we only clip predicted.
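Sketch on scalars, with an explicit eps clip of predicted into [eps, 1 - eps] (the clipping constant in the real spec may differ; names are illustrative):

```lean
def bceSketch (eps predicted target : Float) : Float :=
  let p := min (max predicted eps) (1.0 - eps)
  -(target * Float.log p + (1.0 - target) * Float.log (1.0 - p))
```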
Derivative of binary_cross_entropy_spec w.r.t. predicted.
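The derivative of BCE with respect to predicted, evaluated at the clipped probability, is (1-y)/(1-p) - y/p; an illustrative sketch:

```lean
def bceGradSketch (eps predicted target : Float) : Float :=
  let p := min (max predicted eps) (1.0 - eps)
  (1.0 - target) / (1.0 - p) - target / p
```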
Tensor BCE (probabilities), elementwise then mean-reduced.
Derivative of binary_cross_entropy_tensor_spec w.r.t. predicted.