TorchLean API

NN.Examples.BugZoo.IgnoredLabelLoss

BugZoo: ignored labels are a reduction contract

PyTorch issue #75181 reported CrossEntropyLoss(ignore_index=...) returning NaN when every target in the batch is ignored:

https://github.com/pytorch/pytorch/issues/75181

The formal lesson is not "TorchLean has PyTorch's full label-indexed loss kernel." It is simpler: ignored labels should be represented as an explicit contribution mask, and the empty-active-label reduction policy should be stated in the spec rather than left as backend behavior.

def NN.Examples.BugZoo.IgnoredLabelLoss.labelContribution {α : Type} [Zero α] (active : Bool) (loss : α) : α

A per-example loss contributes exactly when its label is active.

    @[simp]

    Ignored labels contribute no scalar loss.

    @[simp]

    Active labels contribute their ordinary scalar loss.
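The page does not show the body of labelContribution or the statements of its two simp lemmas. The following is a minimal sketch of one definition consistent with the descriptions above, not the TorchLean source. The namespace `IgnoredLabelSketch` and the lemma names are hypothetical, and the code assumes a Lean 4 toolchain (or Mathlib) in which `Zero α` supplies the literal `0`.

```lean
namespace IgnoredLabelSketch

/-- Assumed body: an active label keeps its loss, an ignored label contributes zero. -/
def labelContribution {α : Type} [Zero α] (active : Bool) (loss : α) : α :=
  match active with
  | true => loss
  | false => 0

/-- Hypothetical name: ignored labels contribute no scalar loss. -/
@[simp] theorem labelContribution_ignored {α : Type} [Zero α] (loss : α) :
    labelContribution false loss = 0 := rfl

/-- Hypothetical name: active labels contribute their ordinary scalar loss. -/
@[simp] theorem labelContribution_active {α : Type} [Zero α] (loss : α) :
    labelContribution true loss = loss := rfl

end IgnoredLabelSketch
```

With a definition of this shape, both lemmas hold by unfolding alone, which is what makes the mask an explicit part of the spec rather than a property of a kernel.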

    def NN.Examples.BugZoo.IgnoredLabelLoss.safeMaskedMean {α : Type} [Context α] (total activeCount : α) : α

    One explicit empty-reduction policy: divide by an epsilon-shifted active count.

    Real training code may choose a different policy, such as returning zero for an empty batch. The important thing is that the policy is named and checkable instead of hidden inside a backend loss kernel.
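To make the difference between policies concrete, here is a small sketch over Lean's built-in `Float`. It is an illustration only: `eps` is a hypothetical stand-in for TorchLean's `Numbers.epsilon`, its value is chosen arbitrarily, and none of these names come from the library.

```lean
namespace MaskedMeanSketch

/-- Hypothetical stand-in for `Numbers.epsilon`; the value is an assumption. -/
def eps : Float := 1e-8

/-- No stated policy: an all-ignored batch evaluates 0 / 0. -/
def naiveMaskedMean (total activeCount : Float) : Float :=
  total / activeCount

/-- Epsilon-shifted policy, mirroring the documented denominator. -/
def safeMaskedMeanF (total activeCount : Float) : Float :=
  total / (activeCount + eps)

/-- Alternative policy mentioned above: return zero for an empty batch. -/
def zeroOnEmptyMean (total activeCount : Float) : Float :=
  if activeCount == 0.0 then 0.0 else total / activeCount

-- All-ignored batch: total = 0 and activeCount = 0.
#eval (naiveMaskedMean 0.0 0.0).isNaN   -- true: IEEE 0 / 0 is NaN
#eval (safeMaskedMeanF 0.0 0.0).isNaN   -- false: the denominator is eps, not 0
#eval (zeroOnEmptyMean 0.0 0.0).isNaN   -- false: the empty case is handled explicitly

end MaskedMeanSketch
```

The two named policies agree on the all-ignored batch but differ slightly on non-empty batches, since the epsilon shift perturbs every denominator. That trade-off is the kind of detail a stated policy makes visible.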

      theorem NN.Examples.BugZoo.IgnoredLabelLoss.safeMaskedMean_uses_epsilon_denominator {α : Type} [Context α] (total activeCount : α) :
      safeMaskedMean total activeCount = total / (activeCount + Numbers.epsilon)

      The denominator policy for safeMaskedMean is visible in the definition.
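A statement of this kind is typically provable by unfolding the definition. The sketch below shows the pattern with a hypothetical class `EpsContext`, which stands in for TorchLean's `Context` and `Numbers.epsilon`. The real interface is not shown on this page and may differ.

```lean
namespace DenominatorSketch

/-- Hypothetical minimal interface: addition, division, and a named epsilon. -/
class EpsContext (α : Type) extends Add α, Div α where
  epsilon : α

/-- Assumed definition, matching the right-hand side of the documented theorem. -/
def safeMaskedMean {α : Type} [EpsContext α] (total activeCount : α) : α :=
  total / (activeCount + (EpsContext.epsilon : α))

/-- The policy is readable off the definition, so the proof is `rfl`. -/
theorem safeMaskedMean_uses_epsilon_denominator {α : Type} [EpsContext α]
    (total activeCount : α) :
    safeMaskedMean total activeCount
      = total / (activeCount + (EpsContext.epsilon : α)) := rfl

end DenominatorSketch
```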