TorchLean API

NN.Spec.Layers.Dropout

Dropout (deterministic spec)

Dropout is traditionally randomized: each element is kept with probability keep = 1 - p. In this repository we often want a deterministic spec that still documents the intended meaning, so downstream models can choose explicit inference-time or mask-driven dropout semantics.

We therefore expose two simple, deterministic variants:

- dropoutInferenceSpec: scale every element by keep = 1 - p (y = keep * x).
- dropoutMaskedSpec: keep or zero each element according to an explicit boolean mask, with inverted-dropout scaling x / keepSafe where keepSafe = max(1 - p, ε).

How this differs from PyTorch: nn.Dropout samples a fresh random mask in training mode and is the identity in eval() mode (inverted dropout). Neither variant here draws any randomness; the mask, when used, is an explicit argument, and the inference variant scales by keep instead of being the identity.

Gradients: each forward spec has a matching backward/VJP spec (dropoutInferenceBackwardSpec, dropoutMaskedBackwardSpec) that mirrors the forward masking and scaling.

def Spec.dropoutInferenceSpec {α : Type} [Context α] {s : Shape} (p : α) (x : Tensor α s) :
    Tensor α s

Deterministic "dropout-like" scaling: y = keep * x with keep = 1 - p.

This is not PyTorch's eval() behavior for nn.Dropout (which is the identity under inverted dropout). We keep this around because it is a simple deterministic knob that many specs use.
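
To make the intended arithmetic concrete, here is a minimal sketch of the same rule over Array Float. The name dropoutInferenceModel and the use of Array Float in place of the repository's Tensor α s are illustrative assumptions, not the actual definition.

-- Illustrative stand-in: Array Float in place of Tensor α s.
def dropoutInferenceModel (p : Float) (x : Array Float) : Array Float :=
  let keep := 1 - p
  x.map (fun v => keep * v)

-- With p = 0.25 every element is scaled by keep = 0.75.
#eval dropoutInferenceModel 0.25 #[1.0, 2.0, 4.0]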

def Spec.dropoutInferenceBackwardSpec {α : Type} [Context α] {s : Shape} (p : α) (grad_output : Tensor α s) :
    Tensor α s

Backward/VJP for dropoutInferenceSpec with respect to x: dL/dx = keep * dL/dy.
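
Under the same Array Float stand-in as above (names hypothetical), the backward rule is just the same constant scale applied to the incoming gradient:

-- Pullback of y = keep * x: scale the incoming gradient by keep.
def dropoutInferenceBackwardModel (p : Float) (gradOutput : Array Float) : Array Float :=
  gradOutput.map (fun g => (1 - p) * g)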

def Spec.dropoutMaskedSpec {α : Type} [Context α] {s : Shape} (p : α) (mask : Tensor Bool s) (x : Tensor α s) :
    Tensor α s

Deterministic training-style dropout with an explicit mask.

If mask[i] = true, keep element x[i]; otherwise drop it to 0. We use inverted-dropout scaling x / keepSafe with keepSafe = max(1 - p, ε).
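
As a sketch of the same rule over Array Float (dropoutMaskedModel and the explicit eps argument are illustrative assumptions; the real spec is stated over Tensor α s):

def dropoutMaskedModel (p eps : Float) (mask : Array Bool) (x : Array Float) : Array Float :=
  -- keepSafe plays the role of max(1 - p, ε): it guards against division by zero when p ≈ 1.
  let keepSafe := if 1 - p < eps then eps else 1 - p
  (mask.zip x).map (fun (m, v) => if m then v / keepSafe else 0)

Dividing the kept elements by keepSafe is the usual inverted-dropout rescaling, which keeps the expected magnitude of the activation unchanged and is what lets true inverted dropout be the identity at inference time.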

def Spec.dropoutMaskedBackwardSpec {α : Type} [Context α] {s : Shape} (p : α) (mask : Tensor Bool s) (grad_output : Tensor α s) :
    Tensor α s

Backward/VJP for dropoutMaskedSpec with respect to x.

This mirrors the forward: gradients are masked and (in the kept positions) scaled by 1/keepSafe.
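
Continuing the Array Float stand-in (hypothetical names), the backward sketch applies the same mask and 1/keepSafe factor to the incoming gradient:

def dropoutMaskedBackwardModel (p eps : Float) (mask : Array Bool) (gradOutput : Array Float) : Array Float :=
  let keepSafe := if 1 - p < eps then eps else 1 - p
  -- Dropped positions contribute zero gradient; kept positions are rescaled.
  (mask.zip gradOutput).map (fun (m, g) => if m then g / keepSafe else 0)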
