TorchLean API

NN.API.Models.SelfSupervised

Self-Supervised Model Constructors

Most SSL machinery belongs in NN.API.ssl: masks, tensor-to-training-sample transforms, and objective-facing helpers should work with any compatible model.

This file keeps architecture-level conveniences. The compact MAE constructor below is useful for examples, but the SSL idea itself is not tied to this model.

ViT-MAE

Configuration for a compact ViT-MAE image reconstructor.

The input/output contract is MAE-style:

  • input: a masked image tensor, N×C×H×W;
  • output: a flattened reconstruction vector, N×reconDim.

reconDim can be the full image size (C*H*W) or a prefix for faster experiments.
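The shape contract above can be sketched in self-contained Lean. This is a minimal illustration, not the library's definition: `SketchMaeConfig` and all of its field names except `inC` and `reconDim` are hypothetical stand-ins, and shapes are modelled as plain lists of dimensions rather than the library's typed shapes.

```lean
/-- Hypothetical stand-in for the configuration (assumption, not `VitMaeConfig`). -/
structure SketchMaeConfig where
  batch : Nat      -- N
  inC : Nat        -- C
  height : Nat     -- H
  width : Nat      -- W
  reconDim : Nat   -- length of the flattened reconstruction per sample
  deriving Repr

/-- Input shape N×C×H×W as a list of dimensions. -/
def SketchMaeConfig.inputShape (cfg : SketchMaeConfig) : List Nat :=
  [cfg.batch, cfg.inC, cfg.height, cfg.width]

/-- Output shape N×reconDim. -/
def SketchMaeConfig.outputShape (cfg : SketchMaeConfig) : List Nat :=
  [cfg.batch, cfg.reconDim]

/-- Size of a full-image reconstruction target, C*H*W. -/
def SketchMaeConfig.fullReconDim (cfg : SketchMaeConfig) : Nat :=
  cfg.inC * cfg.height * cfg.width

-- Reconstruct every pixel of a 3×32×32 image.
def fullCfg : SketchMaeConfig :=
  { batch := 2, inC := 3, height := 32, width := 32, reconDim := 3 * 32 * 32 }

-- Reconstruct only the first 256 flattened values for a faster experiment.
def prefixCfg : SketchMaeConfig := { fullCfg with reconDim := 256 }

#eval fullCfg.outputShape     -- [2, 3072]
#eval prefixCfg.outputShape   -- [2, 256]

example : fullCfg.reconDim = fullCfg.fullReconDim := rfl
example : prefixCfg.reconDim ≤ prefixCfg.fullReconDim := by decide
```

In this sketch a prefix configuration only changes the output width; the input shape is the same in both cases.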

    Convert a ViT-MAE configuration into the classifier-style ViT config used by the encoder.

      @[reducible, inline]

      Batched masked-image input shape for the ViT-MAE helper.

        @[reducible, inline]

        Batched reconstruction-vector output shape for the ViT-MAE helper.

          Number of patch tokens produced by the ViT-MAE patch embedding.

            Flattened encoded-token representation size before the MAE decoder head.

              def NN.API.nn.models.vitMaskedAutoencoder (cfg : VitMaeConfig) (h_inC : cfg.inC ≠ 0 := by decide) (h_patchH : cfg.patchH ≠ 0 := by decide) (h_patchW : cfg.patchW ≠ 0 := by decide) (h_seqLen : cfg.seqLen ≠ 0 := by decide) (h_dModel : cfg.dModel ≠ 0 := by decide) :

              Compact ViT-MAE image reconstructor.

              This is a real image/patch transformer path:

              1. patch embedding by strided convolution,
              2. tokenization to N×numPatches×dModel,
              3. one transformer encoder block,
              4. a linear pixel decoder from encoded patch tokens to a reconstruction vector.
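The four stages above can be traced as a sequence of shapes. The sketch below is an assumption-laden illustration rather than the library's code: `Dims`, `stageShapes`, and the field names are hypothetical, patches are assumed to be non-overlapping (stride equal to patch size), and the patch-embedding output is assumed to be channels-first. Under those assumptions the token count is (H / patchH) * (W / patchW), and the flattened encoded-token size per sample is numPatches * dModel.

```lean
/-- Shapes as plain dimension lists; a simplification of typed tensor shapes. -/
abbrev Shape := List Nat

/-- Hypothetical bundle of the dimensions involved (not `VitMaeConfig`). -/
structure Dims where
  n : Nat
  c : Nat
  h : Nat
  w : Nat
  patchH : Nat
  patchW : Nat
  dModel : Nat
  reconDim : Nat

/-- Assumed shape after each stage of the reconstructor. -/
def stageShapes (d : Dims) : List (String × Shape) :=
  let gh := d.h / d.patchH          -- patch rows
  let gw := d.w / d.patchW          -- patch columns
  let numPatches := gh * gw
  [ ("input", [d.n, d.c, d.h, d.w]),
    ("patch embedding", [d.n, d.dModel, gh, gw]),
    ("tokens", [d.n, numPatches, d.dModel]),
    ("encoder block", [d.n, numPatches, d.dModel]),
    ("pixel decoder", [d.n, d.reconDim]) ]

def demo : Dims :=
  { n := 2, c := 3, h := 32, w := 32, patchH := 8, patchW := 8,
    dModel := 64, reconDim := 3072 }

#eval stageShapes demo
```

With these example values the token stage has 16 patches of width 64, so the decoder head maps 16 * 64 = 1024 encoded values per sample to the reconstruction vector.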

              The masking objective is provided by NN.API.ssl.imagePatchMaeSample, so any image model with this input/output shape can use the same SSL training sample.
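The `h_*` arguments in the constructor signature default to `by decide`, so callers normally do not pass them. The following sketch shows the general Lean pattern on a hypothetical function, assuming the side conditions are nonzero checks on natural numbers; `patchCount` and `patchCountSucc` are illustrative names and not part of the library.

```lean
/-- Hypothetical function with a nonzero side condition discharged by `decide`. -/
def patchCount (size patch : Nat) (h_patch : patch ≠ 0 := by decide) : Nat :=
  size / patch

-- The proof is found automatically because `8 ≠ 0` is decidable on literals.
#eval patchCount 32 8   -- 4

-- When the argument is not a closed term, `decide` cannot evaluate the
-- condition, so the caller supplies the proof explicitly.
def patchCountSucc (size k : Nat) : Nat :=
  patchCount size (k + 1) (Nat.succ_ne_zero k)
```

The same idea would apply to a configuration value: if it is a closed definition with literal fields, the default proofs should typically go through; otherwise each hypothesis can be passed by hand.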

                Compact vector masked autoencoder.

                Architecturally this reuses the vector autoencoder body; the self-supervised part is in NN.API.ssl.vectorMaeSample or NN.API.ssl.tensorPrefixMaeSample, which mask the input while keeping the original tensor content as the target.
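The split between architecture and objective can be illustrated with a small masking helper. This is only a sketch of the idea of masking the input while keeping the original content as the target: `MaeSample` and `vectorMaeSampleSketch` are hypothetical, use plain lists instead of tensors, and do not reproduce the signatures of the `NN.API.ssl` helpers. Zero-filling masked positions is likewise an assumption.

```lean
/-- Hypothetical training sample: masked input plus the untouched target. -/
structure MaeSample where
  input : List Float
  target : List Float

/-- Zero out positions where `mask` is `true`; the target keeps the original
    values. `List.zipWith` truncates the input to the shorter of the two lists. -/
def vectorMaeSampleSketch (x : List Float) (mask : List Bool) : MaeSample :=
  { input := List.zipWith (fun v m => if m then 0.0 else v) x mask,
    target := x }

def demoSample : MaeSample :=
  vectorMaeSampleSketch [1.0, 2.0, 3.0, 4.0] [false, true, false, true]

#eval demoSample.input    -- second and fourth entries are zeroed
#eval demoSample.target   -- original values, unchanged
```

Because the sample carries both tensors, any model whose input and output shapes match can be trained against it, which is why the masking logic does not need to live next to the model constructor.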
