TorchLean API

NN.API.Models.PPO

PPO Model Helpers (API)

Reusable actor/critic MLP constructors for PPO examples.

These helpers cover only the network side: tensor shapes and MLP constructors. Environment collection, trust-boundary checks, advantage computation, and optimizer loops stay in the examples/runtime modules.

Configuration for a simple PPO actor/critic pair over vector observations.
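As a rough illustration of what such a configuration might hold, here is a minimal self-contained Lean sketch. The structure name `PPOConfigSketch`, its field names, the default hidden width, and the assumption of a discrete action space are all hypothetical; they are not taken from the TorchLean source.

```lean
/-- Hypothetical stand-in for the PPO configuration; not TorchLean's actual definition. -/
structure PPOConfigSketch where
  /-- Length of each observation vector. -/
  obsDim : Nat
  /-- Number of action logits (assumes a discrete action space). -/
  numActions : Nat
  /-- Hidden-layer width used by both networks (assumed default). -/
  hiddenDim : Nat := 64
  deriving Repr

/-- Example instance: 4-dimensional observations, 2 actions. -/
def smallConfig : PPOConfigSketch := { obsDim := 4, numActions := 2 }

-- The default hidden width is filled in automatically.
example : smallConfig.hiddenDim = 64 := rfl
```

Keeping the dimensions in one record lets the actor and critic constructors read the same observation size, so the two networks cannot disagree about their input.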

@[reducible, inline]

Actor input shape: observation vectors with a caller-chosen prefix shape.

@[reducible, inline]

Actor output shape: action logits with the same prefix shape.

@[reducible, inline]

Critic output shape: one scalar value per prefixed observation.
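The three shape definitions can be pictured with the following self-contained Lean sketch. It assumes shapes are plain lists of dimensions and that the critic keeps a trailing dimension of size 1; TorchLean's real shape type and its treatment of the scalar output may differ, and every name here is hypothetical. The `@[reducible, inline]` attributes shown in the listing are the ones Lean's `abbrev` command attaches, so the sketch uses `abbrev` as well.

```lean
/-- Hypothetical shape type: a list of dimension sizes. -/
abbrev ShapeSketch := List Nat

/-- Observations: caller-chosen prefix followed by the observation dimension. -/
abbrev actorInputShapeSketch (pfx : ShapeSketch) (obsDim : Nat) : ShapeSketch :=
  pfx ++ [obsDim]

/-- Action logits: same prefix, followed by the number of actions. -/
abbrev actorOutputShapeSketch (pfx : ShapeSketch) (numActions : Nat) : ShapeSketch :=
  pfx ++ [numActions]

/-- Value estimates: same prefix, one scalar per observation (assumed trailing 1). -/
abbrev criticOutputShapeSketch (pfx : ShapeSketch) : ShapeSketch :=
  pfx ++ [1]

-- With a batch prefix of 32, 4-dimensional observations, and 2 actions:
example : actorInputShapeSketch [32] 4 = [32, 4] := rfl
example : actorOutputShapeSketch [32] 2 = [32, 2] := rfl
example : criticOutputShapeSketch [32] = [32, 1] := rfl
```

Because the prefix is a parameter, the same definitions serve a single observation (empty prefix), a batch, or a batch of rollouts without change.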

Actor MLP mapping observations to action logits.

Critic MLP mapping observations to a scalar value estimate.
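To make the difference between the two networks concrete, the sketch below lists layer widths only. It assumes two hidden layers of equal width, which is a common choice for PPO examples but is not confirmed by this page; the function names are hypothetical and no TorchLean layer types are used.

```lean
/-- Hypothetical layer-width list for an MLP with two equal hidden layers. -/
def mlpDimsSketch (inDim hidden outDim : Nat) : List Nat :=
  [inDim, hidden, hidden, outDim]

/-- Actor: observation vector in, one logit per action out. -/
def actorDimsSketch (obsDim hidden numActions : Nat) : List Nat :=
  mlpDimsSketch obsDim hidden numActions

/-- Critic: observation vector in, a single value estimate out. -/
def criticDimsSketch (obsDim hidden : Nat) : List Nat :=
  mlpDimsSketch obsDim hidden 1

example : actorDimsSketch 4 64 2 = [4, 64, 64, 2] := rfl
example : criticDimsSketch 4 64 = [4, 64, 64, 1] := rfl
```

Under these assumptions the actor and critic differ only in the width of the final layer; whether the real helpers share hidden sizes or any parameters is not stated here.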
