PPO Model Helpers (API)
Reusable actor/critic MLP constructors for PPO examples.
These helpers intentionally cover only the neural-network shape. Environment collection, trust boundary checks, advantage computation, and optimizer loops stay in the examples/runtime modules.
Configuration for a simple PPO actor/critic pair over vector observations.
@[implicit_reducible]
@[reducible, inline]
@[reducible, inline]
@[reducible, inline]
def NN.API.nn.models.ppoActor (cfg : PPOActorCriticConfig) (pfx : Shape) : M (Sequential (ppoActorInShape cfg pfx) (ppoActorOutShape cfg pfx))
Actor MLP mapping observations to action logits.
def NN.API.nn.models.ppoCritic (cfg : PPOActorCriticConfig) (pfx : Shape) : M (Sequential (ppoActorInShape cfg pfx) (ppoCriticOutShape cfg pfx))
Critic MLP mapping observations to a scalar value estimate.
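The two signatures above share one input shape (ppoActorInShape) and differ only in their output shape. The following self-contained Lean sketch illustrates that relationship with toy stand-ins. It does not use the library's definitions: ToyPPOConfig, its fields, ToyShape, and the toy* functions are all hypothetical names. It also assumes that pfx is a leading (for example, batch) shape prefix and that a shape can be modelled as a list of dimensions. The real PPOActorCriticConfig fields and Shape type may differ.

```lean
-- Hypothetical stand-in for PPOActorCriticConfig; the field names are assumptions.
structure ToyPPOConfig where
  obsDim     : Nat  -- length of the observation vector
  hiddenDim  : Nat  -- hidden-layer width (does not affect the endpoint shapes)
  numActions : Nat  -- number of discrete actions, i.e. number of logits

-- Hypothetical stand-in for Shape: a plain list of dimensions.
abbrev ToyShape := List Nat

-- Actor and critic read the same observation tensor: prefix ++ [obsDim].
def toyActorInShape (cfg : ToyPPOConfig) (pfx : ToyShape) : ToyShape :=
  pfx ++ [cfg.obsDim]

-- The actor emits one logit per action.
def toyActorOutShape (cfg : ToyPPOConfig) (pfx : ToyShape) : ToyShape :=
  pfx ++ [cfg.numActions]

-- The critic emits a single scalar value estimate per observation.
def toyCriticOutShape (_cfg : ToyPPOConfig) (pfx : ToyShape) : ToyShape :=
  pfx ++ [1]

def toyCfg : ToyPPOConfig := { obsDim := 4, hiddenDim := 64, numActions := 2 }

#eval toyActorInShape toyCfg [32]    -- [32, 4]
#eval toyActorOutShape toyCfg [32]   -- [32, 2]
#eval toyCriticOutShape toyCfg [32]  -- [32, 1]
```

In the actual API these shapes appear as type indices of Sequential, so under the assumptions above a mismatch between, say, a critic and a consumer expecting action logits would presumably be rejected at type-checking time rather than at run time.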