TorchLean API

NN.Widgets.RL.PPO

PPO Rollout Viewer #

This module provides a small infoview widget for visualizing PPO rollouts as curves:

Implementation note: we intentionally reuse TorchLean's generic training-log widget (NN.Widgets.Runtime.Training.trainLogHtml) so we do not duplicate plotting/sparkline code.

References:

Main definitions #

Implementation notes #

References #

Tags #

ppo, gae, rollout, reinforcement-learning, visualization

Lossy conversion from a scalar backend α to Lean's Float for visualization.

This is intentionally a small widget-only typeclass. If you define your own scalar backend for RL, add an instance here (or locally in your project) to enable #ppo_rollout_view.

  • toVizFloat : αFloat
Instances
    def NN.Widgets.RL.PPO.ppoRolloutTrainLog {α : Type} [Context α] [ToVizFloat α] [DecidableEq Shape] {obsShape : Shape} {nActions horizon : } (gamma lam : α) (r : Runtime.RL.PPO.Rollout α obsShape nActions horizon) :

    Build a TrainLog containing reward/return/advantage curves for a fixed-horizon PPO rollout.

    Instances For
      def NN.Widgets.RL.PPO.ppoRolloutHtml {α : Type} [Context α] [ToVizFloat α] [DecidableEq Shape] {obsShape : Shape} {nActions horizon : } (gamma lam : α) (r : Runtime.RL.PPO.Rollout α obsShape nActions horizon) :

      Render a PPO rollout viewer as infoview HTML (reward/return/advantage curves + table).

      Instances For

        Commands #