TorchLean API

NN.Widgets.RL.GridWorld

GridWorld Widgets

This module provides small infoview widgets for TorchLean's Lean-native GridWorld environment (NN.Spec.RL.Envs.GridWorld):

#gridworld_view gw, pos: renders the grid, highlighting the start, the goal, and the current position.
#gridworld_policy_view gw, policy: renders a simple arrow overlay for a policy.
#gridworld_path_view gw, path: renders the grid with a rollout path, labeled by first-visit indices.
#gridworld_policy_file_view gw, path: renders a saved before/after greedy-policy snapshot; here path is the location of the JSON file.
#gridworld_path_file_view gw, path: renders a saved before/after episode-path snapshot; here path is the location of the JSON file.

These widgets do not run training loops; instead, they help you see the state-space objects that RL algorithms manipulate.

Main definitions

gridworldHtml: base grid renderer with start/goal/current position highlights.
gridworldPolicyHtml: policy overlay using direction arrows.
gridworldPathHtml: path renderer using first-visit indices.
gridworldPolicyDiffHtml / gridworldPathDiffHtml: before/after artifact comparison panels.
#gridworld_*_view: command entry points for interactive use.

Implementation notes

We keep the rendering intentionally simple (cells + badges), because this tends to stay readable even on narrow infoview layouts.
We parse JSON artifacts inline in command macros so widget files remain self-contained and easy to experiment with in Lean.
We use warnings instead of hard failure for mild schema mismatches; in practice this makes debugging generated artifacts much friendlier.
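
The warn-instead-of-fail idea can be sketched in plain Lean. This is not the module's actual parsing code: `natFieldOr` is a hypothetical helper, the field names "width" and "height" are made up for the example, and the real commands presumably report warnings through the elaborator rather than returning them. The sketch only assumes the standard `Lean.Json` API.

```lean
import Lean
open Lean

/-- Hypothetical helper: read a natural-number field from a JSON object.
On a missing or ill-typed field, return `fallback` together with a warning
message instead of failing. -/
def natFieldOr (j : Json) (key : String) (fallback : Nat) : Nat × Array String :=
  match j.getObjValAs? Nat key with
  | .ok n => (n, #[])
  | .error e => (fallback, #[s!"field '{key}': {e}; using fallback {fallback}"])

-- "height" is absent from the object, so the fallback is used and one warning is collected.
#eval
  match Json.parse "{\"width\": 4}" with
  | .ok j => natFieldOr j "height" 4
  | .error _ => (0, #["invalid JSON"])
```

Returning the warnings next to the value keeps the helper pure; a command implementation could then decide how to surface them.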

References

Sutton and Barto, Reinforcement Learning: An Introduction (2nd ed.), Chapter 3 (GridWorld).
ProofWidgets
Lean community documentation style

Tags

rl, gridworld, policy, rollout, artifacts, proofwidgets

Renderers

def NN.Widgets.RL.GridWorld.gridworldHtml {width height : ℕ} (gw : Spec.RL.Envs.GridWorld width height) (pos : Spec.RL.Envs.GridWorld.State width height) :

ProofWidgets.Html

Render a GridWorld state as a grid with start/goal/current highlights.

def NN.Widgets.RL.GridWorld.gridworldPolicyHtml {width height : ℕ} (gw : Spec.RL.Envs.GridWorld width height) (π : Spec.RL.Envs.GridWorld.State width height → Spec.RL.Envs.GridWorld.Action) :

ProofWidgets.Html

Render a simple policy overlay π : State → Action on GridWorld.

def NN.Widgets.RL.GridWorld.gridworldPolicyDiffHtml {width height : ℕ} (gw : Spec.RL.Envs.GridWorld width height) (diff : Runtime.RL.Artifacts.GridWorld.PolicyDiff) :

ProofWidgets.Html

Render a before/after policy snapshot (loaded from disk) as two GridWorld policy panels.

def NN.Widgets.RL.GridWorld.gridworldPathHtml {width height : ℕ} (gw : Spec.RL.Envs.GridWorld width height) (path : Array (Spec.RL.Envs.GridWorld.State width height)) :

ProofWidgets.Html

Render a rollout path (first-visit indices) on GridWorld.
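
The "first-visit indices" labeling can be illustrated with a small, self-contained sketch. It does not use TorchLean's types: `Cell` and `firstVisitFrom` are hypothetical names introduced here, and cells are plain `(row, col)` pairs rather than `GridWorld.State` values. Assuming a cell's label is the index of the first step at which the path reaches it, the lookup might look like:

```lean
/-- Hypothetical cell type for this sketch: a `(row, col)` pair. -/
abbrev Cell := Nat × Nat

/-- Index of the first step at which the path visits `c`, counting from `i`. -/
def firstVisitFrom (c : Cell) : List Cell → Nat → Option Nat
  | [], _ => none
  | x :: xs, i => if x == c then some i else firstVisitFrom c xs (i + 1)

-- The path revisits (0, 0) at step 2, but its label stays 0.
#eval firstVisitFrom (0, 0) [(0, 0), (0, 1), (0, 0), (1, 1)] 0  -- some 0
#eval firstVisitFrom (1, 1) [(0, 0), (0, 1), (0, 0), (1, 1)] 0  -- some 3
#eval firstVisitFrom (2, 2) [(0, 0), (0, 1), (0, 0), (1, 1)] 0  -- none
```

Cells the path never reaches yield `none`, which a renderer could show as an unlabeled cell.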

def NN.Widgets.RL.GridWorld.gridworldPathDiffHtml {width height : ℕ} (gw : Spec.RL.Envs.GridWorld width height) (diff : Runtime.RL.Artifacts.GridWorld.PathDiff) :

ProofWidgets.Html

Render a before/after episode path snapshot (loaded from disk) as two GridWorld path panels.

Commands

def NN.Widgets.RL.GridWorld.gridworldViewCmd :

Lean.ParserDescr

Render the GridWorld layout (and a highlighted position) in the infoview.

Usage: #gridworld_view gw, gw.start

def NN.Widgets.RL.GridWorld.gridworldPolicyViewCmd :

Lean.ParserDescr

Render a greedy-policy map for a GridWorld in the infoview.

Usage: #gridworld_policy_view gw, policy, where policy is a flattened row-major array of action indices (0..3).
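
As a rough illustration of the flattened row-major layout, the sketch below looks up the action index for a cell. `actionAt` and `arrowOf` are hypothetical helpers, not part of this module, and the index-to-direction order in `arrowOf` is an assumption, since this page does not state which direction each of 0..3 denotes.

```lean
/-- Hypothetical arrow glyphs for action indices 0..3.
The real index-to-direction mapping is not given on this page; this order is assumed. -/
def arrowOf (a : Nat) : String :=
  match a with
  | 0 => "↑"
  | 1 => "→"
  | 2 => "↓"
  | 3 => "←"
  | _ => "?"

/-- Action index stored for cell `(row, col)` in a flattened row-major policy.
Returns `none` if the column is out of range or the array is too short. -/
def actionAt (width : Nat) (policy : Array Nat) (row col : Nat) : Option Nat :=
  if col < width then policy[row * width + col]? else none

-- A 2 × 3 grid: row 1, column 0 lives at flat index 1 * 3 + 0 = 3.
#eval actionAt 3 #[0, 1, 2, 3, 0, 1] 1 0  -- some 3
#eval actionAt 3 #[0, 1, 2, 3, 0, 1] 2 0  -- none
```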

def NN.Widgets.RL.GridWorld.gridworldPathViewCmd :

Lean.ParserDescr

Render a single episode path as a sequence of positions in the infoview.

Usage: #gridworld_path_view gw, path, where path is an array of (row, col) positions.

def NN.Widgets.RL.GridWorld.gridworldPolicyFileViewCmd :

Lean.ParserDescr

Read a saved GridWorld greedy-policy snapshot (before vs after) from JSON and render it.

This is intended for executable examples or training jobs that write artifacts to disk, for example: lake exe torchlean ppo_gridworld.

The JSON schema matches Runtime.RL.Artifacts.GridWorld.PolicyDiff.

def NN.Widgets.RL.GridWorld.gridworldPathFileViewCmd :

Lean.ParserDescr

Read a saved GridWorld episode path snapshot (before vs after) from JSON and render it.

This is intended for executable examples or training jobs that write artifacts to disk, for example: lake exe torchlean ppo_gridworld.

The JSON schema matches Runtime.RL.Artifacts.GridWorld.PathDiff.

def NN.Widgets.RL.GridWorld.«command#gridworld_view_,_» :

Lean.ParserDescr

def NN.Widgets.RL.GridWorld.«command#gridworld_policy_view_,_» :

Lean.ParserDescr

def NN.Widgets.RL.GridWorld.«command#gridworld_path_view_,_» :

Lean.ParserDescr

def NN.Widgets.RL.GridWorld.«command#gridworld_policy_file_view_,_» :

Lean.ParserDescr

def NN.Widgets.RL.GridWorld.«command#gridworld_path_file_view_,_» :

Lean.ParserDescr
