GridWorld Widgets #
This module provides small infoview widgets for TorchLean's Lean-native GridWorld environment
(NN.Spec.RL.Envs.GridWorld):
#gridworld_view gw, posrenders the grid, highlightingstart,goal, and the current position.#gridworld_policy_view gw, policyrenders a simple arrow policy overlay.#gridworld_path_view gw, pathrenders a grid with a rollout path (first-visit indices).#gridworld_policy_file_view gw, pathrenders a saved before/after greedy policy snapshot.#gridworld_path_file_view gw, pathrenders a saved before/after episode path snapshot.
These widgets do not run training loops; instead, they help you see the state-space objects that RL algorithms manipulate.
Main definitions #
gridworldHtml: base grid renderer with start/goal/current position highlights.gridworldPolicyHtml: policy overlay using direction arrows.gridworldPathHtml: path renderer using first-visit indices.gridworldPolicyDiffHtml/gridworldPathDiffHtml: before/after artifact comparison panels.#gridworld_*_view: command entry points for interactive use.
Implementation notes #
- We keep the rendering intentionally simple (cells + badges), because this tends to stay readable even on narrow infoview layouts.
- We parse JSON artifacts inline in command macros so widget files remain self-contained and easy to experiment with in Lean.
- We use warnings instead of hard failure for mild schema mismatches; in practice this makes debugging generated artifacts much friendlier.
References #
- Sutton and Barto, Reinforcement Learning: An Introduction (2nd ed.), Chapter 3 (GridWorld).
- ProofWidgets
- Lean community documentation style
Tags #
rl, gridworld, policy, rollout, artifacts, proofwidgets
Renderers #
Render a GridWorld state as a grid with start/goal/current highlights.
Instances For
Render a simple policy overlay π : State → Action on GridWorld.
Instances For
Render a before/after policy snapshot (loaded from disk) as two GridWorld policy panels.
Instances For
Render a rollout path (first-visit indices) on GridWorld.
Instances For
Render a before/after episode path snapshot (loaded from disk) as two GridWorld path panels.
Instances For
Commands #
Render the GridWorld layout (and a highlighted position) in the infoview.
Usage: #gridworld_view gw, gw.start
Instances For
Render a greedy-policy map for a GridWorld in the infoview.
Usage: #gridworld_policy_view gw, policy, where policy is a flattened row-major array of
action indices (0..3).
Instances For
Render a single episode path as a sequence of positions in the infoview.
Usage: #gridworld_path_view gw, path, where path is an array of (row, col) positions.
Instances For
Read a saved GridWorld greedy-policy snapshot (before vs after) from JSON and render it.
This is intended for executable examples or training jobs that write artifacts to disk, for example:
lake exe torchlean ppo_gridworld.
The JSON schema matches Runtime.RL.Artifacts.GridWorld.PolicyDiff.
Instances For
Read a saved GridWorld episode path snapshot (before vs after) from JSON and render it.
This is intended for executable examples or training jobs that write artifacts to disk, for example:
lake exe torchlean ppo_gridworld.
The JSON schema matches Runtime.RL.Artifacts.GridWorld.PathDiff.