Public RL API

This module exposes the mathematical and algorithmic RL surface under NN.API.rl.*.

Design intent:

keep the public API smaller and easier to browse than the full runtime namespace,
follow the same namespace shape as the rest of NN.API.*,
expose typed RL math while keeping environment/trainer integration separate.

References (background and terminology):

Sutton and Barto, Reinforcement Learning: An Introduction (2nd ed.): http://incompleteideas.net/book/the-book-2nd.html
Puterman, Markov Decision Processes (finite discounted MDPs): https://doi.org/10.1002/9780470316887
Gymnasium API docs (reset/step, terminated vs truncated): https://gymnasium.farama.org/

This part of the API provides differentiable policy-gradient losses over TorchLean backend references.

The pure exports above are algebra over concrete spec tensors. These helpers are the training-time counterpart: they build scalar losses from backend refs, so the same formulas can run through eager or compiled autograd.
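To make the kind of formula involved concrete, here is a minimal sketch on plain `Float` lists. It uses a REINFORCE-style loss purely as an illustration; the names `discountedReturns` and `reinforceLoss` are hypothetical and are not TorchLean exports, and this version is not differentiable. The actual helpers operate on backend refs rather than lists.

```lean
/-- Discounted returns `G_t = r_t + γ * G_{t+1}`, computed from the last step backwards.
Hypothetical helper on plain `Float` lists, not a TorchLean export. -/
def discountedReturns (gamma : Float) (rewards : List Float) : List Float :=
  rewards.foldr
    (fun r acc =>
      match acc with
      | [] => [r]                          -- last step: G_T = r_T
      | g :: _ => (r + gamma * g) :: acc)  -- earlier steps reuse the next return
    []

/-- REINFORCE-style scalar loss `-(1/T) * Σ_t logProb_t * G_t`.
Returns `0.0` for an empty trajectory. Hypothetical, for illustration only. -/
def reinforceLoss (logProbs rets : List Float) : Float :=
  let terms := List.zipWith (fun lp g => lp * g) logProbs rets
  if terms.isEmpty then 0.0
  else -(terms.foldl (fun acc x => acc + x) 0.0) / terms.length.toFloat

-- Three-step trajectory with made-up log-probabilities and rewards.
#eval reinforceLoss [-0.5, -1.0, -0.25] (discountedReturns 0.9 [1.0, 0.0, 2.0])
```

A training-time version would compute the same scalar from backend refs so that autograd can propagate gradients through the log-probabilities.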

Training Logs (Widgets and Examples)

TorchLean does not aim to be a full “trainer framework”, but many executable examples want to:

evaluate a scalar metric every N updates,
append it to a curve, and
write a small JSON file for widgets (#train_log_file_view).

This namespace re-exports the small, stable log types and JSON IO helpers.
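The three steps listed above can be sketched with Lean's standard JSON support. Everything here is an assumption made for illustration: the `CurvePoint` and `Curve` structures, the `logEvery` helper, the stand-in metric, and the output file name `train_log.json` are hypothetical, and the JSON layout that `#train_log_file_view` actually expects may differ.

```lean
import Lean.Data.Json
open Lean

/-- Hypothetical log record; TorchLean's actual log types may differ. -/
structure CurvePoint where
  step  : Nat
  value : Float
  deriving ToJson

/-- Hypothetical curve: a named series of logged points. -/
structure Curve where
  name   : String
  points : Array CurvePoint := #[]
  deriving ToJson

/-- Append `metric step` to the curve when `step` is a multiple of `every`. -/
def logEvery (every step : Nat) (metric : Nat → Float) (c : Curve) : Curve :=
  if every != 0 && step % every == 0 then
    { c with points := c.points.push { step := step, value := metric step } }
  else
    c

def main : IO Unit := do
  let mut curve : Curve := { name := "loss" }
  for step in [0:100] do
    -- Stand-in metric; a real example would evaluate the model here.
    curve := logEvery 10 step (fun s => 1.0 / (s.toFloat + 1.0)) curve
  -- Write the curve as pretty-printed JSON (file name is an assumption).
  IO.FS.writeFile "train_log.json" (toJson curve).pretty
```

Keeping the logging step a pure function over the curve means the training loop decides when to write the file, which matches the goal of not being a full trainer framework.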
