TorchLean API

NN.Examples.Models.RL.DQNReplay

DQN Replay Mini-Example

This example exercises the runtime pieces of an off-policy DQN-style update:

  1. construct typed transitions;
  2. insert them into a bounded replay buffer;
  3. sample a minibatch;
  4. evaluate a DQN minibatch loss from caller-provided online/target Q-functions.
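For orientation, the minibatch loss in step 4 is commonly written as below. This is a sketch under assumptions, since the page does not state the exact form. The symbols are illustrative: $\gamma$ is a discount factor, $d_i \in \{0, 1\}$ is the terminal flag of transition $i$, $B$ is the minibatch size, and squared error is assumed (a Huber loss is a frequent alternative).

$$
y_i = r_i + \gamma\,(1 - d_i)\,\max_{a'} Q_{\text{target}}(s'_i, a'),
\qquad
\mathcal{L} = \frac{1}{B} \sum_{i=1}^{B} \bigl(Q_{\text{online}}(s_i, a_i) - y_i\bigr)^2
$$

For a terminal transition ($d_i = 1$) the target reduces to the reward $r_i$; otherwise it bootstraps from the target Q-function.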

The Q-functions are hand-written closures rather than neural networks, so the example stays focused on replay buffers and minibatch losses. A full trainable DQN run can later swap those closures for compiled TorchLean models and an optimizer step.
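To make the closure-based setup concrete, here is a minimal Lean 4 sketch of a minibatch loss evaluated from caller-provided online and target Q-functions. Every name in it (Obs, Transition, QFn, maxQ, tdTarget, dqnLoss, and the example values) is hypothetical and chosen for illustration; none of it is the TorchLean API. It also assumes a squared-error loss and discrete actions indexed by Nat.

```lean
-- Hypothetical names throughout; this is not the TorchLean API.

/-- A two-feature observation. -/
abbrev Obs := Float × Float

/-- A transition with a discrete action index and a terminal flag. -/
structure Transition where
  state     : Obs
  action    : Nat
  reward    : Float
  nextState : Obs
  terminal  : Bool

/-- A Q-function as a plain closure: one value per action. -/
abbrev QFn := Obs → Array Float

/-- Largest entry of an array, or 0.0 when the array is empty. -/
def maxQ (qs : Array Float) : Float :=
  qs.foldl (fun m q => if m < q then q else m) (qs.getD 0 0.0)

/-- TD target: the reward alone on terminal transitions, otherwise the
    reward plus the discounted best target value at the next state. -/
def tdTarget (γ : Float) (qTarget : QFn) (t : Transition) : Float :=
  if t.terminal then t.reward
  else t.reward + γ * maxQ (qTarget t.nextState)

/-- Mean squared TD error over a minibatch (0.0 for an empty batch). -/
def dqnLoss (γ : Float) (qOnline qTarget : QFn)
    (batch : Array Transition) : Float := Id.run do
  let mut total : Float := 0.0
  for t in batch do
    -- Q-value of the action actually taken; 0.0 if the index is out of range.
    let qsa := (qOnline t.state).getD t.action 0.0
    let err := qsa - tdTarget γ qTarget t
    total := total + err * err
  return if batch.isEmpty then 0.0 else total / batch.size.toFloat

-- Hand-written closures standing in for neural networks.
def exampleOnline : QFn := fun s => #[s.1 + s.2, s.1 - s.2]
def exampleTarget : QFn := fun s => #[0.5 * s.1, 0.5 * s.2]

def exampleBatch : Array Transition := #[
  { state := (1.0, 0.0), action := 0, reward := 1.0,
    nextState := (0.0, 1.0), terminal := false },
  { state := (0.0, 1.0), action := 1, reward := 0.0,
    nextState := (0.0, 0.0), terminal := true }
]

#eval dqnLoss 0.9 exampleOnline exampleTarget exampleBatch
```

Worked by hand, the first transition bootstraps (target 1.0 + 0.9 · 0.5 = 1.45 against a prediction of 1.0) and the second is terminal (target 0.0 against a prediction of -1.0), so the mean of the squared errors is (0.2025 + 1.0) / 2 = 0.60125, up to floating-point rounding.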

Run from the repo root through the maintained example runner:

lake exe torchlean dqn_replay

A compact two-feature observation.

A second observation used as the next state.

One typed transition inserted into the replay buffer.

A second transition, marked terminal, so the sample contains both bootstrap modes.

Build a replay buffer, sample a minibatch, and compute DQN losses.
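The bounded-buffer and sampling steps can be sketched in plain Lean 4 as follows. This is an illustration under assumptions, not the TorchLean implementation: the ReplayBuffer structure, its push and sample functions, and the overwrite-oldest and sample-with-replacement policies are all hypothetical choices. Only core definitions (Array, StdGen, randNat, mkStdGen) are used.

```lean
-- Hypothetical sketch; names and policies are assumptions, not the TorchLean API.

/-- A bounded FIFO buffer: once `capacity` items are stored, each insert
    overwrites the oldest entry. -/
structure ReplayBuffer (α : Type) where
  capacity : Nat
  items    : Array α := #[]
  next     : Nat := 0  -- slot to overwrite once the buffer is full

namespace ReplayBuffer

/-- Insert one item, evicting the oldest item when the buffer is full. -/
def push {α : Type} (buf : ReplayBuffer α) (x : α) : ReplayBuffer α :=
  if buf.capacity == 0 then buf
  else if buf.items.size < buf.capacity then
    { buf with items := buf.items.push x }
  else
    { buf with
      items := buf.items.set! buf.next x,
      next  := (buf.next + 1) % buf.capacity }

/-- Draw `n` items uniformly with replacement, threading a pure generator
    so that the result is reproducible for a given seed. -/
def sample {α : Type} (buf : ReplayBuffer α) (n : Nat) (gen : StdGen) :
    Array α × StdGen := Id.run do
  let mut g := gen
  let mut out : Array α := #[]
  if buf.items.isEmpty then return (out, g)
  for _ in [0:n] do
    -- `randNat` returns an index in the inclusive range [0, size - 1].
    let (i, g') := randNat g 0 (buf.items.size - 1)
    g := g'
    if let some x := buf.items[i]? then
      out := out.push x
  return (out, g)

end ReplayBuffer

/-- Push 0, 1, 2, 3, 4 into a buffer of capacity 3. -/
def demoBuffer : ReplayBuffer Nat :=
  (List.range 5).foldl ReplayBuffer.push ({ capacity := 3 } : ReplayBuffer Nat)

#eval demoBuffer.items  -- #[3, 4, 2]: items 0 and 1 were overwritten

-- A seeded minibatch of two items drawn from the three stored values.
#eval (demoBuffer.sample 2 (mkStdGen 42)).1
```

Keeping the generator as an explicit argument makes sampling a pure function, which is convenient in a small example; a real training loop might use IO-based randomness or sample without replacement instead.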

Command-line help for the replay-buffer mini-example.

Runner entrypoint used by lake exe torchlean dqn_replay.