TorchLean API

NN.Examples.Models.RL.DQNReplay

DQN Replay Mini-Example

This example runs the runtime pieces used by an off-policy DQN-style update:

  1. construct typed transitions;
  2. insert them into a bounded replay buffer;
  3. sample a minibatch;
  4. evaluate a DQN minibatch loss from caller-provided online/target Q-functions.

It is deliberately compact: the Q-functions are hand-written closures rather than neural networks. That keeps the file focused on the replay/minibatch API. A full trainable DQN example can later swap those closures for compiled TorchLean models and an optimizer step.
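The four steps can be sketched in plain Lean. This is an illustrative, self-contained sketch, not the TorchLean API: every name here (`Transition`, `ReplayBuffer`, `dqnLoss`, the deterministic `sample`) is hypothetical, and the real example uses TorchLean's typed replay types and a proper random sampler.

```lean
/-- A transition over a two-feature observation (hypothetical type). -/
structure Transition where
  obs    : Float × Float
  action : Nat
  reward : Float
  next   : Float × Float
  done   : Bool

/-- A bounded FIFO replay buffer (hypothetical type). -/
structure ReplayBuffer where
  capacity : Nat
  data     : Array Transition := #[]

/-- Insert a transition, evicting the oldest entry once at capacity. -/
def ReplayBuffer.push (b : ReplayBuffer) (t : Transition) : ReplayBuffer :=
  if b.data.size < b.capacity then { b with data := b.data.push t }
  else { b with data := (b.data.extract 1 b.data.size).push t }

/-- Deterministic stand-in for minibatch sampling: take the first `k`. -/
def ReplayBuffer.sample (b : ReplayBuffer) (k : Nat) : Array Transition :=
  b.data.extract 0 k

/-- Mean squared DQN loss from caller-provided online/target Q-closures. -/
def dqnLoss (γ : Float) (qOnline qTarget : Float × Float → Nat → Float)
    (nActions : Nat) (batch : Array Transition) : Float :=
  if batch.isEmpty then 0.0 else
    let sq := batch.foldl (init := 0.0) fun acc t =>
      -- Greedy value of the next state under the target Q-function.
      let best := (List.range nActions).foldl
        (fun m a => max m (qTarget t.next a)) (qTarget t.next 0)
      -- Terminal transitions bootstrap from the reward alone.
      let y := if t.done then t.reward else t.reward + γ * best
      let δ := qOnline t.obs t.action - y
      acc + δ * δ
    sq / Float.ofNat batch.size
```

The Q-functions are ordinary closures of type `Float × Float → Nat → Float`, mirroring the example's design choice of hand-written Q-functions in place of networks.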

Run from the repo root through the maintained example runner:

lake exe torchlean dqn_replay

The module defines, in order:

  - a compact two-feature observation;
  - a second observation used as the next state;
  - one typed transition inserted into the replay buffer;
  - a second transition, marked terminal, so the sample contains both bootstrap modes;
  - a function that builds a replay buffer, samples a minibatch, and computes DQN losses;
  - the runner entrypoint used by lake exe torchlean dqn_replay.
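The terminal transition exercises the second of the two bootstrap modes of the standard DQN target. Under the usual conventions (discount factor γ, minibatch size N), the per-transition target and the minibatch loss computed from the online and target Q-functions are:

```latex
y_i =
\begin{cases}
  r_i & \text{if } s'_i \text{ is terminal,} \\
  r_i + \gamma \max_{a'} Q_{\text{target}}(s'_i, a') & \text{otherwise,}
\end{cases}
\qquad
\mathcal{L} = \frac{1}{N} \sum_{i=1}^{N}
  \bigl(Q_{\text{online}}(s_i, a_i) - y_i\bigr)^2
```

Sampling a minibatch containing one terminal and one non-terminal transition therefore covers both branches of the case split in a single loss evaluation.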