TorchLean API

NN.Runtime.RL.Gymnasium.Client

Gymnasium Bridge (Client) #

This file contains the low-level subprocess client for talking to the Python helper script scripts/rl/gymnasium_server.py.

The client is compact and reusable across examples:

We use a small startup handshake (describe) to sanity-check that the external environment's declared spaces match the Lean-side expectations (obsShape, nActions).

References:

Low-level client #

Client is a thin wrapper around IO.Process.Child that:

Standard stdio configuration for the Gymnasium subprocess protocol.

Instances For
    @[reducible, inline]

    Convenience alias for the subprocess type used by the Gymnasium client.

    Instances For
      structure Runtime.RL.Gymnasium.Client (obsShape : Spec.Shape) (nActions : ) :

      Typed handle to a running Gymnasium subprocess environment.

      Client is intentionally small: it only provides request/response JSON helpers and enforces a Lean-side Boundary.Contract on all values received from the external process.

      • child : Child

        The Python subprocess.

      • contract : Boundary.Contract obsShape nActions

        Lean-side contract enforced on every transition.

      Instances For

        Send a request object and return the response object map.

        The expected response shape is: {"ok": true, ...} or {"ok": false, "error": "..."}

        This is deliberately kept in Client.Internal. Public callers should use spawn, reset, stepRaw, close, or the higher-level Session API, so JSON protocol details stay behind the Gymnasium trust boundary and out of the curated API facade.

        Instances For

          Require that a JSON response object contains a given field.

          Instances For
            def Runtime.RL.Gymnasium.Client.spawn {obsShape : Spec.Shape} {nActions : } (serverScript envId : String) (contract : Boundary.Contract obsShape nActions) (makeKwargs : List (String × Lean.Json) := []) :
            IO (Client obsShape nActions)

            Spawn the Gymnasium helper subprocess and perform a small handshake (describe).

            The handshake checks that:

            • the external environment reports the expected n_actions, and
            • the external environment reports an obs_shape matching obsShape.toList.

            If the handshake fails, the subprocess is terminated and an IO.userError is thrown.

            makeKwargs (optional) is a JSON object encoded as a list of fields and passed through to the Python bridge as --make-kwargs <json>, which the server forwards to gym.make(envId, **kwargs). This is useful for environments that require constructor options, e.g. Atari RAM observations: makeKwargs := [("obs_type", .str "ram")].

            Instances For
              def Runtime.RL.Gymnasium.Client.reset {obsShape : Spec.Shape} {nActions : } (g : Client obsShape nActions) (seed? : Option := none) :
              IO (Spec.Tensor Float obsShape)

              Reset the environment and return the initial observation.

              Instances For
                def Runtime.RL.Gymnasium.Client.stepRaw {obsShape : Spec.Shape} {nActions : } (g : Client obsShape nActions) (action : ) :

                Step the environment with a raw action index.

                The action is “raw” because the Python bridge receives a Nat; the checked session layer normally passes action.1 from a Fin nActions. The response is still parsed through the Lean-side observation/reward/done contract before it is returned.

                Instances For
                  def Runtime.RL.Gymnasium.Client.close {obsShape : Spec.Shape} {nActions : } (g : Client obsShape nActions) :

                  Close the subprocess (best-effort).

                  Instances For
                    def Runtime.RL.Gymnasium.Client.withClient {α : Type} {obsShape : Spec.Shape} {nActions : } (serverScript envId : String) (contract : Boundary.Contract obsShape nActions) (k : Client obsShape nActionsIO α) :
                    IO α

                    Spawn a client, run k, and ensure the subprocess is closed even if k throws.

                    Instances For