Gymnasium Bridge (Client) #
This file contains the low-level subprocess client for talking to the Python helper script
scripts/rl/gymnasium_server.py.
The client is compact and reusable across examples:
- spawn a Python subprocess hosting an external Gymnasium environment,
- send one JSON object per line to stdin,
- read one JSON object response per line from stdout,
- validate observations/rewards/flags against a Lean-side trust-boundary contract
(
Runtime.RL.Boundary.Contract).
We use a small startup handshake (describe) to sanity-check that the external environment's
declared spaces match the Lean-side expectations (obsShape, nActions).
References:
- Gymnasium API docs (
reset/step,terminatedvstruncated): https://gymnasium.farama.org/ - The original Gym API paper (background on the env interface): https://arxiv.org/abs/1606.01540
- Gymnasium source repository (implementation reference): https://github.com/Farama-Foundation/Gymnasium
- Trust-boundary rationale and contract definition:
NN.Runtime.RL.Boundary.
Low-level client #
Client is a thin wrapper around IO.Process.Child that:
- sends a JSON object (one line) to stdin,
- reads one JSON object response from stdout,
- checks
"ok": trueand reports"error"strings asIO.userError.
Standard stdio configuration for the Gymnasium subprocess protocol.
Instances For
Convenience alias for the subprocess type used by the Gymnasium client.
Instances For
Typed handle to a running Gymnasium subprocess environment.
Client is intentionally small: it only provides request/response JSON helpers and enforces a
Lean-side Boundary.Contract on all values received from the external process.
- child : Child
The Python subprocess.
- contract : Boundary.Contract obsShape nActions
Lean-side contract enforced on every transition.
Instances For
Send a request object and return the response object map.
The expected response shape is:
{"ok": true, ...} or {"ok": false, "error": "..."}
This is deliberately kept in Client.Internal. Public callers should use spawn, reset,
stepRaw, close, or the higher-level Session API, so JSON protocol details stay behind the
Gymnasium trust boundary and out of the curated API facade.
Instances For
Require that a JSON response object contains a given field.
Instances For
Spawn the Gymnasium helper subprocess and perform a small handshake (describe).
The handshake checks that:
- the external environment reports the expected
n_actions, and - the external environment reports an
obs_shapematchingobsShape.toList.
If the handshake fails, the subprocess is terminated and an IO.userError is thrown.
makeKwargs (optional) is a JSON object encoded as a list of fields and passed through to the
Python bridge as --make-kwargs <json>, which the server forwards to gym.make(envId, **kwargs).
This is useful for environments that require constructor options, e.g. Atari RAM observations:
makeKwargs := [("obs_type", .str "ram")].
Instances For
Reset the environment and return the initial observation.
Instances For
Step the environment with a raw action index.
The action is “raw” because the Python bridge receives a Nat; the checked session layer normally
passes action.1 from a Fin nActions. The response is still parsed through the Lean-side
observation/reward/done contract before it is returned.
Instances For
Close the subprocess (best-effort).
Instances For
Spawn a client, run k, and ensure the subprocess is closed even if k throws.