RL Trust-Boundary Proofs #
NN.Runtime.RL.Boundary.Core provides executable “trust-boundary” checkers for externally supplied
RL data. Those checkers return Except String ... so runtime code can fail fast with clear error
messages. JSON rollout parsing lives one layer higher in NN.Runtime.RL.Boundary.Json; the proof
bridge here only needs the contract and checker semantics.
For formal reasoning, we also want a Prop-level view of exactly what was checked:
Runtime.RL.Boundary.ObservationHoldsRuntime.RL.Boundary.RewardHoldsRuntime.RL.Boundary.DoneFlagsHoldsRuntime.RL.Boundary.ContractHolds
This module provides the bridge lemmas that turn a successful executable check (= .ok ...) into
the corresponding Prop-level statement.
References:
- Meyer, Object-Oriented Software Construction (2nd ed., 1997): “Design by Contract” as a general specification pattern at software boundaries.
- Gymnasium API docs (reset/step, terminated vs truncated): https://gymnasium.farama.org/
- Sutton and Barto, Reinforcement Learning: An Introduction (2nd ed., 2018), discussion of episodic termination semantics: http://incompleteideas.net/book/the-book-2nd.html
Internal helper lemmas #
These facts are purely about the small executable helpers in Runtime.RL.Boundary.Internal.
Bridge lemmas: executable checks -> Prop contract #
If the executable checker checkDoneFlags succeeds, then the Prop-level done-flag contract holds.
If the executable checker checkObservation succeeds, then the Prop-level observation contract holds.
If the executable checker checkReward succeeds, then the Prop-level reward contract holds.
If the executable checker checkTransitionFin succeeds, then the Prop-level contract holds.
This is the main “checked preconditions” bridge used by downstream RL theorems.
Convenience wrapper: run the executable checker and, on success, return the transition bundled with the Prop-level contract proof.
This is the “checked preconditions” interface for downstream proofs/programs: instead of assuming a contract, you explicitly check it and obtain a usable hypothesis.