State-space dynamical systems #
State-space sequence models are recurrent dynamical systems:
h_{t+1} = A_t h_t + B_t x_t,
y_t = C_t h_t + D_t x_t.
The full S4/Mamba family uses structured matrices and input-dependent ("selective") parameters. This file starts with the diagonal/channelwise version because it is:
- the primitive implemented by a compact CUDA selective-scan kernel,
- enough to express the formal affine-scan algebra,
- and a useful educational baseline for examples.
The definitions intentionally reuse DynamicalSystem / DrivenSystem from
NN/Spec/Dynamics/System.lean, so existing notions such as iterate, trajectory,
IsFixedPoint, and isContractive apply immediately.
A diagonal state-space model with elementwise input, state, and output channels.
This is the smallest useful SSM shape: it captures the recurrent memory path used by Mamba/S4 while remaining easy to verify. Richer models can wrap this core with linear projections and gates.
- A : Spec.Tensor α (Spec.Shape.dim dim Spec.Shape.scalar)
Elementwise recurrent multiplier. Stability usually requires
|Aᵢ| < 1. - B : Spec.Tensor α (Spec.Shape.dim dim Spec.Shape.scalar)
Elementwise input-to-state gain.
- C : Spec.Tensor α (Spec.Shape.dim dim Spec.Shape.scalar)
Elementwise state-to-output gain.
- D : Spec.Tensor α (Spec.Shape.dim dim Spec.Shape.scalar)
Elementwise residual / skip gain.
Instances For
One recurrent state update h' = A ⊙ h + B ⊙ x.
Instances For
Instances For
Convert a token vector into the corresponding affine transition for the hidden state.
Instances For
Applying the affine transition generated by x is exactly the SSM state update.
Run a full input sequence and return all hidden states.
Instances For
View a scalar-specialized diagonal SSM recurrence as a DrivenSystem.
DrivenSystem uses TorchLean's default SpecScalar, while the reusable SSM record
above is polymorphic over scalar backends. This adapter is the bridge to the existing dynamics API.