Bridging 1D ReLU MLPs to Tensor inputs (ridge functions) #
This file is a first “bridge step” between:

- the constructive 1D ReLU approximation theorem in `universal_approximation.lean`, and
- nD Tensor inputs `Tensor ℝ (.dim n .scalar)` used throughout TorchLean.
What is proved here (fully proved):

- Exact representability of affine maps `x ↦ w⋅x + b` by a 2-layer ReLU MLP (width 2), using the identity `relu(u) - relu(-u) = u`.
- Ridge lifting: any 1D 2-layer ReLU MLP can be lifted to an nD Tensor input via `u = w⋅x + c`, by scaling each first-layer weight by `w` and adjusting biases accordingly (the arithmetic is sketched below).
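The bias adjustment behind ridge lifting is pure arithmetic. A minimal Lean sketch over plain functions rather than TorchLean tensors (the names `a`, `b`, `c`, `w`, `x` here are illustrative):

```lean
import Mathlib

-- If a 1D hidden row has weight `a` and bias `b`, then scaling the nD weights
-- to `a * w i` and shifting the bias to `a * c + b` reproduces the 1D
-- pre-activation at `u = w⋅x + c`.
example {n : ℕ} (a b c : ℝ) (w x : Fin n → ℝ) :
    (∑ i, a * (w i * x i)) + (a * c + b) = a * ((∑ i, w i * x i) + c) + b := by
  rw [← Finset.mul_sum]; ring
```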
What is not proved here: the full classical nD universal approximation theorem for ReLU MLPs. That requires substantially more formalization (e.g. piecewise-linear approximation machinery or a functional-analytic Cybenko/Leshno-style proof).
`Tensor ℝ (.dim n .scalar)` viewed as an n-vector of real scalars.
Rewrapping a vector with `Tensor.dim` preserves the underlying coordinate function `toVec`.
Evaluate a single-hidden-layer ReLU MLP on a tensor input and return the scalar output.
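For intuition, here is a minimal sketch of what this evaluation computes, written over plain functions; `forwardSketch` and `relu` are illustrative stand-ins assuming `relu x = max x 0`, not the file's tensor-level definitions:

```lean
import Mathlib

-- Illustrative scalar ReLU; assumed definition, may differ from the file's.
def relu (x : ℝ) : ℝ := max x 0

-- Hidden row `j` has weights `W j` and bias `b1 j`; the output layer has
-- weights `v` and bias `b2`: linear ∘ relu ∘ linear collapsed into one sum.
def forwardSketch {h n : ℕ} (W : Fin h → Fin n → ℝ) (b1 v : Fin h → ℝ)
    (b2 : ℝ) (x : Fin n → ℝ) : ℝ :=
  b2 + ∑ j, v j * relu ((∑ i, W j i * x i) + b1 j)
```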
Identity `relu u - relu (-u) = u`, used to represent affine maps exactly with ReLU.
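A self-contained check of the identity, assuming `relu u = max u 0`:

```lean
import Mathlib

-- Case split on the sign of `u`: exactly one of the two ReLU branches is active.
example (u : ℝ) : max u 0 - max (-u) 0 = u := by
  rcases le_total u 0 with h | h
  · rw [max_eq_right h, max_eq_left (by linarith : (0:ℝ) ≤ -u)]; ring
  · rw [max_eq_left h, max_eq_right (by linarith : -u ≤ (0:ℝ))]; ring
```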
Unfold `mlp_forward` as `linear ∘ relu ∘ linear`.
This lemma is used as the standard normalization step in “network algebra” proofs.
Extract the unique entry from row i of an (m×1) tensor interpreted as a matrix.
Extract the i-th entry of a vector-shaped tensor.
Specialized matrix-vector multiplication when the input is a scalar (dimension 1).
General matrix-vector multiplication for an (m×n) matrix and a vector written as `Tensor.dim`.

This generalizes the 1-row dot-product lemma from `universal_approximation.lean` to arbitrary m.
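Entrywise, the general lemma says that row `i` of a matrix-vector product is the dot product of row `i` with the vector. A minimal sketch with plain functions (`matVecSketch` is illustrative, not TorchLean's operation):

```lean
import Mathlib

-- Matrix-vector product over plain functions: entry `i` is a dot product.
def matVecSketch {m n : ℕ} (A : Fin m → Fin n → ℝ) (v : Fin n → ℝ) : Fin m → ℝ :=
  fun i => ∑ j, A i j * v j

-- Row `i` of the product is the dot product of row `i` with `v`, by definition.
example {m n : ℕ} (A : Fin m → Fin n → ℝ) (v : Fin n → ℝ) (i : Fin m) :
    matVecSketch A v i = ∑ j, A i j * v j := rfl
```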
First layer for exact affine representability.
Given an affine form `u(x) = w ⋅ x + b`, this layer outputs `[u(x), -u(x)]`.
Second layer for exact affine representability.
With hidden activations `[relu(u), relu(-u)]`, this output layer computes `relu(u) - relu(-u) = u`.
Exact representability of coordinate projections `x ↦ x_i` by a width-2 ReLU MLP.
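Coordinate projection is the affine form `u(x) = w ⋅ x + b` with standard-basis weights and zero bias; a sketch of that reduction (not the file's statement):

```lean
import Mathlib

-- With weights `w j = if j = i then 1 else 0` and bias `0`,
-- the affine form `w ⋅ x + b` collapses to the coordinate `x i`.
example {n : ℕ} (i : Fin n) (x : Fin n → ℝ) :
    (∑ j, (if j = i then (1:ℝ) else 0) * x j) + 0 = x i := by
  simp
```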
Ridge lifting #
Given a 1D MLP `(l1, l2)` and an affine scalar map `u = w⋅x + c`, we build an nD MLP whose pre-activations match the 1D pre-activations at `u`. This lets you reuse any 1D approximation result for functions of one affine form (“ridge functions”).
Lift a 1D first-layer spec to an nD first-layer spec along a ridge direction.

Given a scalar 1D first layer that expects input `u : ℝ`, this constructs an nD first layer that feeds it `u = w ⋅ x + c`.
Lifting lemma: the lifted nD MLP agrees with the 1D MLP evaluated at `dot w x + c`.
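A self-contained sketch of what the lemma says at the level of plain functions, assuming `relu u = max u 0`; `fwd1` and `fwdN` are illustrative stand-ins for the file's 1D and lifted nD `mlp_forward`:

```lean
import Mathlib

def relu (x : ℝ) : ℝ := max x 0

-- 1D MLP: hidden row `j` has weight `a j` and bias `b j`;
-- output weights `v`, output bias `d`.
def fwd1 {h : ℕ} (a b v : Fin h → ℝ) (d : ℝ) (u : ℝ) : ℝ :=
  d + ∑ j, v j * relu (a j * u + b j)

-- Lifted nD MLP along ridge direction `w` with offset `c`:
-- weights `a j * w i`, biases `a j * c + b j`.
def fwdN {h n : ℕ} (a b v : Fin h → ℝ) (d : ℝ) (w : Fin n → ℝ) (c : ℝ)
    (x : Fin n → ℝ) : ℝ :=
  d + ∑ j, v j * relu ((∑ i, a j * (w i * x i)) + (a j * c + b j))

-- The lifted network agrees with the 1D network at `u = w ⋅ x + c`.
example {h n : ℕ} (a b v : Fin h → ℝ) (d : ℝ) (w : Fin n → ℝ) (c : ℝ)
    (x : Fin n → ℝ) :
    fwdN a b v d w c x = fwd1 a b v d ((∑ i, w i * x i) + c) := by
  unfold fwdN fwd1
  congr 1
  refine Finset.sum_congr rfl fun j _ => ?_
  have hpre : (∑ i, a j * (w i * x i)) + (a j * c + b j)
      = a j * ((∑ i, w i * x i) + c) + b j := by
    rw [← Finset.mul_sum]; ring
  rw [hpre]
```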