Bridging 1D ReLU MLPs to Tensor inputs (ridge functions) #
This file is a first “bridge step” between:

- the constructive 1D ReLU approximation theorem in `universal_approximation.lean`, and
- nD Tensor inputs `Tensor ℝ (.dim n .scalar)` used throughout TorchLean.
What is proved here (fully proved):

- Exact representability of affine maps `x ↦ w⋅x + b` by a 2-layer ReLU MLP (width 2), using the identity `relu(u) - relu(-u) = u`.
- Ridge lifting: any 1D 2-layer ReLU MLP can be lifted to an nD Tensor input via `u = w⋅x + c`, by scaling each first-layer weight by `w` and adjusting biases accordingly (the arithmetic is sketched below).
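The bias adjustment behind ridge lifting is pure arithmetic. A minimal Lean sketch over plain functions rather than TorchLean tensors (the names `a`, `b`, `c`, `w`, `x` here are illustrative):

```lean
import Mathlib

-- If a 1D hidden row has weight `a` and bias `b`, then scaling the nD weights
-- to `a * w i` and shifting the bias to `a * c + b` reproduces the 1D
-- pre-activation at `u = w⋅x + c`.
example {n : ℕ} (a b c : ℝ) (w x : Fin n → ℝ) :
    (∑ i, a * (w i * x i)) + (a * c + b) = a * ((∑ i, w i * x i) + c) + b := by
  rw [← Finset.mul_sum]; ring
```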
What is not proved here: the full classical nD universal approximation theorem for ReLU MLPs. That requires substantially more formalization (e.g. piecewise-linear approximation machinery or a functional-analytic Cybenko/Leshno-style proof).
`Tensor ℝ (.dim n .scalar)` viewed as an n-vector of real scalars.
Rewrapping a vector with `Tensor.dim` preserves the underlying coordinate function `toVec`.
Evaluate a single-hidden-layer ReLU MLP on a tensor input and return the scalar output.
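For intuition, here is a minimal sketch of what this evaluation computes, written over plain functions; `forwardSketch` and `relu` are illustrative stand-ins assuming `relu x = max x 0`, not the file's tensor-level definitions:

```lean
import Mathlib

-- Illustrative scalar ReLU; assumed definition, may differ from the file's.
def relu (x : ℝ) : ℝ := max x 0

-- Hidden row `j` has weights `W j` and bias `b1 j`; the output layer has
-- weights `v` and bias `b2`: linear ∘ relu ∘ linear collapsed into one sum.
def forwardSketch {h n : ℕ} (W : Fin h → Fin n → ℝ) (b1 v : Fin h → ℝ)
    (b2 : ℝ) (x : Fin n → ℝ) : ℝ :=
  b2 + ∑ j, v j * relu ((∑ i, W j i * x i) + b1 j)
```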
Identity `relu u - relu (-u) = u`, used to represent affine maps exactly with ReLU.
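A self-contained check of the identity, assuming `relu u = max u 0`:

```lean
import Mathlib

-- Case split on the sign of `u`: exactly one of the two ReLU branches is active.
example (u : ℝ) : max u 0 - max (-u) 0 = u := by
  rcases le_total u 0 with h | h
  · rw [max_eq_right h, max_eq_left (by linarith : (0:ℝ) ≤ -u)]; ring
  · rw [max_eq_left h, max_eq_right (by linarith : -u ≤ (0:ℝ))]; ring
```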
Unfold `mlp_forward` as `linear ∘ relu ∘ linear`.
This lemma is used as the standard normalization step in “network algebra” proofs.
Extract the unique entry from row i of an (m×1) tensor interpreted as a matrix.
Extract the i-th entry of a vector-shaped tensor.
Specialized matrix-vector multiplication when the input is a scalar (dimension 1).
General matrix-vector multiplication for an (m×n) matrix and a vector written as `Tensor.dim`.

This generalizes the 1-row dot-product lemma from `universal_approximation.lean` to arbitrary m.
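Entrywise, the general lemma says that row `i` of a matrix-vector product is the dot product of row `i` with the vector. A minimal sketch with plain functions (`matVecSketch` is illustrative, not TorchLean's operation):

```lean
import Mathlib

-- Matrix-vector product over plain functions: entry `i` is a dot product.
def matVecSketch {m n : ℕ} (A : Fin m → Fin n → ℝ) (v : Fin n → ℝ) : Fin m → ℝ :=
  fun i => ∑ j, A i j * v j

-- Row `i` of the product is the dot product of row `i` with `v`, by definition.
example {m n : ℕ} (A : Fin m → Fin n → ℝ) (v : Fin n → ℝ) (i : Fin m) :
    matVecSketch A v i = ∑ j, A i j * v j := rfl
```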
First layer for exact affine representability.
Given an affine form `u(x) = w ⋅ x + b`, this layer outputs `[u(x), -u(x)]`.
Second layer for exact affine representability.
With hidden activations `[relu(u), relu(-u)]`, this output layer computes `relu(u) - relu(-u) = u`.
Exact representability of coordinate projections `x ↦ x_i` by a width-2 ReLU MLP.
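Coordinate projection is the affine form `u(x) = w ⋅ x + b` with standard-basis weights and zero bias; a sketch of that reduction (not the file's statement):

```lean
import Mathlib

-- With weights `w j = if j = i then 1 else 0` and bias `0`,
-- the affine form `w ⋅ x + b` collapses to the coordinate `x i`.
example {n : ℕ} (i : Fin n) (x : Fin n → ℝ) :
    (∑ j, (if j = i then (1:ℝ) else 0) * x j) + 0 = x i := by
  simp
```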
Ridge lifting #
Given a 1D MLP `(l1, l2)` and an affine scalar map `u = w⋅x + c`, we build an nD MLP whose pre-activations match the 1D pre-activations at `u`. This lets you reuse any 1D approximation result for functions of one affine form (“ridge functions”).
Lift a 1D first-layer spec to an nD first-layer spec along a ridge direction.

Given a scalar 1D first layer that expects input `u : ℝ`, this constructs an nD first layer that feeds it `u = w ⋅ x + c`.
Lifting lemma: the lifted nD MLP agrees with the 1D MLP evaluated at `dot w x + c`.
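A self-contained sketch of what the lemma says at the level of plain functions, assuming `relu u = max u 0`; `fwd1` and `fwdN` are illustrative stand-ins for the file's 1D and lifted nD `mlp_forward`:

```lean
import Mathlib

def relu (x : ℝ) : ℝ := max x 0

-- 1D MLP: hidden row `j` has weight `a j` and bias `b j`;
-- output weights `v`, output bias `d`.
def fwd1 {h : ℕ} (a b v : Fin h → ℝ) (d : ℝ) (u : ℝ) : ℝ :=
  d + ∑ j, v j * relu (a j * u + b j)

-- Lifted nD MLP along ridge direction `w` with offset `c`:
-- weights `a j * w i`, biases `a j * c + b j`.
def fwdN {h n : ℕ} (a b v : Fin h → ℝ) (d : ℝ) (w : Fin n → ℝ) (c : ℝ)
    (x : Fin n → ℝ) : ℝ :=
  d + ∑ j, v j * relu ((∑ i, a j * (w i * x i)) + (a j * c + b j))

-- The lifted network agrees with the 1D network at `u = w ⋅ x + c`.
example {h n : ℕ} (a b v : Fin h → ℝ) (d : ℝ) (w : Fin n → ℝ) (c : ℝ)
    (x : Fin n → ℝ) :
    fwdN a b v d w c x = fwd1 a b v d ((∑ i, w i * x i) + c) := by
  unfold fwdN fwd1
  congr 1
  refine Finset.sum_congr rfl fun j _ => ?_
  have hpre : (∑ i, a j * (w i * x i)) + (a j * c + b j)
      = a j * ((∑ i, w i * x i) + c) + b j := by
    rw [← Finset.mul_sum]; ring
  rw [hpre]
```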