# Approximating multiplication with a 2-layer ReLU MLP (2D box)
This file gives a constructive, fully proved approximation result: on `[-M, M]²`, the function `(x₀, x₁) ↦ x₀ * x₁` can be uniformly approximated by a single-hidden-layer ReLU MLP on `Tensor ℝ (.dim 2 .scalar)`.
`TensorVec` specialized to the 2D (rank-2) tensor-vector shape.
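To fix notation for the sketches below, here is a minimal stand-in, assuming the rank-2 tensor-vector may be modeled by a plain function type; `TensorVec2'` is an illustrative name, not the file's:

```lean
import Mathlib

/-- Hypothetical stand-in for the file's `TensorVec2`:
a 2D tensor-vector modeled as a plain function `Fin 2 → ℝ`. -/
abbrev TensorVec2' := Fin 2 → ℝ
```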
First coordinate projection `x ↦ x₀` for `TensorVec2`.
Second coordinate projection `x ↦ x₁` for `TensorVec2`.
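In the simplified `Fin 2 → ℝ` model, both projections are plain coordinate lookups; a sketch (`proj0`, `proj1` are illustrative names):

```lean
import Mathlib

/-- First coordinate projection `x ↦ x₀`. -/
def proj0 (x : Fin 2 → ℝ) : ℝ := x 0

/-- Second coordinate projection `x ↦ x₁`. -/
def proj1 (x : Fin 2 → ℝ) : ℝ := x 1
```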
The closed box domain `[-M, M] × [-M, M]` inside `TensorVec2`.
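A minimal sketch of the domain in the simplified model, assuming the box is the set of vectors whose every coordinate lies in `[-M, M]` (the name `box` is illustrative):

```lean
import Mathlib

/-- The closed box `[-M, M] × [-M, M]`, modeled as a set of `Fin 2 → ℝ`. -/
def box (M : ℝ) : Set (Fin 2 → ℝ) :=
  {x | ∀ i, x i ∈ Set.Icc (-M) M}
```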
Concatenate tensors along the leading dimension. In this file it is used to append the hidden-unit vectors of two subnetworks.
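In the simplified model this concatenation is just Mathlib's `Fin.append`; the tensor-level version in the file is assumed to behave analogously:

```lean
import Mathlib

/-- Append the hidden-unit vectors of two subnetworks along the leading dimension. -/
def concatHidden {m n : ℕ} (a : Fin m → ℝ) (b : Fin n → ℝ) : Fin (m + n) → ℝ :=
  Fin.append a b
```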
Append two first-layer linear specs by appending their weight and bias tensors.
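A sketch of this construction in the simplified model, assuming a first-layer spec is a weight matrix together with a bias vector (`LinearSpec` and `appendSpec` are illustrative names):

```lean
import Mathlib

/-- Hypothetical first-layer spec: weights and biases for `n` hidden units on 2D input. -/
structure LinearSpec (n : ℕ) where
  W : Matrix (Fin n) (Fin 2) ℝ
  b : Fin n → ℝ

/-- Append two specs by stacking weight rows and appending biases. -/
def appendSpec {m n : ℕ} (A : LinearSpec m) (B : LinearSpec n) : LinearSpec (m + n) where
  W := Matrix.of fun i j => Fin.append (fun k => A.W k j) (fun k => B.W k j) i
  b := Fin.append A.b B.b
```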
Extract the `j`-th entry from a `1 × n` tensor interpreted as a row matrix.
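In matrix language this is simply indexing the single row at position `j`; a sketch (`rowEntry` is an illustrative name):

```lean
import Mathlib

/-- Extract the `j`-th entry of a `1 × n` matrix viewed as a row vector. -/
def rowEntry {n : ℕ} (R : Matrix (Fin 1) (Fin n) ℝ) (j : Fin n) : ℝ :=
  R 0 j
```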
Combine two scalar-output linear specs into one scalar-output spec on an appended hidden layer.
If the appended hidden vector is `[z_a; z_b]`, the resulting output layer computes `γ + α * out_a(z_a) + β * out_b(z_b)`.
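A sketch in the simplified model, assuming a scalar-output spec is an output-weight vector with a scalar bias (`OutSpec`, `combineOut` are illustrative names). Scaling the two weight blocks by `α` and `β` and setting the bias to `γ + α * c_a + β * c_b` yields exactly the affine combination above:

```lean
import Mathlib

/-- Hypothetical scalar-output spec on an `n`-dimensional hidden layer. -/
structure OutSpec (n : ℕ) where
  w : Fin n → ℝ
  c : ℝ

/-- Evaluate an output spec on a hidden vector: bias plus weighted sum. -/
def OutSpec.eval {n : ℕ} (o : OutSpec n) (z : Fin n → ℝ) : ℝ :=
  o.c + ∑ j, o.w j * z j

/-- Combine two scalar-output specs on the appended hidden layer `[z_a; z_b]`. -/
def combineOut {m n : ℕ} (α β γ : ℝ) (a : OutSpec m) (b : OutSpec n) :
    OutSpec (m + n) where
  w := Fin.append (fun j => α * a.w j) (fun j => β * b.w j)
  c := γ + α * a.c + β * b.c
```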
Reading the left component from an appended hidden vector.
Reading the right component from an appended hidden vector.
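In the simplified model these two facts are Mathlib's `Fin.append_left` and `Fin.append_right`: indices in the left block are reached via `Fin.castAdd`, indices in the right block via `Fin.natAdd`. A sketch:

```lean
import Mathlib

example {m n : ℕ} (a : Fin m → ℝ) (b : Fin n → ℝ) (j : Fin m) :
    Fin.append a b (Fin.castAdd n j) = a j :=
  Fin.append_left a b j

example {m n : ℕ} (a : Fin m → ℝ) (b : Fin n → ℝ) (j : Fin n) :
    Fin.append a b (Fin.natAdd m j) = b j :=
  Fin.append_right a b j
```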
Pointwise behavior of the ReLU activation on tensor-vectors.
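A sketch of the pointwise statement in the simplified model, taking the entrywise ReLU to be `max 0`:

```lean
import Mathlib

/-- Entrywise ReLU on vectors (illustrative definition). -/
def reluVec {n : ℕ} (z : Fin n → ℝ) : Fin n → ℝ := fun i => max 0 (z i)

/-- Pointwise behavior: each coordinate is the scalar ReLU of that coordinate. -/
example {n : ℕ} (z : Fin n → ℝ) (i : Fin n) : reluVec z i = max 0 (z i) := rfl
```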
Matrix-vector multiplication for a 1 × n matrix produces a single scalar coordinate.
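In matrix form this says that `mulVec` with a `1 × n` matrix has a single coordinate, the dot product of the one row with the vector; in the simplified model this holds definitionally:

```lean
import Mathlib

example {n : ℕ} (R : Matrix (Fin 1) (Fin n) ℝ) (v : Fin n → ℝ) :
    R.mulVec v 0 = ∑ j, R 0 j * v j := rfl
```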
Expand `mlp_eval_nd` into “bias + sum over hidden units” form.
This is the main normalization lemma used to prove that `appendLinearSpec` together with
`combineOutput` implements affine combinations of subnetworks.
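The simplified model makes the normalized form explicit; in the sketch below `mlpEval` is an illustrative stand-in for `mlp_eval_nd`, and the expansion holds definitionally:

```lean
import Mathlib

/-- One-hidden-layer ReLU network: output bias plus a weighted sum over hidden units. -/
def mlpEval {n : ℕ} (W : Matrix (Fin n) (Fin 2) ℝ) (b : Fin n → ℝ)
    (a : Fin n → ℝ) (c : ℝ) (x : Fin 2 → ℝ) : ℝ :=
  c + ∑ j, a j * max 0 (W.mulVec x j + b j)

/-- “Bias + sum over hidden units” form, with the pre-activation written out. -/
example {n : ℕ} (W : Matrix (Fin n) (Fin 2) ℝ) (b a : Fin n → ℝ) (c : ℝ)
    (x : Fin 2 → ℝ) :
    mlpEval W b a c x = c + ∑ j, a j * max 0 ((∑ i, W j i * x i) + b j) := rfl
```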
Selecting the left block of a linear spec appended via `appendLinearSpec`.
Selecting the right block of a linear spec appended via `appendLinearSpec`.
Appending hidden units and wiring the output with combineOutput yields an affine combination.
Concretely, the combined network computes `γ + α * net_a(x) + β * net_b(x)`.
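In the simplified model the lemma reduces to a finite-sum computation. The following sketch uses illustrative names (`aA`, `cA` for net_a's output weights and bias, `hA` for its hidden activations at `x`, similarly for net_b) and records one possible proof:

```lean
import Mathlib

/-- Gluing identity: a combined output layer over appended hidden activations
computes an affine combination of the two subnetwork outputs. -/
example {m n : ℕ} (α β γ : ℝ)
    (aA hA : Fin m → ℝ) (aB hB : Fin n → ℝ) (cA cB : ℝ) :
    (γ + α * cA + β * cB)
        + ∑ j, Fin.append (fun i => α * aA i) (fun i => β * aB i) j
            * Fin.append hA hB j
      = γ + α * (cA + ∑ i, aA i * hA i) + β * (cB + ∑ i, aB i * hB i) := by
  -- split the sum over `Fin (m + n)` into the two blocks
  rw [Fin.sum_univ_add]
  -- resolve the appended vectors on each block
  simp only [Fin.append_left, Fin.append_right, mul_assoc]
  -- distribute `α` and `β` through bias and sum on the right-hand side
  rw [mul_add, mul_add, Finset.mul_sum, Finset.mul_sum]
  ring
```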
Uniform approximation of multiplication on `[-M, M]²` by a single-hidden-layer ReLU MLP.
The construction follows the classical polarization reduction
`x * y = ((x + y)² - (x - y)²) / 4`, combined with a 1D ReLU approximator of the square function
on `[-2M, 2M]` that is lifted along the ridge directions `wPlus` and `wMinus`.
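The algebraic identity driving the reduction is easy to check with `ring`; below it is verified, followed by a hypothetical shape of the final statement (commented out, with illustrative names `box` and `netEval`):

```lean
import Mathlib

/-- Polarization-style identity behind the construction. -/
example (x y : ℝ) : x * y = ((x + y) ^ 2 - (x - y) ^ 2) / 4 := by ring

-- Hypothetical shape of the final approximation statement:
-- theorem mul_uniform_approx (M ε : ℝ) (hM : 0 < M) (hε : 0 < ε) :
--     ∃ net, ∀ x ∈ box M, |netEval net x - x 0 * x 1| ≤ ε := by
--   ...
```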