Shapes (Spec.Shape) #
Shape is the type-level “shape descriptor” for tensors in the spec layer.
TorchLean uses shape-indexed tensors:
Tensor α s
so Shape is how we encode the structure of s in a way Lean can use for both computation and
proofs.
Representation #
Shape is an inductive tree:
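A minimal sketch of the idea (the constructor names scalar and dim are the ones used by CanBroadcastTo below; the authoritative definition lives in the spec sources):

```lean
-- Sketch only, not the library's code.
inductive Shape where
  | scalar : Shape                -- rank-0 shape (a single scalar)
  | dim : Nat → Shape → Shape     -- one outer dimension wrapped around an inner shape
```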
This matches the tensor definition in NN/Spec/Core/Tensor/Core.lean.
Common utilities #
- Shape.size : Shape → Nat is the total number of scalar elements (“numel”).
- Shape.toList : Shape → List Nat is a convenient runtime view used by front-ends and bridges.
PyTorch analogy:
- Shape.toList s corresponds to tensor.shape (a tuple of dimensions).
- Shape.rank s corresponds to tensor.ndim.
- Shape.size s corresponds to tensor.numel().
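Continuing the sketch above, the utilities are plain structural recursion. These definitions are illustrative, not the library's code, but the names match the documented API:

```lean
def Shape.size : Shape → Nat
  | .scalar => 1
  | .dim n s => n * Shape.size s      -- numel: product of all dimensions

def Shape.rank : Shape → Nat
  | .scalar => 0
  | .dim _ s => 1 + Shape.rank s      -- ndim: nesting depth of `dim`

def Shape.toList : Shape → List Nat
  | .scalar => []
  | .dim n s => n :: Shape.toList s   -- outermost dimension first

-- A 2 × 3 × 4 shape, written outermost-first.
def ex : Shape := .dim 2 (.dim 3 (.dim 4 .scalar))

#eval ex.toList   -- [2, 3, 4]  (like tensor.shape)
#eval ex.rank     -- 3          (like tensor.ndim)
#eval ex.size     -- 24         (like tensor.numel())
```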
Broadcasting and axes #
Broadcasting is encoded via CanBroadcastTo / BroadcastTo.
This is an intentionally asymmetric relation ("broadcast s1 to s2"), because most tensor code
is naturally written by choosing the output shape and requiring each input to broadcast to it.
The typeclass wrapper BroadcastTo keeps higher-level specs readable: in many cases Lean can infer
the broadcast evidence automatically, so call sites don’t have to manually thread proofs around.
It also defines axis-validity helpers (valid_axis) and a well_formed predicate for “all
dimensions are positive”, which is useful when you want to rule out degenerate cases in proofs.
We represent shapes as an inductive tree instead of a bare List Nat because:
- it matches the tensor representation (Tensor α s) structurally, so many definitions are simple structural recursion,
- it keeps "scalar vs dim" cases explicit (important for proofs),
- it gives definitional equalities that are friendlier than lists in many places.
Build a shape from a list of dimensions (outermost first).
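As an illustration, such a builder is a simple fold over the list; the name ofList used here is an assumption, so check the sources for the actual declaration:

```lean
-- Hypothetical name `ofList`; semantics as documented (outermost first).
def Shape.ofList : List Nat → Shape
  | [] => .scalar
  | n :: rest => .dim n (Shape.ofList rest)

#eval (Shape.ofList [2, 3, 4]).toList   -- [2, 3, 4]
```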
Internal helper: check that a list of axis indices is duplicate-free.
Pretty-print a Shape for debugging / logs.
Swap two adjacent dimensions at a given depth (0‑based from the outermost).
Swapping adjacent dims at a given depth twice returns the original shape.
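One way such a swap could be written on the tree is sketched below; the name and argument order are assumptions, and the identity fallback is what makes the involution lemma above hold in the degenerate cases:

```lean
-- Hypothetical helper: swap the dimensions at depths `depth` and `depth + 1`,
-- returning the shape unchanged when there is nothing to swap.
def Shape.swapAdjacent : Nat → Shape → Shape
  | 0, .dim m (.dim n s) => .dim n (.dim m s)
  | d + 1, .dim m s => .dim m (Shape.swapAdjacent d s)
  | _, s => s
```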
Append a new innermost dimension.
appendDim multiplies the number of scalar elements by the appended dimension.
This lemma is the standard justification for reshape tricks where we split or merge dimensions without changing the total number of elements.
Shape-size identity used in Transformer attention reshapes.
If dModel = numHeads * headDim, then:
(seqLen × dModel) has the same size as (numHeads × seqLen × headDim).
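Stated on the sketch's Shape.size, with dModel written as numHeads * headDim, the identity is ordinary Nat arithmetic:

```lean
-- Toy statement of the attention-reshape identity, using the sketch's `Shape.size`.
example (seqLen numHeads headDim : Nat) :
    (Shape.dim seqLen (.dim (numHeads * headDim) .scalar)).size
      = (Shape.dim numHeads (.dim seqLen (.dim headDim .scalar))).size := by
  simp only [Shape.size, Nat.mul_one]
  -- goal: seqLen * (numHeads * headDim) = numHeads * (seqLen * headDim)
  rw [← Nat.mul_assoc, Nat.mul_comm seqLen numHeads, Nat.mul_assoc]
```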
Size of the outermost dimension (or 1 for scalar).
Size of the innermost dimension (or 1 for scalar).
Convert to a list of dimensions (outermost first).
Convert to an array of dimensions (outermost first).
Boolean equality test for shapes (structural).
BEq Shape uses the explicit structural boolean test Shape.areEqual.
Default shape is scalar.
Typeclass-friendly broadcasting (BroadcastTo) #
The CanBroadcastTo relation is asymmetric (“broadcast s₁ to s₂”), matching how most
operations are written: we pick a target shape and require each operand to broadcast to it.
The BroadcastTo wrapper lets Lean search for a broadcast proof automatically, which is convenient
for higher-level specs (layers/models) where the broadcasting details are not the point.
PyTorch analogy:
- PyTorch broadcasting aligns shapes from the trailing dimensions by implicitly prepending 1s to the shorter shape.
- Our Shape is an outermost-first tree, so the corresponding operation is expand_dims: it inserts leading/outer dimensions to reach the target rank (this is the "prepend 1s" step).
- dim_1_to_n corresponds to PyTorch's "dimension 1 can expand to n" rule.
Evidence that shape s₁ can be broadcast to shape s₂ (PyTorch-style broadcasting).
- scalar_to_any (s : Shape) : scalar.CanBroadcastTo s
- dim_eq {n : ℕ} {s₁ s₂ : Shape} (tail : s₁.CanBroadcastTo s₂) : (dim n s₁).CanBroadcastTo (dim n s₂)
- dim_1_to_n {n : ℕ} {s₁ s₂ : Shape} (tail : s₁.CanBroadcastTo s₂) : (dim 1 s₁).CanBroadcastTo (dim n s₂)
- expand_dims {n : ℕ} {s₁ s₂ : Shape} (tail : s₁.CanBroadcastTo s₂) : s₁.CanBroadcastTo (dim n s₂)
Typeclass wrapper for CanBroadcastTo so broadcast proofs can be inferred.
- proof : s₁.CanBroadcastTo s₂
Scalar broadcasts to any shape (analogue of "prepend 1s and expand").
Broadcasting preserves equal leading dimensions when the tails broadcast.
Dimension 1 can broadcast to any n (PyTorch's main broadcast rule).
Prepend an outer dimension (the "expand_dims" step used to align ranks).
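Continuing the sketch, here is a toy restatement of the relation together with a concrete piece of evidence: broadcasting a 1 × 3 shape to 2 × 3 uses dim_1_to_n on the leading axis and dim_eq on the matching one.

```lean
-- Toy restatement of the documented constructors (sketch, not the library's code).
inductive Shape.CanBroadcastTo : Shape → Shape → Prop where
  | scalar_to_any (s : Shape) : Shape.CanBroadcastTo Shape.scalar s
  | dim_eq {n : Nat} {s₁ s₂ : Shape} (tail : Shape.CanBroadcastTo s₁ s₂) :
      Shape.CanBroadcastTo (Shape.dim n s₁) (Shape.dim n s₂)
  | dim_1_to_n {n : Nat} {s₁ s₂ : Shape} (tail : Shape.CanBroadcastTo s₁ s₂) :
      Shape.CanBroadcastTo (Shape.dim 1 s₁) (Shape.dim n s₂)
  | expand_dims {n : Nat} {s₁ s₂ : Shape} (tail : Shape.CanBroadcastTo s₁ s₂) :
      Shape.CanBroadcastTo s₁ (Shape.dim n s₂)

-- A (1 × 3) shape broadcasts to (2 × 3): expand the size-1 dim, keep the matching dim.
example : Shape.CanBroadcastTo (.dim 1 (.dim 3 .scalar)) (.dim 2 (.dim 3 .scalar)) :=
  .dim_1_to_n (.dim_eq (.scalar_to_any _))
```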
true iff two shapes have the same number of elements.
Friendly aliases (PyTorch-style) #
We keep the canonical names (toList, rank, size, well_formed) because they show up
throughout the spec/proof code.
For docs and examples, these aliases read more like PyTorch.
Axis utilities #
Why these exist:
- Reduction ops (reduce_sum, reduce_mean, etc.) need an axis argument.
- In executable code we want to reject invalid axes early, but in spec/proof code we want the axis validity to be available as evidence that can be carried through lemmas.
So we provide:
- valid_axis axis s : Prop as the core definition, and
- valid_axis_inst axis s as a typeclass wrapper so the common cases can be inferred.
PyTorch differences:
- PyTorch allows negative axes (e.g. dim=-1); here we use Nat axes only (0-based). A typical translation is: "last axis" = Shape.rank s - 1 (when rank s > 0).
Evidence that reducing along axis is well-defined for a shape.
This is a small helper predicate used to rule out degenerate 0-length dimensions when stating
laws about reductions.
- head {n : ℕ} {s : Shape} : reducibleAlong 0 (dim (n + 1) s)
- tail {n : ℕ} {s : Shape} {k : ℕ} : reducibleAlong k s → reducibleAlong (k + 1) (dim (n + 1) s)
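Restated on the sketch, with a small worked term (illustrative only; the library's declaration is the one documented above):

```lean
-- Toy restatement: axis 0 needs a positive outer dim; axis k+1 defers to the tail.
inductive Shape.reducibleAlong : Nat → Shape → Prop where
  | head {n : Nat} {s : Shape} : Shape.reducibleAlong 0 (Shape.dim (n + 1) s)
  | tail {n k : Nat} {s : Shape} :
      Shape.reducibleAlong k s → Shape.reducibleAlong (k + 1) (Shape.dim (n + 1) s)

-- Axis 1 is reducible for a 2 × 3 shape: step past the outer dim, then hit axis 0.
example : Shape.reducibleAlong 1 (.dim 2 (.dim 3 .scalar)) := .tail .head
```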
simp lemma: axis 0 is reducible for any positive outer dimension.
simp lemma: reducibility for inner axis lifts to the next outer axis.
valid_axis axis s means that axis is a valid reduction axis for s.
We use a Prop + typeclass wrapper (valid_axis_inst) so proofs can be synthesized by typeclass
resolution in downstream code.
Axis validity predicate for reduction ops (0-based axis in Nat).
- valid_zero {n : ℕ} {s : Shape} : valid_axis 0 (dim (n + 1) s)
- valid_succ {n : ℕ} {s : Shape} {k : ℕ} (h : valid_axis k s) : valid_axis (k + 1) (dim (n + 1) s)
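Restated on the sketch, together with the evidence that axis 1 is valid for a 2 × 3 shape:

```lean
-- Toy restatement of the two constructors above (sketch, not the library's code).
inductive Shape.valid_axis : Nat → Shape → Prop where
  | valid_zero {n : Nat} {s : Shape} : Shape.valid_axis 0 (Shape.dim (n + 1) s)
  | valid_succ {n k : Nat} {s : Shape} (h : Shape.valid_axis k s) :
      Shape.valid_axis (k + 1) (Shape.dim (n + 1) s)

-- Axis 1 is valid for a 2 × 3 shape.
example : Shape.valid_axis 1 (.dim 2 (.dim 3 .scalar)) :=
  .valid_succ .valid_zero
```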
Typeclass wrapper for valid_axis so common axis proofs can be inferred.
- proof : valid_axis axis s
Instance: axis 0 is valid for any positive outer dimension.
Instance: axis 0 is valid for a nonzero outer dimension n.
This is a convenience wrapper that turns n ≠ 0 into the n+1 form expected by valid_axis.
Instance: axis 1 is valid for a 2D shape when both outer dims are nonzero.
Instance: if k is a valid axis for s, then k+1 is a valid axis for .dim (n+1) s.
Instance: axis 0 is valid if you have a positivity proof n > 0 (converted to n ≠ 0).
Helper lemma: a positive natural is not zero.
Well-formedness (well_formed) #
well_formed s means "all dimensions are positive".
Why this matters (and why we designed it this way):
- Many definitions use Fin n indexing; if n = 0, there is no index and you end up with either vacuous truths or extra cases that obscure the intent of the lemma.
- Some common ops become awkward or partial at n = 0. For example, a mean typically divides by the number of elements, so n = 0 needs special-case semantics.
- PyTorch does allow zero-sized dimensions, and most ops define a sensible result for them. We intentionally keep that complexity out of the core spec layer because it makes proofs much more case-heavy. When we need zero-dimension tensors, we introduce them with explicit semantics instead of relying on incidental behavior.
This is a pragmatic "make the common case pleasant" choice: proofs and specs are shorter, and runtime checks can still handle edge cases separately.
well_formed s means "all dimensions of s are positive" (recursively).
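On the sketch, the predicate is the evident recursion (illustrative; the library's phrasing may differ):

```lean
-- All dimensions positive, checked recursively down the tree.
def Shape.well_formed : Shape → Prop
  | .scalar => True
  | .dim n s => 0 < n ∧ Shape.well_formed s
```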
Size positivity #
If all dimensions of a shape are positive, then the total number of scalar elements is positive.
This is a small but useful bridge lemma: many reductions are only defined for nonempty dimensions,
and WellFormed is our standard way of expressing that assumption.
If s.well_formed, then Shape.size s > 0.
If rank s > 0 and s is well-formed, then the last axis rank s - 1 is valid.
This powers many "reduce over last dimension" specs where the axis is computed as rank s - 1.
Typeclass wrapper for Shape.well_formed.
We use a typeclass (instead of passing a well_formed proof everywhere) because it mirrors how
other "side conditions" are handled in the library: call sites stay clean, and instances can be
provided locally (e.g. letI : Shape.WellFormed s := ...) when needed.
- proof : s.wellFormed
If s is well-formed and n > 0, then .dim n s is well-formed.
Convenience instance: .dim 1 s is well-formed when s is.
Convenience instance: .dim 2 s is well-formed when s is.
If a Fact (n > 0) is in scope, lift it to a Shape.WellFormed (.dim n s) instance.
validAxisLastAuto is a convenience instance for the most common reduction axis:
"reduce over the last dimension".
In PyTorch this is dim=-1 (after normalization). Here we stay in Nat, so the last axis is
rank s - 1, and we require rank s > 0 plus well-formedness so the proof is meaningful.
Convenience instance: infer valid_axis_inst (rank s - 1) s from WellFormed s and rank s > 0.
Bridge lemma: turn a valid_axis proof into a reducibleAlong proof.
Why both exist:
- valid_axis is the semantic "this axis makes sense" predicate used in public APIs.
- reducibleAlong is a structurally convenient predicate for recursion over tensor shapes (it lines up with how Tensor.dim is constructed).
This function is the adapter between the two views.
Convert a valid_axis proof into a structurally convenient reducibleAlong proof.
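On the toy restatements above, the adapter is a short induction over the valid_axis evidence; the name used here is an assumption:

```lean
-- Map valid_zero ↦ head and valid_succ ↦ tail, recursing on the evidence.
theorem Shape.valid_axis.toReducible {axis : Nat} {s : Shape}
    (h : Shape.valid_axis axis s) : Shape.reducibleAlong axis s := by
  induction h
  case valid_zero => exact .head
  case valid_succ ih => exact .tail ih
```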
padLeft n s prepends n singleton dimensions to a shape.
PyTorch analogy: unsqueeze(0) repeated n times (or equivalently viewing a tensor as having
extra leading dimensions of size 1). This is also the "prepend 1s" step you see in broadcasting.
Prepend n leading singleton dimensions (size 1) to a shape.
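A sketch of the recursion (the argument order is an assumption):

```lean
-- Prepend `n` singleton dimensions: padLeft 2 (dim 5 scalar) = dim 1 (dim 1 (dim 5 scalar)).
def Shape.padLeft : Nat → Shape → Shape
  | 0, s => s
  | n + 1, s => .dim 1 (Shape.padLeft n s)
```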