TorchLean API

NN.Examples.BugZoo.KVCache

BugZoo: KV-cache contracts #

LLM inference-engine bug reports include cache-shift bugs, RoPE/position mismatches, shape mistakes, and resource/configuration errors. TorchLean does not verify a full serving engine, paged attention allocator, or multi-GPU scheduler. The useful first step is still precise: represent the cache update as a typed tensor operation and prove the append invariant we rely on.

Reference:

This file proves the cache append boundary. A stronger future theorem should connect cached decode to full-sequence attention:

decodeWithCache prefix newToken = fullAttention (prefix ++ [newToken])

under the same mask, RoPE/position encoding, and numeric semantics.

structure NN.Examples.BugZoo.KVCache.Cache (α : Type) (seqLen headDim : ) :

A key/value cache with an explicit sequence length and head dimension.

Instances For

    View one token vector as a length-one sequence.

    Instances For

      Append one token vector to a sequence cache along the time axis.

      Instances For
        def NN.Examples.BugZoo.KVCache.appendKV {α : Type} {seqLen headDim : } (cache : Cache α seqLen headDim) (newKey newValue : Spec.Tensor α (Spec.Shape.dim headDim Spec.Shape.scalar)) :
        Cache α (seqLen + 1) headDim

        Append both key and value vectors to the KV cache.

        Instances For
          theorem NN.Examples.BugZoo.KVCache.appendKV_last_key {α : Type} {seqLen headDim : } (cache : Cache α seqLen headDim) (newKey newValue : Spec.Tensor α (Spec.Shape.dim headDim Spec.Shape.scalar)) :
          Spec.getAtSpec (appendKV cache newKey newValue).keys seqLen, = newKey

          The newly appended key is exactly the final key in the updated cache.

          theorem NN.Examples.BugZoo.KVCache.appendKV_last_value {α : Type} {seqLen headDim : } (cache : Cache α seqLen headDim) (newKey newValue : Spec.Tensor α (Spec.Shape.dim headDim Spec.Shape.scalar)) :
          Spec.getAtSpec (appendKV cache newKey newValue).values seqLen, = newValue

          The newly appended value is exactly the final value in the updated cache.