BugZoo: compiler and export semantic mismatches #
DL compiler bugs are especially dangerous because they can be silent: the optimized graph runs and returns a tensor, but its semantics no longer match the source model.
NNSmith is the clean citation for this class. It generates valid neural-network graphs, searches for inputs that avoid floating-point exceptional values, and differentially tests DL compilers. The authors report 72 new bugs across TVM, TensorRT, ONNXRuntime, and PyTorch, with 58 confirmed and 51 fixed:
- Liu et al., “NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers”, ASPLOS 2023. https://doi.org/10.1145/3575693.3575707 https://arxiv.org/abs/2207.13066
FreeFuzz gives the same warning at the framework/API level: mining real usage snippets found confirmed PyTorch/TensorFlow library bugs, including backend- and mode-specific failures:
- Wei et al., “Free Lunch for Testing: Fuzzing Deep-Learning Libraries from Open Source”, ICSE 2022. https://arxiv.org/abs/2201.06589
A 2026 PyTorch-compiler study focuses on the same kind of boundary: silent torch.compile
correctness bugs where compiled models return incorrect outputs without an exception or warning:
- Li et al., “Demystifying the Silence of Correctness Bugs in PyTorch Compiler”, 2026. https://arxiv.org/abs/2604.08720
TorchLean's answer is a semantic boundary. For the supported IR fragment, successful compilation to the executable graph should be justified by a theorem that executable evaluation agrees with the denotational source semantics. External compilers and GPU kernels still need their own conformance evidence; this file spells out the contract shape so that “we tested it once” is not confused with a semantic guarantee.
The full TorchLean compiler-correctness chapter contains the stronger IR-specific theorem. This BugZoo card stays lightweight so that importing the examples chapter does not force every heavy compiler proof to elaborate.
The semantic contract at compiler/export/backend boundaries.
sourceEval is the reference semantics, targetEval is the compiled/exported/backend semantics,
and preserves says the target agrees with the source on every input. This compact structure is not a
replacement for the IR-specific compiler proof; it is the reusable shape of the claim that NNSmith-
style and FreeFuzz-style bugs violate.
- sourceEval : Source → Input → Output
Reference/source semantics.
- targetEval : Target → Input → Output
Target/backend semantics.
- implements : Source → Target → Prop
Relation saying that
targetwas produced from or is meant to implementsource. - preserves {source : Source} {target : Target} : self.implements source target → ∀ (input : Input), self.targetEval target input = self.sourceEval source input
Semantic preservation required for an accepted implementation.
Instances For
Once a backend has the semantic-boundary evidence, every accepted implementation agrees.