

Test-Input Generation for Tensor Programs: What Actually Finds Kernel Bugs

Dipankar Sarkar

arXiv preprint arXiv:2606.27396

Abstract

This is the companion paper to the Correctness Illusion (arXiv:2606.20128): that paper presents the corpus, and this one presents the methodology. We study which kinds of test inputs actually find bugs in LLM-generated GPU kernels. Existing benchmarks (KernelBench) use uniformly sampled inputs. We show that op-schema-aware seeded fuzzing finds 4-8x more bugs in less time. We also characterize the types of inputs that catch the most bugs: boundary conditions, large inputs, and memory-layout edge cases.

Published June 2026 on arXiv (arXiv:2606.27396).

Frequently Asked Questions

What is the methodology paper about?

This is the companion methodology paper to the Correctness Illusion (arXiv:2606.20128). It studies which kinds of test inputs actually find bugs in LLM-generated GPU kernels. Existing benchmarks use uniformly sampled inputs; we show that op-schema-aware seeded fuzzing finds 4-8x more bugs in less time. We also characterize the input types that catch the most bugs: boundary conditions, large inputs, and memory-layout edge cases.
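To illustrate why a boundary-condition input can expose a bug that uniformly sampled inputs miss, here is a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `buggy_sum` is a toy stand-in for a generated GPU kernel that mishandles a partial final tile, and `TILE`, the input sizes, and the helper names are hypothetical.

```python
import random

TILE = 32  # hypothetical tile width, chosen only for illustration


def buggy_sum(xs):
    """Toy stand-in for a generated kernel: sums full tiles, drops the tail."""
    full = (len(xs) // TILE) * TILE
    return sum(xs[:full])


def reference_sum(xs):
    """Trusted reference implementation."""
    return sum(xs)


def finds_bug(xs):
    """True if the toy kernel disagrees with the reference on this input."""
    return abs(buggy_sum(xs) - reference_sum(xs)) > 1e-9


rng = random.Random(0)

# Uniformly sampled values at a tile-aligned size: the tail bug never fires.
uniform_input = [rng.uniform(0.0, 1.0) for _ in range(1024)]

# Boundary input: one element past a tile multiple, so a partial tile exists.
boundary_input = [1.0] * (TILE + 1)

print(finds_bug(uniform_input))   # False
print(finds_bug(boundary_input))  # True
```

In this toy setup, changing the sampled values never exposes the bug; only changing the shape to straddle a tile boundary does.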

How is this different from the Correctness Illusion paper?

The Correctness Illusion paper (arXiv:2606.20128) presents the empirical finding and the 26-op corpus. This paper (arXiv:2606.27396) presents the methodology: what makes a good test input, how to generate it, and how it compares to existing approaches. The two are companion papers, published together in June 2026.

What is op-schema-aware seeded fuzzing?

Op-schema-aware seeded fuzzing takes the operator schema (input shape, dtype, constraints) and generates seeds that exercise the schema's boundaries. Unlike random fuzzing, the seeds are deterministic and reproducible. Unlike uniform sampling, they target the bugs a test is most likely to miss. We compare four input-generation strategies across 26 ops and show that op-schema-aware seeded fuzzing finds 4-8x more bugs in less time.
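The idea described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the paper's implementation: the `OpSchema` fields, the `tile` parameter, the choice of boundary extents, and the special fill values are all hypothetical. It shows only the two properties the answer names: cases are derived from the schema's boundaries, and a fixed seed makes them reproducible.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class OpSchema:
    """Hypothetical operator schema; field names are illustrative."""
    name: str
    rank: int      # number of input dimensions
    dtype: str     # "float16" or "float32"
    max_dim: int   # largest legal extent per dimension
    tile: int      # tile size the kernel is assumed to use


# Largest finite IEEE 754 values for each supported dtype.
DTYPE_MAX = {"float16": 65504.0, "float32": 3.4028234663852886e38}


def boundary_dims(schema):
    """Extents around the tile boundary plus the schema's extremes."""
    cands = {1, schema.tile - 1, schema.tile, schema.tile + 1, schema.max_dim}
    return sorted(d for d in cands if 1 <= d <= schema.max_dim)


def generate_cases(schema, seed, limit=8):
    """Deterministically pick boundary shapes and special fill values."""
    rng = random.Random(seed)  # fixed seed gives reproducible cases
    shapes = list(itertools.product(boundary_dims(schema), repeat=schema.rank))
    rng.shuffle(shapes)
    top = DTYPE_MAX[schema.dtype]
    special = [0.0, 1.0, -1.0, top, -top]
    return [{"shape": s, "fill": rng.choice(special)} for s in shapes[:limit]]


schema = OpSchema(name="softmax", rank=2, dtype="float16", max_dim=4096, tile=32)
for case in generate_cases(schema, seed=0):
    print(case)
```

Calling `generate_cases` twice with the same schema and seed returns identical cases, so a failing input can be replayed exactly. A real harness would materialize each case as a tensor and compare the kernel's output against a reference.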
