From Finding to Fix: Submitting Security Patches to Open Source Projects

Posted on Wed 13 May 2026 in Thought, vulnerability research and discovery

Finding a bug is the first half. Getting the fix shipped is a different skill set entirely, and almost nobody writes about it.

Most security research ends at the proof of concept. You found the thing, you have a crash, maybe a writeup. What happens next is either a CVE number or silence. The part between "I found this" and "this is merged and users are protected" is mostly undocumented. That gap is where this post lives.


The Gap Nobody Talks About

Trail of Bits put it plainly in a recent post about their OSS security work:

"Bug finding is easy, but triaging, disclosing, and fixing them takes disproportionate time and effort. Each finding still needs a human to confirm the bug, a static or dynamic check to reproduce it, a working proof-of-concept, and a minimal patch. That work is heavy, and right now it falls on the maintainer."

They were describing the view from the maintainer side. This is the view from the researcher side. Their benchmark for submissions: every report includes a PoC, a fix patch, and a regression test. Anything that does not clear that bar does not get sent.

That bar sounds reasonable. It is not trivial to clear. The two QuickJS PRs I submitted after finding bugs using the audit pipeline both cleared it eventually. Neither cleared it on the first try.


What the QuickJS PRs Actually Looked Like

Two bugs found, two PRs. Neither was straightforward.

PR #1468 - the sign truncation

The entry point was a one-word change: max_int to max_uint32 in expand_fast_array. That part was clear from the analysis. Everything downstream was not.

The first CI run failed. UBSan caught a signed integer overflow in js_array_mark, a function that iterates over p->u.array.count (a uint32_t) using int i as the loop variable. The bug I reported exposed a path where count could reach 0x80000000, making i++ overflow a signed int. The same fix applied to js_array_finalizer, which had the same loop pattern. Two more loop-variable changes on top of the original.

That was not all. expand_fast_array needed an explicit guard rejecting new_len > INT32_MAX before the allocation attempt, because on systems with memory overcommit the 16GB virtual allocation could succeed silently. Without the guard the fix was incomplete for those environments.

The piece that completed it was the fast path fallback in js_array_push. The original condition only checked for unsigned wraparound. It needed a second guard:

if (likely(new_len >= array_len && new_len <= (uint32_t)INT32_MAX)) { /* no overflow and within fast-array bounds */

When new_len hits 0x80000000, the second condition fails, the fast path skips, and execution falls through to the slow path. Without that change the push would bypass the guard entirely and complete silently via the slow array path, no crash, no error, test failing with "expected throw was not thrown." Adding js_uint32(new_len) in place of js_int32(new_len) closed the UBSan hit on the cast. Four places in quickjs.c total, each connected, each necessary.

Then there was a double brace. Adding the guard to expand_fast_array introduced if (new_size < old_size) { { with an extra opening brace. The compiler error cascade from that filled a screen. One character.

Then the test. I wrote it to expect arr.push(4) to throw. It did not throw. The slow array path in QuickJS can handle large array lengths by converting to property-based storage. On some CI environments, the allocation for the oversized fast array succeeds through overcommit. On ASan builds, it fails. Same code, different behavior depending on the allocator. The test had to be rewritten to accept both outcomes: graceful error or successful completion via slow path. The key assertion was not "must throw" but "must not crash."

PR #1467 - the UAF

The UAF fix was mechanically cleaner: one line in set_array_length, zeroing freed slots before decrementing count, matching the pattern the original PR had already applied to delete_property.

The friction here was infrastructure, and it surfaced on the OOB PR first rather than this one. A maintainer left a comment pointing out that the test file needed to be named bugXXXX.js, where XXXX is the PR number. That wasn't documented anywhere obvious. I found the naming convention through that comment, not by reading the directory.

Small things. They gate whether a PR gets merged or closed, and you learn them once per project.


The Standards You Don't Know Until You Hit Them

The fix being correct is necessary but not sufficient. Projects have standards that are invisible until you violate them.

For the QuickJS PRs, the ones that mattered:

CI gates with sanitizers. The project runs UBSan and ASan. If your fix exposes a pre-existing signed integer overflow in an adjacent function, the CI fails on your PR even though you did not introduce the issue. You own it anyway. Understanding what the CI actually runs before submitting saves a round trip.

Test file naming. bugXXXX.js, not my_poc.js or regression_test.js. One PR, one convention, not written down. Reading the existing tests directory is faster than finding out through review feedback.

Test assertions that match platform behavior. What crashes on an ASan build may succeed silently on a release build with memory overcommit. A test that asserts "must throw" fails on the release CI while passing locally. The correct assertion was "no crash," not "specific error type." The distinction matters and requires understanding the call path.

Multiple related fixes in one PR. The sign truncation PR ended up touching three things: the max_int to max_uint32 change, the int i to uint32_t i loop variable in js_array_mark, and the new_len > INT32_MAX guard in expand_fast_array. Each was necessary. Each was related. Keeping them together was the right call, but it meant understanding the causal chain before writing the commit message.

All checks have to pass, not just the ones you ran. QuickJS CI runs across a matrix of systems, compilers, and configurations. The same PR that passes locally on macOS with clang can fail on Ubuntu with gcc, or pass on a release build but fail under UBSan, or pass under ASan but fail on a platform with a different allocator. Every gate in that matrix has to clear. That is not process overhead. That is what it means for a fix to actually be correct across the real distribution of environments where the software runs.


What the Harness Helped With

The pipeline that found the bugs did real work before the PR was opened. The probes were already written. The crash was already confirmed under ASan. The fix direction was already known from the step3 adversarial analysis. The PoC was a three-line JavaScript file.

That is most of the Trail of Bits checklist done before the project ever sees the report. PoC: done. Fix patch: clear from the analysis. What remained was writing the patch in a way that compiles, passes CI, and meets project conventions. The pipeline does not know what tests.conf is. That part is manual.

The division of labor: the harness handles bug characterization, exploit primitive confirmation, and fix direction. The researcher handles project-specific conventions, CI environment differences, and the iteration loop between submission and merge. The first part is now faster than it used to be. The second part is still fully manual and takes roughly the same time regardless of how good the tooling is.


The Part That Took the Most Time

Not writing the fixes. Both fixes were one or two lines. The time went into understanding why the CI was failing in unexpected ways.

The clearest example: the test for the sign truncation bug expected arr.push(4) to throw when arr.length = 0x7FFFFFFF. On an ASan build with a limited address space, it throws. On a standard Linux build with memory overcommit, the allocation for a 16GB virtual array succeeds, the push completes, and no exception is raised. The test assertion was wrong for half the CI matrix.

Understanding that required knowing that QuickJS has a slow array path, that the slow path does not call expand_fast_array, that slow arrays can hold large indices as named properties without buffer allocation, and that memory overcommit behavior varies by environment and allocator configuration. None of that is in the bug report or the fix patch. All of it affects whether the test passes.

The Trail of Bits framing is right that each finding "needs a human to confirm the bug, a static or dynamic check to reproduce it, a working proof-of-concept, and a minimal patch." What it does not mention is that writing a test that correctly captures the expected behavior across all CI environments is harder than writing the patch. That part is the bottleneck.


What I Would Do Differently

Read the existing tests before writing a new one. The naming convention, the structure, what is expected. A few minutes of reading saves a round trip through review.

Check the CI matrix before assuming local behavior is representative. If the project runs ASan and UBSan, build locally with those flags before submitting. If the CI runs on Ubuntu and macOS, understand whether allocator behavior differs. It does.

Submit earlier with less polish. Both PRs went through more revision than necessary because I tried to get the test right before opening the PR. Opening early and letting CI find the environment differences would have been faster.

Write the regression test to assert the security property, not the symptom. "No crash" is the right assertion for a memory safety fix. "Throws InternalError" is a symptom that depends on implementation details. Security properties are stable. Symptoms are not.


Trail of Bits says frontier models will surface bugs at machine speed and the bottleneck is now triage, verification, and patching. That bottleneck is exactly where this post lives. The harness handles finding. This is what filling the rest of the bar actually looks like. The methodology for finding is replicable. The methodology for contributing is learnable. The two together are what closes the loop.


Who Else This Applies To

The OSS context is where this played out, but the problem is not specific to open source.

Security teams building defensive harnesses internally for development teams hit the same wall. A harness that produces findings faster than the team can verify and route them adds to the queue instead of shrinking it. A finding without a PoC, a patch, and a regression test is just a report. Internal teams get hundreds of those and act on a fraction. The harness has to produce artifacts that are actionable, not just accurate.

Bug bounty programs have the same shape. The programs that work well have figured out that the bottleneck is not finding volume. It is the cost of taking a valid finding from report to shipped fix. A submission that arrives with a working PoC, a suggested fix, and a test that demonstrates the fix holds reduces that cost significantly. It also changes the relationship. You are not handing a problem to someone. You are handing a solution that needs review.

The gap between finding and shipping is the same gap regardless of whether the recipient is an OSS maintainer, an internal security team, or a vendor's bug bounty triage queue. The skills that got these two QuickJS PRs merged (understanding the target's CI environment, writing assertions that hold across platforms, knowing how to scope a fix without introducing new surface) are the same skills that make any finding actionable in any context.

The harness accelerates the front end. Everything in this post is the back end. Both matter.