Debugging a phantom checkout bug with session replay
A real war story: an intermittent Stripe failure that only happened on iOS Safari with autofill. Replay closed the loop in 20 minutes.
It's 4:00 PM on a Tuesday when the "Incomplete Checkout" alerts start spiking in Slack. The logs show a handful of 402 Request Failed errors from Stripe, but the stack traces are pristine—no code is crashing, yet users aren't finishing their purchases. You try to reproduce it on your Mac. Nothing. Your colleague tries on Android. Works fine. The QA lead pulls out a physical iPhone and runs the whole funnel. Flawless. This is the phantom bug: a silent killer of conversion that traditional telemetry is fundamentally blind to.
This is a war story about one such bug—an intermittent Stripe failure that only happened on iOS Safari with autofill—and how session replay turned a ticket that had been "can't reproduce" for three days into a twenty-minute fix. The lesson underneath it is simple: when the failure is in what the user did, not in what your code threw, you need to see the user.
The "works on my machine" wall
Every ecommerce team hits this wall. The error tracker is doing its job perfectly—it's just answering the wrong question.
The limits of the stack trace
A 200 OK tells you the request completed. A generic 400 tells you Stripe rejected something. Neither tells you why the payload that reached Stripe was malformed in the first place. Our handler looked airtight:
async function handlePayment(formState) {
// Looks safe — every field is "validated" before we get here.
const { error } = await stripe.confirmPayment({
elements,
confirmParams: {
payment_method_data: {
billing_details: {
name: formState.name,
email: formState.email,
address: { postal_code: formState.zip },
},
},
},
});
if (error) reportToTracker(error); // fires the 402, tells us nothing
}The validation passed. The submit fired. Stripe said no. The stack trace points at the line that called Stripe—which is exactly where you'd expect a payment to be, and therefore useless.
The cost of "can't reproduce"
A ticket stuck in "can't reproduce" isn't free. It burns developer hours in speculative debugging, and every day it lingers it's quietly costing real revenue—each phantom failure is a customer who tried to give you money and couldn't. The reproducibility crisis is the single most expensive phase of the modern dev cycle, and stack traces don't end it.
Anatomy of a phantom: the iOS Safari autofill glitch
The breakthrough started with a pattern in the metadata: every failing session was iOS Safari. Never Chrome, never Android, never desktop Safari. That narrowed it to something iOS-specific in the input layer.
The QuickType factor
iOS Safari's autofill—surfaced through the QuickType suggestion bar above the keyboard—does not fire input events the way manual typing does. When a user taps a saved address suggestion, the browser populates the field's visible value, but the sequence and timing of the events it dispatches differs from a keystroke-by-keystroke entry. React and Vue state, which hangs off those events, can end up out of sync with what the user plainly sees on screen.
Why logs only showed the symptom
The server-side log showed a Stripe request with a missing billing_details.address.postal_code—null, despite the field visibly containing a ZIP code:
{
"event": "stripe_confirm_payment",
"outcome": "402",
"billing_details": {
"name": "Jordan Lee",
"email": "jordan@example.com",
"address": { "postal_code": null } // <- the field LOOKED filled
}
}From the log's perspective the app behaved correctly: it sent what was in state. The bug was that state never received the autofilled value. No log line could capture the gap between "what the screen showed" and "what the JavaScript knew."
Seeing is believing: the replay breakthrough
We opened the GlitchReplay dashboard, filtered to the failing 402 events, and clicked through to the linked session.
The 20-minute timeline
Because the Stripe error carried the same event_id the SDK had stamped on the session, the jump from "error in the dashboard" to "watching the exact session" was a single click. Within twenty minutes of opening the first replay, the cause was obvious—not inferred, observed.
Correlating the error to the video
On screen: the user tapped the address field, the QuickType bar offered their saved address, they tapped it. The ZIP field visibly filled with 94107. Then—and this was the tell—the "Pay" button became active immediately, before the framework had committed the autofilled value to state. The user tapped Pay. The handler read an empty zip. Stripe got a null. 402.
That's the entire bug, and no amount of log archaeology would have surfaced it. We'd been looking for a code defect; the defect was in the timing between the browser's autofill and our state update. This is the core argument for replay over logs, covered more broadly in session replay vs. logging.
The technical root cause: race conditions in autofill
With the behavior identified, the mechanism was easy to confirm.
onChange vs. onInput vs. onBlur
Different browsers fire different events during autofill, and iOS Safari is the outlier. A manual keystroke reliably triggers input, which React listens to. An autofill tap may not fire input in the same way or at the same time, so the synthetic onChange React expects never runs—or runs after the user has already moved on. The field's DOM value is set; React's state is not.
The ghost value problem
This is the heart of it: the browser's rendered UI shows text in a field while the underlying JavaScript state for that field is still empty. The user sees a complete form. The app sees a half-empty one. The fix was to stop trusting state alone at submit time and read the actual DOM value as a fallback:
// Buggy: trusts framework state, which autofill may not have updated
const zip = formState.zip;
// Robust: read the real input value at submit, catching ghost values
const zipEl = formRef.current.querySelector('input[name="zip"]');
const zip = formState.zip || zipEl?.value || "";
// Better still: listen for the autofill via animationstart on
// the :-webkit-autofill pseudo-class, and sync state when it fires.Privacy without sacrifice: masking checkout data
The obvious objection to recording checkout sessions is privacy—and it's a good objection. You absolutely cannot record credit-card numbers.
Capturing behavior, not secrets
The point of replay here is the behavior: the tap on the suggestion, the premature button activation, the order of events. None of that requires the actual card number or the real name. DOM-level masking captures the interaction while redacting the contents. We saw "a field was filled then Pay was tapped" without ever seeing what was typed.
The mask-all approach
GlitchReplay defaults to masking text and inputs, which keeps you on the right side of GDPR and PCI while still showing layout, gestures, and timing:
Sentry.init({
dsn: process.env.SENTRY_DSN,
integrations: [
Sentry.replayIntegration({
maskAllText: true,
maskAllInputs: true, // card numbers, ZIPs, names — all masked
}),
],
replaysSessionSampleRate: 1.0,
});More on the strategy in masking PII in session replay and the edge scrubbing tool.
Why 100% capture matters for intermittent bugs
Notice the replaysSessionSampleRate: 1.0 above. It's the reason this story has a happy ending.
The 1% sampling trap
To control costs on per-event platforms, many teams sample replay down to 1% of sessions. Do the math on a phantom bug: if you only record 1 in 100 sessions, you have a 99% chance of missing any given failure. An intermittent, browser-specific bug that hits a fraction of iOS users would almost never land in your sample. You'd have the alerts, the 402s, and zero replays to explain them—back to square one.
Flat-rate vs. per-event
Full capture is only sane when it's affordable. Flat-rate pricing is what lets you keep the cameras rolling on the entire checkout funnel without watching a meter—the economics are spelled out in flat-rate vs. per-event pricing. The ROI here is stark: a month of flat-rate replay costs less than the revenue lost to a few dozen failed checkouts, and this one bug had been failing dozens of carts a day for three days before we caught it.
Lessons learned: a more resilient checkout
Two takeaways outlast this particular bug. First, defensive state management: never trust framework state alone for payment-critical fields—read and validate the actual DOM value at submit time, and treat autofill as an event you must explicitly handle. Second, close the feedback loop: session replay should be the first tool you reach for on a "can't reproduce" ticket, not the last resort after three days of guessing.
The phantom checkout bug isn't rare—it's the default experience of debugging anything that depends on a specific browser, a specific input method, or a specific user's state. Stop trying to reproduce it on your machine. With Sentry-compatible, flat-rate, full-capture replay from GlitchReplay, you stop guessing why checkouts fail and just watch the exact tap that broke them. Set up session replay and the next phantom bug becomes a twenty-minute fix instead of a three-day mystery.
GlitchReplay is Sentry-SDK compatible, includes session replay and security signals, and never charges per event. Free to start, five minutes to first event.