Session replay vs traditional logging: when each one wins

Replay is amazing for UI bugs and useless for backend race conditions. A decision matrix for picking the right tool per incident class.

June 6, 2026

replay, observability

A user reports that the checkout button "just disappeared." Your ELK stack shows 200 OK across every microservice. Your structured logs show a perfectly valid session state. You spend four hours sprinkling console.log into a staging environment you can't make misbehave—only to discover, weeks later via a session replay, that a specific Chrome extension was injecting CSS that hid the button. No backend touched it. No log could have told you.

This is the Heisenbug, and it's the perfect illustration of why the "logs vs. replay" debate is the wrong frame. They're not competitors; they answer different questions. Logging is the king of distributed-system causality. Session replay is the only practical way to solve "the user said it didn't work" bugs. The skill is knowing which one to reach for, and this post is a decision matrix for exactly that.

The observability gap: events vs. experiences

Telemetry tells you what the system did. Replay shows you what the user saw. That's the whole distinction, and almost every wasted debugging hour comes from trying to answer one with the other.

Why structured logs fail at the glass

Logs live on the server, or at best capture discrete client events you remembered to instrument. They are blind to the rendered pixels. A log line reading { "event": "checkout_button_rendered", "visible": true } can be completely true while the button sits behind a modal overlay, off the bottom of a mobile viewport, or hidden by a z-index war with a third-party widget. The DOM said it rendered; the user's eyes said it wasn't there. Only one of those is the bug report.
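To make the gap concrete, here is a minimal sketch of the checks a "rendered" log line skips. It works on plain rectangles so it runs anywhere; in a browser the boxes would come from getBoundingClientRect(), and document.elementFromPoint() could stand in for the covering-box list. The names PixelBox, HiddenReason, and whyNotVisible are made up for this illustration.

```typescript
// Plain rectangle in CSS pixels (hypothetical shape for this sketch).
interface PixelBox { left: number; top: number; width: number; height: number; }

type HiddenReason = "zero-size" | "outside-viewport" | "covered";

// Returns the reasons a "rendered" element may still be invisible to the user.
// `coveringBoxes` is an assumed list of elements stacked above the target.
function whyNotVisible(
  target: PixelBox,
  viewport: { width: number; height: number },
  coveringBoxes: PixelBox[],
): HiddenReason[] {
  const reasons: HiddenReason[] = [];
  if (target.width <= 0 || target.height <= 0) reasons.push("zero-size");

  const right = target.left + target.width;
  const bottom = target.top + target.height;
  const outside =
    right <= 0 || bottom <= 0 ||
    target.left >= viewport.width || target.top >= viewport.height;
  if (outside) reasons.push("outside-viewport");

  // Treat the target as covered when a higher-stacked box contains its center point.
  const cx = target.left + target.width / 2;
  const cy = target.top + target.height / 2;
  const covered = coveringBoxes.some(
    (b) => cx >= b.left && cx < b.left + b.width && cy >= b.top && cy < b.top + b.height,
  );
  if (covered) reasons.push("covered");
  return reasons;
}
```

A button that sits below the fold of a 390x844 phone viewport comes back as "outside-viewport", even though a checkout_button_rendered event would still report visible: true.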

The technical overhead

People assume replay must be heavy because it sounds like video. It isn't video. Tools like rrweb serialize the initial DOM once, then stream a compact log of mutations—the diffs—thereafter. A click that a log captures as a one-line JSON entry is, in replay terms, a small mutation record plus a pointer event. The mental model worth holding: logs are cheap strings about events; replay is a cheap stream of DOM changes that reconstructs the experience.
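The snapshot-plus-diffs model can be sketched in a few lines. The event shapes below are heavily simplified and hypothetical; real recorders such as rrweb use richer formats, so treat this as an illustration of the idea rather than of any library's wire format.

```typescript
// Hypothetical, simplified event shapes for illustration only.
type SnapshotNode = { id: number; tag: string; attrs: Record<string, string> };
type ReplayEvent =
  | { kind: "snapshot"; at: number; nodes: SnapshotNode[] }
  | { kind: "attr"; at: number; id: number; attr: string; value: string }
  | { kind: "remove"; at: number; id: number };

// Rebuild the DOM state as of time `until` by applying diffs on top of the snapshot.
// Events are assumed to arrive in timestamp order.
function stateAt(events: ReplayEvent[], until: number): Record<number, SnapshotNode> {
  let dom: Record<number, SnapshotNode> = {};
  for (const ev of events) {
    if (ev.at > until) break;
    if (ev.kind === "snapshot") {
      dom = {};
      for (const n of ev.nodes) {
        dom[n.id] = { id: n.id, tag: n.tag, attrs: { ...n.attrs } };
      }
    } else if (ev.kind === "attr") {
      const node = dom[ev.id];
      if (node) node.attrs[ev.attr] = ev.value;
    } else {
      delete dom[ev.id];
    }
  }
  return dom;
}
```

Scrubbing a replay to a given timestamp is then just a call to stateAt with a different `until`, which is why the stream stays small: only the first event carries the whole page.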

When session replay wins: frontend-first incidents

Replay dominates for bugs that are context-heavy and visually dependent—the ones where the failure is in what rendered, not in what computed.

The "I can't reproduce this" bug

State-specific UI failures are replay's home turf. The bug only appears when a user has a particular item in localStorage, a specific viewport, a certain sequence of clicks, or a stale cached asset. You can't reproduce it because you don't have their state. Replay is their state—you watch the exact session that broke.

CSS and layout regressions

Logging "Button Clicked" is worthless if the button was off-screen, transparent, or covered. Layout shifts, broken flexbox, a devicePixelRatio quirk on a Retina display: none of these emit a log line, because from the code's perspective nothing went wrong. A war story of exactly this kind is in "Debugging a phantom checkout bug with session replay."

User error vs. system error

Did the user double-click and submit twice, or did the API lag and they retried? The log shows two requests either way. The replay shows the cursor, the timing, the spinner—and tells you instantly whether to fix your debounce logic or your backend latency. A quick triage heuristic: if the bug involves z-index, localStorage, autofill, or anything pixel-dependent, open the replay first.
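Once click timing from the replay sits next to response timing from the logs, the call can be made mechanically. The sketch below is one possible heuristic, not a standard: the 500 ms double-click window and the function name are assumptions to tune for your own UI.

```typescript
type SubmitCause = "double-click" | "retry-after-slow-response" | "unclear";

// Hypothetical inputs: timestamps in milliseconds, joined from a replay and the request log.
function classifyDuplicateSubmit(
  clickTimes: number[],
  firstResponseAt: number | null, // null if the first request never returned
  doubleClickWindowMs: number = 500, // assumed threshold
): SubmitCause {
  if (clickTimes.length < 2) return "unclear";
  const first = clickTimes[0];
  const second = clickTimes[1];
  if (second - first <= doubleClickWindowMs) return "double-click";
  // The second click landed while the first request was still pending:
  // the user most likely gave up waiting and tried again.
  if (firstResponseAt === null || second < firstResponseAt) {
    return "retry-after-slow-response";
  }
  return "unclear";
}
```

The first answer points at debounce logic, the second at backend latency, and "unclear" means it is time to actually watch the session.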

When traditional logging wins: backend-heavy incidents

It would be dishonest to oversell replay. For a large class of problems it's useless, and reaching for it wastes time.

Race conditions in distributed systems

When two services interleave writes, or a message is processed out of order, the bug lives entirely in the backend timeline. There is nothing to watch on the user's screen, because the symptom (wrong data) appears long after the cause. Structured logs and traces, tied together with correlation IDs and timestamps, are what reconstruct that sequence.
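As a rough illustration of what that reconstruction looks like, the sketch below scans structured log lines for the classic lost-update pattern: request A reads a key, request B writes it, then A writes from its stale read. The log shape and field names are assumptions, and it trusts the timestamps, which real systems with clock skew cannot always do.

```typescript
// Hypothetical structured log line; field names are assumptions for this sketch.
interface StoreLogLine {
  ts: number;
  correlationId: string;
  op: "read" | "write";
  key: string;
}

// Returns the correlation ids whose write was based on a read that another
// request had already overwritten.
function findLostUpdates(lines: StoreLogLine[]): string[] {
  const sorted = lines.slice().sort((a, b) => a.ts - b.ts);
  const lastRead: Record<string, number> = {}; // "<correlationId>|<key>" -> read timestamp
  const lastWrite: Record<string, { ts: number; by: string }> = {}; // key -> latest write
  const suspects: string[] = [];

  for (const line of sorted) {
    const readKey = line.correlationId + "|" + line.key;
    if (line.op === "read") {
      lastRead[readKey] = line.ts;
      continue;
    }
    const readAt = lastRead[readKey];
    const prev = lastWrite[line.key];
    if (readAt !== undefined && prev && prev.by !== line.correlationId && prev.ts > readAt) {
      suspects.push(line.correlationId);
    }
    lastWrite[line.key] = { ts: line.ts, by: line.correlationId };
  }
  return suspects;
}
```

Nothing in this analysis has a visual counterpart, which is the point: the evidence is the ordering of backend events, and only the logs hold it.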

Database deadlocks and cache invalidation

A deadlock, a stale cache entry, a connection-pool exhaustion—these are server-side state problems. The frontend is an innocent bystander rendering whatever it was handed. Logs and traces are where the answer lives.

High-cardinality analysis

You cannot watch a million replays to find a p99 latency spike. Aggregate questions—"which endpoint regressed," "what's the error rate by region"—are fundamentally statistical, and replay is a per-session microscope, not a telescope. A single OpenTelemetry trace showing a 500 buried three services deep is invisible to any frontend replay, because the frontend just saw a generic error toast.
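For a sense of why this is a statistics problem, here is a small sketch that computes p99 latency per endpoint from request samples using the nearest-rank method. The sample shape is an assumption; in practice a metrics backend does this for you.

```typescript
// Hypothetical sample shape for this sketch.
interface RequestSample { endpoint: string; latencyMs: number; }

// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no samples");
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

function p99ByEndpoint(samples: RequestSample[]): Record<string, number> {
  const grouped: Record<string, number[]> = {};
  for (const s of samples) {
    (grouped[s.endpoint] = grouped[s.endpoint] || []).push(s.latencyMs);
  }
  const result: Record<string, number> = {};
  for (const endpoint of Object.keys(grouped)) {
    result[endpoint] = percentile(grouped[endpoint], 99);
  }
  return result;
}
```

The output tells you which endpoint to look at; only then does it make sense to open a handful of replays from the slow tail.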

The decision matrix

Here's the quick-reference grid, organized by the nature of the bug:

UI / UX (button hidden, layout broken, autofill glitch, "works on my machine"): Session replay. The failure is visual and state-dependent.
Network (timeouts, retries, CORS, a failing third-party call): Both. Replay shows the user impact and timing; logs show the request/response detail.
Logic / backend (race conditions, deadlocks, wrong calculation, data corruption): Logging and tracing. Nothing useful to watch.
Security (credential stuffing, scanner probes, injection attempts): Logging. High-volume, aggregate, server-side.
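
If you want the grid in your alert routing or runbook tooling, it fits in a lookup table. This is only an encoding of the four rows above; the category keys and the toolsFor name are shorthand invented for the sketch.

```typescript
type IncidentClass = "ui" | "network" | "backend-logic" | "security";
type DebugTool = "session replay" | "logging" | "tracing";

// One entry per row of the grid; the first tool listed is the one to open first.
const FIRST_TOOLS: Record<IncidentClass, DebugTool[]> = {
  "ui": ["session replay"],
  "network": ["session replay", "logging"],
  "backend-logic": ["logging", "tracing"],
  "security": ["logging"],
};

function toolsFor(incident: IncidentClass): DebugTool[] {
  return FIRST_TOOLS[incident];
}
```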

And the cost reality that shapes all of this: logging everything verbosely gets expensive fast, and replaying everything is impossible to review by hand. You want full-fidelity logs for aggregate causality and full-capture replay you can jump into when an error points you there—which is the entire reason the two need to be linked.

Integration: the unified trace

The real unlock isn't choosing one tool—it's connecting them so a log line becomes a one-click jump into the moment of impact. The bridge is a shared identifier: attach the replay_id to your backend log metadata, and a 500 in your logs becomes a hyperlink to the exact second the user experienced it.

import * as Sentry from "@sentry/browser";

// Read the active replay id (undefined if no replay is recording) and stamp it on the outbound request
const replayId = Sentry.getReplay()?.getReplayId();

await fetch("/api/checkout", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "X-Replay-Id": replayId ?? "none",
  },
  body: JSON.stringify(payload),
});

// On the server, log it alongside the error:
// logger.error("checkout failed", { replayId: req.headers["x-replay-id"] });

Now your on-call engineer reads a backend error, copies the replay_id, and watches what the user did in the seconds before the failure. That round trip—from aggregate log to individual experience—is what collapses mean-time-to-resolution. Because GlitchReplay is compatible with the Sentry SDK, this wiring uses the instrumentation you already have. The replay docs cover the linking in detail.
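
The server half of that bridge can be sketched as a small helper that turns the incoming header into log context. Everything named here is an assumption: REPLAY_BASE_URL is a placeholder (use whatever URL scheme your replay tool documents), and the character check is just a conservative guard so an arbitrary header value never ends up in a link.

```typescript
// Headers as a lowercase-keyed plain object, the shape Node's http module exposes.
type IncomingHeaders = Record<string, string | string[] | undefined>;

// Placeholder base URL, not a real endpoint.
const REPLAY_BASE_URL = "https://replay.example.com/replays/";

function replayContext(headers: IncomingHeaders): { replayId?: string; replayUrl?: string } {
  const raw = headers["x-replay-id"];
  const id = Array.isArray(raw) ? raw[0] : raw;
  // The client sends "none" when no replay is recording; also reject unexpected characters.
  if (!id || id === "none" || !/^[A-Za-z0-9-]{1,64}$/.test(id)) return {};
  return { replayId: id, replayUrl: REPLAY_BASE_URL + id };
}

// Usage inside a request handler (logger is whatever you already use):
// logger.error("checkout failed", replayContext(req.headers));
```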

The privacy and compliance hurdle

Recording real user sessions raises an obvious question: what about the credit-card field, the SSN, the email address? The answer is masking, and it works differently across the two tools.

DOM masking vs. RegEx masking

Logs are scrubbed with pattern matching—RegEx that redacts anything shaped like a card number or token before the string is written. Replays are scrubbed at the DOM level: sensitive inputs are masked so the recording captures the behavior (a field was filled, the user tabbed to the next one) without ever capturing the contents. GlitchReplay defaults to masking all text inputs, and you can tune it per element.
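
On the logging side, a pattern-based scrubber might look like the sketch below. It redacts 13 to 19 digit runs that pass the Luhn checksum, which cuts down false positives on order ids and timestamps. The function names and the [REDACTED_CARD] placeholder are illustrative, and a production scrubber would cover more than card numbers.

```typescript
// Luhn checksum over a string of ASCII digits.
function passesLuhn(digits: string): boolean {
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48;
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

// Redact card-shaped digit runs (optionally separated by spaces or dashes)
// before the message is written to the log.
function scrubCardNumbers(message: string): string {
  return message.replace(/\b\d(?:[ -]?\d){12,18}\b/g, (match) => {
    const digits = match.replace(/[ -]/g, "");
    return passesLuhn(digits) ? "[REDACTED_CARD]" : match;
  });
}
```

Note the asymmetry with replay masking: this runs on strings after the value already exists in your process, whereas DOM masking keeps the value from being recorded at all.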

import * as Sentry from "@sentry/browser";

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  integrations: [
    Sentry.replayIntegration({
      maskAllText: true,     // text content is masked by default
      maskAllInputs: true,   // every <input> value is hidden
      blockAllMedia: true,   // images/video not recorded
    }),
  ],
  replaysSessionSampleRate: 1.0,
});

For the deeper treatment of masking strategy, see masking PII in session replay and the edge PII scrubbing tool.

Does replay slow down Time to Interactive?

It's a fair worry. Replay relies on a MutationObserver, which has a real but small cost. A well-built recorder batches mutations and stays off the critical path; the impact on Time to Interactive is typically negligible for normal apps. If you're performance-sensitive, measure it yourself with our Core Web Vitals checker before and after enabling replay. Don't take anyone's word for it, including ours.
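
To show what "batches mutations" means in practice, here is a minimal sketch of a batcher with an injectable scheduler. It is not how any particular recorder is implemented; MutationBatcher is a made-up name, and in a browser the scheduler might wrap setTimeout or requestIdleCallback.

```typescript
// Collects records and hands them over in one batch, so the work done
// per mutation is just an array push.
class MutationBatcher<T> {
  private buffer: T[] = [];
  private scheduled = false;
  private readonly flush: (batch: T[]) => void;
  private readonly schedule: (run: () => void) => void;

  constructor(flush: (batch: T[]) => void, schedule: (run: () => void) => void) {
    this.flush = flush;
    this.schedule = schedule;
  }

  push(record: T): void {
    this.buffer.push(record);
    if (this.scheduled) return; // a flush is already pending
    this.scheduled = true;
    this.schedule(() => {
      const batch = this.buffer;
      this.buffer = [];
      this.scheduled = false;
      this.flush(batch);
    });
  }
}

// Browser wiring, shown as a comment because it needs a DOM:
// const batcher = new MutationBatcher<MutationRecord>(send, (run) => setTimeout(run, 0));
// new MutationObserver((records) => records.forEach((r) => batcher.push(r)))
//   .observe(document, { childList: true, attributes: true, subtree: true });
```

The design choice worth noticing is that serialization and network work happen in the deferred flush, not in the observer callback.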

Stop guessing, start watching (where appropriate)

The shift this post is really arguing for is recognizing that "what did the system do?" and "what did the user see?" are different questions, answered by different instruments. Logs and traces tell you what the system did and let you reason about causality across services. Replay tells you what the human in front of the screen actually experienced. Teams that link the two can cut mean-time-to-resolution substantially, because the worst part of debugging, reproducing the bug, is replaced by simply watching it.

Build the balanced stack: keep your logs for backend causality and aggregate analysis, add full-capture replay for the frontend bugs logs can never see, and link them with a shared ID so neither tool is an island. If you want that replay layer without metering it down to save money, GlitchReplay records full sessions on a flat rate and speaks the Sentry SDK you already use—so the next time a user says "the button disappeared," you stop guessing and just press play.

