Why scanner probes show up in your error tracker (and what to do)

How to classify the `/.env`, `/wp-admin`, and `/.git/config` noise, and why you should keep it instead of filtering it.


You open your error tracker on a Monday morning to find 5,000 new "errors." Your stomach drops (production outage?) until you read them: 404s for /wp-login.php, /.env, /.git/config, /phpmyadmin. You're running a Next.js app on Cloudflare. You don't even have a WordPress admin. Your first instinct is the "Ignore" button, or a regex filter to drop these forever and save your event quota. Stop. You're about to delete the most honest feedback your infrastructure ever receives.

These scanner probes feel like noise. They're actually reconnaissance data—the opening move of nearly every attack, captured and timestamped for free. This post makes the case for keeping them, shows how to classify them so they don't wreck your dashboard, and explains why the economics of flat-rate tracking flip the whole calculus.

The Anatomy of a Scanner Probe

These hits are automated requests looking for known-vulnerable files and endpoints. Understanding what they are is the first step to valuing them.

The common cold of the internet

Most probes are opportunistic: botnets spraying the entire IPv4 space, checking every host for the same short list of misconfigurations. Cloudflare and others consistently report that automated traffic makes up a large share of all web requests, so a substantial fraction of what hits your origin is non-human. A smaller, more interesting subset is targeted: someone specifically interested in your app.

The usual suspects

The patterns are remarkably stable. You'll see /.env (hoping you committed secrets), /.git/config (hoping your repo is web-served), /.aws/credentials, /phpmyadmin and /wp-login.php (PHP-era CMS exploits), /wp-content/plugins/... (known plugin CVEs), and probes for backup files like /backup.sql or .zip archives. None of these apply to a modern Next.js app, which is exactly why their appearance is informative.
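Because the list is so stable, it can be written down as a small lookup. The following is a minimal sketch, not a complete ruleset: the category names and the `classifyProbe` helper are hypothetical, and the patterns only cover the paths named above.

```typescript
// Hypothetical category names; the patterns mirror the paths listed above.
type ProbeCategory = 'secrets' | 'vcs' | 'php-cms' | 'backup';

const PROBE_RULES: Array<{ category: ProbeCategory; pattern: RegExp }> = [
  // /.env, /.env.local, /.aws/credentials
  { category: 'secrets', pattern: /^\/\.env(\.|$)|^\/\.aws\/credentials$/i },
  // anything under /.git/
  { category: 'vcs', pattern: /^\/\.git\// },
  // PHP-era admin panels and WordPress plugin paths
  { category: 'php-cms', pattern: /^\/(phpmyadmin|wp-login\.php|wp-admin|wp-content\/plugins)(\/|$)/i },
  // Assumption: your app serves no legitimate .sql or .zip downloads.
  { category: 'backup', pattern: /\.(sql|zip)$/i },
];

// Returns the first matching category, or null for ordinary traffic.
function classifyProbe(pathname: string): ProbeCategory | null {
  for (const rule of PROBE_RULES) {
    if (rule.pattern.test(pathname)) return rule.category;
  }
  return null;
}
```

A category tag like this makes it possible to trend "secrets" probes separately from "php-cms" probes instead of treating all 404s as one pile.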

Why they hit your error tracker

When a probe hits a route that doesn't exist, your framework throws a routing exception or a 404, and a Sentry-style SDK often promotes that unhandled condition to an "Error." So requests for files you never had end up as red entries in a tool you check for code bugs. That's a feature being misread as a bug.

Why You've Been Programmed to Filter

The per-event pricing trap

Traditional error trackers bill per event. Bot traffic is high-volume, so every probe is a line item on your invoice. The vendor's own documentation often recommends filtering 404s to control spend. The economic incentive is explicit: you are charged for the privilege of seeing reconnaissance against your own infrastructure, so you're nudged to stop seeing it.

The "Ignore" reflex

Under that pressure, engineers optimize for a clean dashboard over system visibility, because on a per-event plan, volume equals cost. "Mute and forget" feels like good hygiene. It's actually throwing away intel to lower a bill.

The hidden cost of ignorance

When you drop 404 data at the edge, you lose the ability to notice when the background noise changes. A sudden surge in probes for one specific path is often the first observable sign of a freshly published CVE or a campaign aimed at you. The Verizon DBIR repeatedly notes that breaches frequently begin with reconnaissance that went unwatched. You can't baseline what you delete.

Probes Are Signal, Not Noise

The reconnaissance phase

Most exploits start with a scan. These probes are the canary in the coal mine: the attacker telling you, in advance, what they're looking for. Keeping them turns your error tracker into a lightweight intrusion-detection system that costs you nothing extra to operate (on the right pricing model).

Targeted vs. opportunistic

The distinction matters, and the data reveals it. A bot looking for any .env hits you once with a generic user agent and moves on. Someone looking for your .env probes a sequence of paths specific to your stack, returns repeatedly, and may correlate with a real account. When probe patterns map onto your actual architecture rather than a generic checklist, escalate.
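One rough way to express that heuristic in code is sketched below. Everything here is an assumption for illustration: the `ProbeHit` shape, the `STACK_SPECIFIC` patterns (replace them with routes that exist only in your app), and the rule that "targeted" means at least two distinct days plus at least one stack-specific path.

```typescript
interface ProbeHit {
  sourceIp: string;
  path: string;
  timestamp: number; // epoch milliseconds
}

interface SourceSummary {
  sourceIp: string;
  distinctDays: number;
  stackSpecificHits: number;
  targeted: boolean;
}

// Hypothetical: paths a generic checklist would not know about.
const STACK_SPECIFIC: RegExp[] = [/^\/api\/billing\//, /^\/_next\/data\//];

function summarizeSources(hits: ProbeHit[]): SourceSummary[] {
  const bySource = new Map<string, { days: Set<string>; stackHits: number }>();
  for (const hit of hits) {
    const entry = bySource.get(hit.sourceIp) ?? { days: new Set<string>(), stackHits: 0 };
    // Bucket by UTC calendar day to detect sources that come back.
    entry.days.add(new Date(hit.timestamp).toISOString().slice(0, 10));
    if (STACK_SPECIFIC.some((re) => re.test(hit.path))) entry.stackHits += 1;
    bySource.set(hit.sourceIp, entry);
  }
  return Array.from(bySource.entries()).map(([sourceIp, e]) => ({
    sourceIp,
    distinctDays: e.days.size,
    stackSpecificHits: e.stackHits,
    // Assumed rule of thumb: returns on another day AND knows your routes.
    targeted: e.days.size >= 2 && e.stackHits >= 1,
  }));
}
```

The thresholds are deliberately crude. The point is that the decision needs history per source, which only exists if the probe events were kept.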

Correlation with session replay

The highest-value move is connecting a probe to subsequent legitimate-looking activity. A scan followed by a real login attempt from the same source is a different story than a drive-by 404—it may be an account-takeover attempt warming up. Watching that journey (with PII scrubbed—see masking PII in session replay) shows you what else they tried after the probe.
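As a sketch of that correlation, assuming you can export events with a source identifier and a kind, the function below flags sources whose login attempt follows a probe within a time window. The `TrackedEvent` shape and `findProbeThenLogin` are hypothetical names, not part of any SDK.

```typescript
interface TrackedEvent {
  sourceIp: string;
  kind: 'probe' | 'login_attempt';
  timestamp: number; // epoch milliseconds
}

// Returns sources that attempted a login within windowMs after probing.
function findProbeThenLogin(events: TrackedEvent[], windowMs: number): string[] {
  const lastProbeAt = new Map<string, number>();
  const flagged = new Set<string>();
  // Sort a copy so the caller's array is left untouched.
  const ordered = events.slice().sort((a, b) => a.timestamp - b.timestamp);
  for (const e of ordered) {
    if (e.kind === 'probe') {
      lastProbeAt.set(e.sourceIp, e.timestamp);
    } else {
      const probeAt = lastProbeAt.get(e.sourceIp);
      if (probeAt !== undefined && e.timestamp - probeAt <= windowMs) {
        flagged.add(e.sourceIp);
      }
    }
  }
  return Array.from(flagged);
}
```

Source IP is a weak identity (NAT and rotating proxies both break it), so treat a match as a reason to open the replay, not as proof.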

Step 1: Classification, Not Deletion

The goal is to keep the data while keeping your dashboard sane. The tool for that is custom fingerprinting plus tagging in beforeSend.

Custom fingerprinting

Group all scanner probes into a single issue so they don't flood your issue list as thousands of separate "errors." One collapsible "Scanner Probes" entry stays searchable and trend-able without burying real bugs.

Tagging the traffic

Add a security_signal: true tag (and drop the level to info) so these events route to a security view instead of paging the on-call engineer. The cleanest place to do both is the SDK's beforeSend hook, which every event passes through before it is sent:

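// instrumentation / SDK config: classify, don't delete
import * as Sentry from '@sentry/nextjs'; // assumption: swap in your Sentry-compatible SDK

const SCANNER_PATHS = [/\/\.env/, /\.git\/config/, /wp-login/, /phpmyadmin/];

// event.request.url may be relative or malformed, so parsing must never throw.
function safePathname(url) {
  try {
    return new URL(url, 'http://placeholder.invalid').pathname;
  } catch {
    return url;
  }
}

Sentry.init({
  beforeSend(event) {
    const url = event.request?.url ?? '';
    if (SCANNER_PATHS.some((re) => re.test(url))) {
      event.level = 'info';
      // Groups per probed path; use ['scanner-probe'] alone for one single issue.
      event.fingerprint = ['scanner-probe', safePathname(url)];
      event.tags = { ...event.tags, security_signal: true, bot: true };
    }
    return event; // note: NOT 'return null'; we keep it.
  },
});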

The key line is the return statement: return event, not return null. You reclassify; you don't discard.

Step 2: Setting Up Low-Priority Triage

Mute vs. delete

Use your tracker's mute feature, not its filter. Muting keeps the events ingested and searchable—so the data is there when you need to investigate—while suppressing the noise from your active alert stream. Filtering throws the data away at the edge, permanently.

The security digest approach

Review scanner trends on a weekly cadence rather than event-by-event. A digest answers the questions that matter: which paths are trending, from where, and is anything new. That's a five-minute Friday habit, not a 3 AM page.
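A digest like that can be a very small aggregation. The sketch below assumes you can export probe events as records with a path and a country; `ProbeRecord`, `DigestRow`, and `buildDigest` are hypothetical names for illustration.

```typescript
interface ProbeRecord {
  path: string;
  country: string;
}

interface DigestRow {
  path: string;
  count: number;
  previousCount: number;
  isNew: boolean; // path was not probed at all last week
  topCountry: string;
}

function countBy(records: ProbeRecord[], key: (r: ProbeRecord) => string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const r of records) {
    const k = key(r);
    counts.set(k, (counts.get(k) ?? 0) + 1);
  }
  return counts;
}

// Answers: which paths are trending, from where, and is anything new.
function buildDigest(thisWeek: ProbeRecord[], lastWeek: ProbeRecord[]): DigestRow[] {
  const current = countBy(thisWeek, (r) => r.path);
  const previous = countBy(lastWeek, (r) => r.path);
  const rows: DigestRow[] = [];
  current.forEach((count, path) => {
    const countries = countBy(thisWeek.filter((r) => r.path === path), (r) => r.country);
    let topCountry = '';
    let top = 0;
    countries.forEach((n, country) => {
      if (n > top) {
        top = n;
        topCountry = country;
      }
    });
    rows.push({ path, count, previousCount: previous.get(path) ?? 0, isNew: !previous.has(path), topCountry });
  });
  // Busiest paths first.
  return rows.sort((a, b) => b.count - a.count);
}
```

Rows with `isNew: true` are the ones worth reading first on Friday.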

Anomaly detection on the baseline

Once you have a steady baseline of background probing, deviations become meaningful. A spike in probes for one path can signal a newly disclosed vulnerability or a targeted campaign. You can only see that spike if you kept the baseline, which is the same logic we apply to auth spike anomaly detection.
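One simple way to define "deviation" is a z-score against recent daily counts. This is a sketch of that one technique, not a description of how any particular product detects anomalies; the `isSpike` name and the default threshold of 3 standard deviations are assumptions.

```typescript
// baseline: daily probe counts for one path over recent days.
// today: today's count for the same path.
function isSpike(baseline: number[], today: number, threshold = 3): boolean {
  // Too little history to estimate variance.
  if (baseline.length < 2) return false;
  const mean = baseline.reduce((sum, x) => sum + x, 0) / baseline.length;
  // Sample variance (divides by n - 1).
  const variance =
    baseline.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (baseline.length - 1);
  const std = Math.sqrt(variance);
  // A perfectly flat baseline: treat any increase as a deviation.
  if (std === 0) return today > mean;
  return (today - mean) / std > threshold;
}
```

For example, with a baseline of `[10, 12, 9, 11, 10, 13, 11]` a day with 40 probes is flagged, while a day with 12 is not. Without the stored baseline there is nothing to compare the 40 against.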

Leveraging Cloudflare for Context

Bot scores and cf-ray

Because GlitchReplay runs on Cloudflare, you can enrich probe events with the request's cf-ray and Cloudflare Bot Management score. Correlating your error-tracker data with the edge bot score tells you instantly whether a probe came from a known automated source or something trying harder to look human.
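As an illustration of that enrichment, the helper below turns edge context into event tags. It is a sketch under assumptions: `EdgeContext` and `edgeTags` are hypothetical names, the tag keys are invented, and how you obtain the values depends on your setup. In a Cloudflare Worker they would typically come from the `cf-ray` request header and `request.cf.botManagement.score`, the latter only populated when Bot Management is enabled on your plan. The cutoff of 30 follows Cloudflare's documented grouping of low scores as likely automated, but tune it to your own traffic.

```typescript
interface EdgeContext {
  cfRay: string | null; // value of the cf-ray request header, if present
  botScore: number | null; // Bot Management score, if available
}

// Assumed cutoff: scores below this are treated as likely automated.
const LIKELY_AUTOMATED_BELOW = 30;

function edgeTags(ctx: EdgeContext): Record<string, string> {
  const tags: Record<string, string> = {};
  if (ctx.cfRay) tags.cf_ray = ctx.cfRay;
  if (ctx.botScore !== null) {
    tags.bot_score = String(ctx.botScore);
    tags.bot_class = ctx.botScore < LIKELY_AUTOMATED_BELOW ? 'likely_automated' : 'likely_human';
  }
  return tags;
}
```

Merged into an event's tags, these values let you filter for probes that score as likely human, which are the ones trying harder.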

Geographic trends

Tagging probes with origin country reveals where campaigns cluster and whether any are slipping past your WAF rules. If probes for a sensitive path are landing despite a rule that should block them, you've found a gap. Our security headers checker and security docs help close the rest of that surface.
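The gap check can be sketched as a filter over probes that actually reached your origin. The names here are hypothetical: `OriginProbe` is an assumed record shape (the country could come from Cloudflare's `cf-ipcountry` header when IP geolocation is enabled), and `blockedPatterns` is your own restatement of what your WAF rules are supposed to stop.

```typescript
interface OriginProbe {
  path: string;
  country: string;
}

// Counts, per country, probes that reached the origin even though
// a WAF rule should have blocked their path.
function wafGaps(probes: OriginProbe[], blockedPatterns: RegExp[]): Map<string, number> {
  const gaps = new Map<string, number>();
  for (const probe of probes) {
    if (blockedPatterns.some((re) => re.test(probe.path))) {
      gaps.set(probe.country, (gaps.get(probe.country) ?? 0) + 1);
    }
  }
  return gaps;
}
```

A non-empty result means a rule is missing, misordered, or scoped too narrowly, and the country breakdown hints at where to look first.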

The GlitchReplay Advantage: No "Noise Tax"

Everything above competes with one force: the per-event bill. Flat-rate pricing removes it, and that changes the engineering culture around logging. When every 404, 401, and 403 costs the same as silence, the correct decision—log everything, keep total visibility—becomes the cheap decision too. The economic argument is laid out in full in flat-rate vs. per-event pricing.

With predictable bills, you never have to choose between security visibility and your budget. You keep the reconnaissance data, you baseline it, you watch the "hacker's" journey through your 404s with session replay, and you spot the targeted campaign hiding inside the opportunistic noise. Scanner probes aren't junk cluttering your dashboard—they're the earliest, most honest warning your infrastructure gets. Stop paying the noise tax to see your own security signals. Classify them, keep them, and let your error tracker do double duty as an IDS.

Stop watching your error bill spike.

GlitchReplay is Sentry-SDK compatible, includes session replay and security signals, and never charges per event. Free to start, five minutes to first event.