If you’ve built something on Cloudflare Workers, you already know the problem. wrangler tail works great during development. The moment you ship to production, the gap between “something went wrong” and “I know exactly what went wrong” gets very wide very fast.
This post covers why traditional logging tools are a poor fit for Workers, what you actually need, and how to set up production observability in under 10 minutes.
Why wrangler tail isn’t enough
wrangler tail streams log lines from your deployed Worker to your terminal in real time. It’s useful for spot-checking behavior while you watch. It’s not useful for:
- Incidents that happened while you weren’t watching
- Errors across multiple Workers or environments
- Pattern detection (“is this a new error or a recurring one?”)
- Volume-based alerting (“fire when error rate exceeds 1%”)
- Any query more complex than Ctrl+F in a terminal
Workers don’t have a persistent process. Each request is its own invocation. You can’t just tail a server log file the way you would with a traditional Node.js app, because there’s no server and no file.
The scaling problem with traditional logging
Standard advice says to pipe your logs to a log aggregator. Datadog, Elastic, Splunk, New Relic — they all work fine for traditional servers. Workers are different in a few ways that cause friction:
No agents. Traditional logging relies on a sidecar agent or daemon that reads logs from disk or intercepts stdout. Workers don’t have a filesystem or a persistent process to run an agent in.
Invocation-per-request model. A Worker that handles 1M requests/day produces 1M log lines/day. The per-event pricing of most enterprise tools gets expensive fast.
Global execution. Workers run at Cloudflare’s edge across 300+ PoPs. Log collectors that expect data to arrive from one region get confused.
Volume bursts. Workers can go from zero to 10,000 concurrent requests in a second if you’re behind a popular content site or have a viral moment. Your logging infrastructure needs to absorb that spike, not drop events.
The cost issue with enterprise observability tools
Datadog charges roughly $0.10 per GB ingested plus separate per-host fees plus retention charges. A Workers app that processes 100 GB/month of log data is looking at $50–$200/month in logging costs alone before any other Datadog products.
That’s often more than the entire Cloudflare Workers bill.
Smaller alternatives — Better Stack, Axiom, Logtail — are cheaper but still priced on data volume, which means every optimization in your Worker to reduce log verbosity is also an optimization to reduce your bill. That’s the wrong incentive.
Introducing ScryWatch
ScryWatch is a log monitoring platform built specifically for Cloudflare Workers. It runs entirely on Cloudflare infrastructure — Workers, D1, R2, Queues, Durable Objects — which means:
- No agents. You send logs to an API endpoint, same as any other HTTP call from your Worker.
- Priced by events, not by data size. 10M events/month on the Growth plan ($39/mo), regardless of whether those events are 100 bytes or 10 KB.
- Archive stored in R2. Long-term log storage goes to Cloudflare R2 at $0.015/GB/month — no egress fees. A year of compressed logs from most Workers apps costs a few dollars.
On top of raw log storage, ScryWatch extracts pattern intelligence: it groups similar log messages by template, tracks frequency over time, and detects regressions after deploys automatically.
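ScryWatch’s actual templating rules aren’t documented here, but the core idea is simple. A minimal sketch, assuming templates are produced by masking variable tokens such as numbers and UUIDs so that similar messages collapse into one pattern:

```typescript
// Illustrative only — a naive message-template extractor, not ScryWatch's
// actual algorithm. Variable tokens are masked so similar messages group.
const UUID_RE = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi;
const HEX_RE = /\b[0-9a-f]{16,}\b/gi;
const NUM_RE = /\b\d+(\.\d+)?\b/g;

function messageTemplate(message: string): string {
  return message
    .replace(UUID_RE, '<uuid>')   // mask UUIDs before bare numbers
    .replace(HEX_RE, '<hex>')     // long hex ids (hashes, tokens)
    .replace(NUM_RE, '<num>');    // remaining numeric literals
}
```

With this, `Payment 412 failed for user 123e4567-…` and `Payment 7 failed for user 0000…` both normalize to `Payment <num> failed for user <uuid>`, so the dashboard can count them as one pattern and track its frequency over time.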
Quick start setup
Send your first log in 5 minutes:
```sh
# Install the SDK
npm install @scrywatch/sdk
```
```typescript
import { LogClient } from '@scrywatch/sdk';

// Workers expose secrets on the `env` binding, not process.env —
// set it with: wrangler secret put SCRYWATCH_API_KEY
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const logger = new LogClient({
      apiKey: env.SCRYWATCH_API_KEY,
      service: 'my-worker',
      environment: 'production',
    });

    try {
      const result = await handleRequest(request, env);
      logger.info('Request handled', {
        path: new URL(request.url).pathname,
        status: 200,
        duration_ms: result.durationMs,
      });
      return result.response;
    } catch (err) {
      logger.error('Request failed', {
        path: new URL(request.url).pathname,
        error: String(err),
      });
      return new Response('Internal Server Error', { status: 500 });
    }
  },
};
```
No SDK required if you prefer direct HTTP:
```typescript
ctx.waitUntil(
  fetch('https://api.scrywatch.com/api/ingest', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${env.SCRYWATCH_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      events: [{
        level: 'error',
        message: 'Payment failed',
        service: 'checkout-worker',
        user_id: userId,
      }],
    }),
  })
);
```
Using ctx.waitUntil ensures the log send doesn’t block your response and completes even after the response is returned.
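If you log several events per request, one pattern worth considering is to buffer them and flush once per request with a single waitUntil call. This is a sketch, not part of any SDK; the endpoint and payload shape just mirror the direct-HTTP example above, and the injectable fetch is only there to make the helper testable:

```typescript
// Hypothetical buffering helper: collect events during a request, send once.
type LogEvent = { level: string; message: string; [key: string]: unknown };

class RequestLogger {
  private events: LogEvent[] = [];

  // fetchFn is injectable so the helper can be exercised without a network.
  constructor(
    private apiKey: string,
    private fetchFn: typeof fetch = fetch,
  ) {}

  log(level: string, message: string, fields: Record<string, unknown> = {}) {
    this.events.push({ level, message, ...fields });
  }

  // Call as ctx.waitUntil(logger.flush()) just before returning the response.
  flush(): Promise<unknown> {
    if (this.events.length === 0) return Promise.resolve();
    const body = JSON.stringify({ events: this.events });
    this.events = [];
    return this.fetchFn('https://api.scrywatch.com/api/ingest', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.apiKey}`,
        'Content-Type': 'application/json',
      },
      body,
    });
  }
}
```

Batching this way means one ingest request per Worker invocation instead of one per log line, which matters when a single request emits a handful of events.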
Architecture overview
Here’s how ScryWatch processes your logs:
Ingest Worker (edge) — Your events hit a Cloudflare Worker that validates structure, computes a deterministic event_id (so retries never create duplicates), extracts a normalized message_template for pattern grouping, and enqueues the batch.
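The deduplication trick deserves a sketch. The hash ScryWatch actually uses isn’t specified here; below, a 32-bit FNV-1a over the fields that identify an event stands in, assuming the id is derived from service, timestamp, and message:

```typescript
// Illustrative deterministic event_id — FNV-1a stands in for whatever hash
// the real ingest Worker uses. Same input fields => same id, so a client
// retrying the same batch can never create a duplicate row.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

function eventId(service: string, timestamp: number, message: string): string {
  // NUL separators prevent field-boundary collisions ("a"+"bc" vs "ab"+"c")
  return fnv1a(`${service}\u0000${timestamp}\u0000${message}`);
}
```

Because the id is a pure function of the event’s identifying fields, the database layer can use it as a primary key and treat retried inserts as no-ops.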
Cloudflare Queue — The queue absorbs burst traffic. If your Worker sees a sudden spike, events buffer in the queue and drain smoothly rather than overwhelming the database layer.
Consumer Worker — Processes queue batches asynchronously. Archives events to R2, indexes the last 24 hours in D1 for fast search, and upserts pattern aggregates.
Storage — D1 (SQLite at the edge) holds recent events for sub-second filtered queries. R2 holds the complete archive. When you query dates older than 24 hours, the dashboard automatically searches R2 — you don’t need to do anything differently.
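The routing rule in that storage step is simple enough to sketch. Assuming a 24-hour D1 window, which tier a query hits depends only on how its time range relates to the window boundary (function and type names here are illustrative):

```typescript
// Illustrative storage-tier routing: queries entirely inside the hot window
// go to D1; anything older must touch the R2 archive.
const HOT_WINDOW_MS = 24 * 60 * 60 * 1000; // D1 holds the last 24 hours

type Tier = 'd1' | 'r2' | 'both';

function queryTier(fromMs: number, toMs: number, nowMs: number): Tier {
  const hotStart = nowMs - HOT_WINDOW_MS;
  if (fromMs >= hotStart) return 'd1'; // whole range is recent
  if (toMs < hotStart) return 'r2';    // whole range is archived
  return 'both';                       // range straddles the boundary
}
```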
Dashboard — Real-time Log Explorer, pattern grouping, deploy diff, live tail, alerts, and AI-generated summaries of recent activity.
If you’re already on Cloudflare Workers and want to add real observability without spinning up a separate monitoring stack, ScryWatch is worth a look. Start free — no credit card required.