DevOps Observability

Monitoring & Observability
at the Edge

Structured logging, real-time metrics, alerting strategies, and debugging patterns for 28 Cloudflare Workers in production.

📖 13 min read · January 24, 2026

You can't fix what you can't see. With 28 Workers processing requests across 300+ edge locations, observability isn't optional: it's the difference between "we noticed a 10% revenue drop" and "we fixed the bug before users noticed."

Here's the monitoring stack keeping our edge infrastructure visible and debuggable.

28 Workers Monitored · 2.1M Requests/Month · 23ms Avg Latency · 99.97% Success Rate

Pattern 1: Structured Logging

Every log entry follows the same structure. No exceptions:

logger.ts
```typescript
// ENV is injected at build time (e.g. via Wrangler vars or an esbuild define)
declare const ENV: string;

interface LogEntry {
  timestamp: string;
  level: 'debug' | 'info' | 'warn' | 'error';
  requestId: string;
  worker: string;
  environment: string;
  message: string;
  data?: Record<string, any>;
  error?: {
    name: string;
    message: string;
    stack?: string;
  };
  duration?: number;
  cf?: {
    colo: string;
    country: string;
  };
}

class Logger {
  constructor(
    private worker: string,
    private requestId: string,
    private cf?: IncomingRequestCfProperties
  ) {}

  info(message: string, data?: Record<string, any>) {
    this.log('info', message, data);
  }

  error(message: string, error: Error, data?: Record<string, any>) {
    this.log('error', message, {
      ...data,
      error: {
        name: error.name,
        message: error.message,
        stack: error.stack
      }
    });
  }

  private log(level: LogEntry['level'], message: string, data?: any) {
    const entry: LogEntry = {
      timestamp: new Date().toISOString(),
      level,
      requestId: this.requestId,
      worker: this.worker,
      environment: ENV,
      message,
      data,
      cf: this.cf
        ? { colo: this.cf.colo, country: this.cf.country }
        : undefined
    };
    // One JSON object per line: easy to parse with `wrangler tail` or a log drain
    console.log(JSON.stringify(entry));
  }
}
```

Pattern 2: Request Tracing

One request ID flows through all services:

tracing.ts
```typescript
type Handler = (
  request: Request,
  env: Env,
  ctx: ExecutionContext
) => Promise<Response>;

export function withTracing(handler: Handler): Handler {
  return async (request, env, ctx) => {
    // Reuse an upstream request ID if present; otherwise mint one
    const requestId = request.headers.get('X-Request-ID') || crypto.randomUUID();
    const startTime = Date.now();
    const logger = new Logger('api-gateway', requestId, request.cf);

    logger.info('Request started', {
      method: request.method,
      url: request.url,
      userAgent: request.headers.get('User-Agent')
    });

    try {
      const response = await handler(request, env, ctx);

      logger.info('Request completed', {
        status: response.status,
        duration: Date.now() - startTime
      });

      // Add tracing headers to the response. Note: spreading a Response
      // ({ ...response }) does NOT copy status/statusText, so set them explicitly.
      const headers = new Headers(response.headers);
      headers.set('X-Request-ID', requestId);
      headers.set('X-Response-Time', `${Date.now() - startTime}ms`);

      return new Response(response.body, {
        status: response.status,
        statusText: response.statusText,
        headers
      });
    } catch (error) {
      logger.error('Request failed', error as Error, {
        duration: Date.now() - startTime
      });
      throw error;
    }
  };
}
```

Pattern 3: Real-Time Metrics

Push metrics to Analytics Engine or external services:

metrics.ts
```typescript
export function trackMetrics(
  request: Request,
  response: Response,
  duration: number,
  ctx: ExecutionContext,
  env: Env
) {
  const datapoint = {
    // Dimensions (groupable)
    worker: 'api-gateway',
    method: request.method,
    path: new URL(request.url).pathname,
    status: response.status.toString(),
    statusGroup: Math.floor(response.status / 100) + 'xx',
    colo: request.cf?.colo || 'unknown',
    country: request.cf?.country || 'unknown',
    // Metrics (aggregatable)
    count: 1,
    duration,
    success: response.ok ? 1 : 0,
    error: response.ok ? 0 : 1
  };

  // Fire and forget: writeDataPoint is non-blocking and returns void,
  // so no await (or ctx.waitUntil) is needed
  env.ANALYTICS.writeDataPoint({
    blobs: [
      datapoint.worker,
      datapoint.method,
      datapoint.path,
      datapoint.statusGroup,
      datapoint.colo,
      datapoint.country
    ],
    doubles: [datapoint.duration, datapoint.count, datapoint.success, datapoint.error],
    indexes: [datapoint.status]
  });
}
```
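To read these datapoints back, Analytics Engine exposes a SQL-over-HTTP API. A hedged sketch of querying p95 latency per path is below; the dataset name `worker_metrics`, account ID, and token are placeholders, and the column names follow the positional mapping used by `trackMetrics` (blob3 = path, double1 = duration):

```typescript
// Sketch: query p95 latency per path from the Analytics Engine SQL API.
// ACCOUNT_ID / API_TOKEN / 'worker_metrics' are illustrative placeholders.
const SQL = `
  SELECT
    blob3 AS path,
    quantileWeighted(0.95)(double1, _sample_interval) AS p95_ms
  FROM worker_metrics
  WHERE timestamp > NOW() - INTERVAL '1' HOUR
  GROUP BY path
  ORDER BY p95_ms DESC
`;

async function queryP95(accountId: string, apiToken: string): Promise<unknown> {
  const res = await fetch(
    `https://api.cloudflare.com/client/v4/accounts/${accountId}/analytics_engine/sql`,
    {
      method: 'POST',
      headers: { Authorization: `Bearer ${apiToken}` },
      body: SQL
    }
  );
  return res.json();
}
```

The `_sample_interval` weighting matters because Analytics Engine samples data at high volumes; weighted quantiles keep percentiles honest under sampling.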

Pattern 4: Health Checks with Cron

health-monitor.ts
```typescript
const ENDPOINTS = [
  { name: 'API Gateway', url: 'https://api.proptechusa.ai/health' },
  { name: 'Lead Processor', url: 'https://leads.proptechusa.ai/health' },
  { name: 'AI Chatbot', url: 'https://chat.proptechusa.ai/health' },
];

export default {
  async scheduled(event: ScheduledEvent, env: Env, ctx: ExecutionContext) {
    const results = await Promise.all(
      ENDPOINTS.map(async (endpoint) => {
        const start = Date.now();
        try {
          const res = await fetch(endpoint.url, {
            signal: AbortSignal.timeout(5000) // fail fast instead of hanging
          });
          return {
            name: endpoint.name,
            healthy: res.ok,
            latency: Date.now() - start,
            status: res.status
          };
        } catch (e) {
          // Caught values are `unknown` in TypeScript; narrow before reading .message
          return {
            name: endpoint.name,
            healthy: false,
            latency: Date.now() - start,
            error: e instanceof Error ? e.message : String(e)
          };
        }
      })
    );

    const unhealthy = results.filter(r => !r.healthy);
    if (unhealthy.length > 0) {
      await sendSlackAlert({
        text: `🚨 Health Check Failed`,
        blocks: unhealthy.map(r => ({
          type: 'section',
          text: {
            type: 'mrkdwn',
            text: `*${r.name}*: ${r.error || r.status}`
          }
        }))
      }, env);
    }
  }
};
```

Pattern 5: Error Alerting

alerting.ts
```typescript
async function sendSlackAlert(
  error: Error,
  context: {
    requestId: string;
    worker: string;
    url: string;
  },
  env: Env
) {
  await fetch(env.SLACK_WEBHOOK_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      blocks: [
        {
          type: 'header',
          text: { type: 'plain_text', text: '🚨 Production Error' }
        },
        {
          type: 'section',
          fields: [
            { type: 'mrkdwn', text: `*Worker:*\n${context.worker}` },
            { type: 'mrkdwn', text: `*Request ID:*\n\`${context.requestId}\`` },
            { type: 'mrkdwn', text: `*Error:*\n${error.message}` },
            { type: 'mrkdwn', text: `*URL:*\n${context.url}` }
          ]
        },
        {
          type: 'section',
          text: {
            type: 'mrkdwn',
            text: `\`\`\`${error.stack?.slice(0, 500)}\`\`\``
          }
        }
      ]
    })
  });
}
```
Alert Fatigue Prevention
Deduplicate alerts by error signature. Send one alert for the first occurrence, then aggregate. "This error occurred 47 times in the last 5 minutes" is more useful than 47 individual alerts.
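The deduplication described above can be sketched with a map keyed by error signature. Names like `shouldAlert` and the 5-minute window are illustrative, not from our production code; and since Worker isolates are ephemeral, a production version would likely back this with KV or a Durable Object rather than module state:

```typescript
// Sketch: suppress repeat alerts for the same error signature within a window.
// A module-level Map only dedupes per isolate; use KV/Durable Objects in production.
const WINDOW_MS = 5 * 60 * 1000;
const seen = new Map<string, { firstSeen: number; count: number }>();

// Hypothetical signature: error name + message + top stack frame
function errorSignature(error: Error): string {
  const topFrame = error.stack?.split('\n')[1]?.trim() ?? '';
  return `${error.name}:${error.message}:${topFrame}`;
}

// Returns true if an alert should fire now; otherwise just counts the occurrence
function shouldAlert(error: Error, now = Date.now()): boolean {
  const sig = errorSignature(error);
  const entry = seen.get(sig);
  if (!entry || now - entry.firstSeen > WINDOW_MS) {
    seen.set(sig, { firstSeen: now, count: 1 });
    return true; // first occurrence in this window: send the alert
  }
  entry.count++; // aggregate; report entry.count in a later summary alert
  return false;
}
```

A periodic summary ("this error occurred N times in the last 5 minutes") can then be emitted from the counts when the window rolls over.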

Observability Checklist

  • Structured JSON logs with consistent schema
  • Request IDs propagated through all services
  • Latency tracking at p50, p95, p99 percentiles
  • Error rate monitoring with alerting thresholds
  • Health checks running every minute via Cron
  • Slack alerts for critical errors and outages
  • Dashboard for real-time metrics visualization
  • Log retention for debugging historical issues
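For the latency-percentile item on the checklist, a minimal sketch of computing p50/p95/p99 from a batch of recorded durations using the nearest-rank method (in practice Analytics Engine or your dashboard tool computes these for you; the sample values here are made up):

```typescript
// Nearest-rank percentile over a sample of request durations (ms).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  // Nearest-rank: take the ceil(p/100 * N)-th smallest value (1-indexed)
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

const durations = [12, 18, 23, 25, 31, 47, 52, 88, 120, 410];
const summary = {
  p50: percentile(durations, 50),
  p95: percentile(durations, 95),
  p99: percentile(durations, 99)
};
```

The gap between p50 and p99 is the interesting signal: averages hide tail latency, which is exactly what edge users feel.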

Observability isn't about collecting data; it's about answering questions. "Why did that request fail?" should take seconds to answer, not hours of digging through logs.

Related Articles

28 Cloudflare Workers Architecture
CI/CD for Cloudflare Workers
Designing for Model Failure

Need Observability Setup?

We build monitoring systems that catch issues before users do.

→ Get Started