Cloud · medium

Metrics vs logs vs traces — how are they different?

Answer

Metrics are numeric measurements aggregated over time (CPU usage, latency); logs are discrete, timestamped event records; and traces follow a single request across services as a set of spans. Together they help you detect, diagnose, and understand incidents.

Advanced answer

Deep dive

Think of observability signals as answering different questions:

**Metrics**: “Is the system healthy?” Aggregated numbers over time (RPS, p95 latency, error rate). Great for dashboards, SLOs, and alerting.
**Logs**: “What happened?” Discrete event records with context (userId, orderId). Great for debugging and audits.
**Traces**: “Where did time go?” A distributed view of a single request across services, broken into spans (DB call, HTTP call, cache).
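To make the three shapes concrete, here is a minimal sketch of how each signal might look as data. The class names, field names, and the sample checkout trace are all made up for illustration and do not follow any particular vendor's schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MetricPoint:
    """One sample of an aggregated number over time."""
    name: str
    value: float
    timestamp_ms: int
    labels: dict = field(default_factory=dict)


@dataclass
class LogRecord:
    """One discrete event with free-form context."""
    timestamp_ms: int
    level: str
    message: str
    context: dict = field(default_factory=dict)


@dataclass
class Span:
    """One timed operation inside a trace."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def slowest_child(spans: list, root_id: str) -> Span:
    """Answer "where did time go?" for the direct children of one span."""
    children = [s for s in spans if s.parent_id == root_id]
    return max(children, key=lambda s: s.duration_ms)


# Hypothetical trace for a single checkout request.
trace = [
    Span("t1", "a", None, "GET /checkout", 0, 480),
    Span("t1", "b", "a", "cache.get", 5, 9),
    Span("t1", "c", "a", "db.query", 10, 410),
    Span("t1", "d", "a", "http.post payment", 415, 470),
]

metric = MetricPoint("http_request_duration_ms", 480.0, 480, {"route": "/checkout"})
log = LogRecord(480, "INFO", "checkout completed", {"trace_id": "t1", "orderId": "o-42"})

print(slowest_child(trace, "a").name)  # db.query
```

The metric says the request took 480 ms, the log says what happened to which order, and only the trace shows that the database call accounts for most of that time.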

How they work together

A common workflow:
1) An alert fires from metrics (for example, an error-rate spike).
2) Pivot to traces to find the slow or failing hop.
3) Read the logs for that specific trace or request to see the exact errors and context.
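The workflow above can be sketched with in-memory stand-ins for the three stores. This is only an illustration: the 5% threshold, the function names, and the sample data are assumptions, and a real setup would query a metrics backend, a tracing backend, and a log store instead.

```python
ERROR_RATE_THRESHOLD = 0.05  # hypothetical alert threshold (5%)


def should_alert(total: int, errors: int, threshold: float = ERROR_RATE_THRESHOLD) -> bool:
    """Step 1: decide from aggregated counters whether to alert."""
    return total > 0 and errors / total > threshold


def failing_spans(spans: list) -> list:
    """Step 2: narrow down to the spans that recorded an error."""
    return [s for s in spans if s["error"]]


def logs_for_trace(logs: list, trace_id: str) -> list:
    """Step 3: pull only the log records that belong to one trace."""
    return [entry for entry in logs if entry.get("trace_id") == trace_id]


# Made-up data standing in for a metrics store, a trace store, and a log store.
window = {"total": 1000, "errors": 80}
spans = [
    {"trace_id": "t1", "name": "cart.load", "error": False},
    {"trace_id": "t1", "name": "payment.charge", "error": True},
    {"trace_id": "t2", "name": "cart.load", "error": False},
]
logs = [
    {"trace_id": "t1", "level": "ERROR", "message": "card declined by provider"},
    {"trace_id": "t2", "level": "INFO", "message": "cart loaded"},
]

if should_alert(window["total"], window["errors"]):
    for span in failing_spans(spans):
        for entry in logs_for_trace(logs, span["trace_id"]):
            print(span["name"], "->", entry["message"])
```

The last step only works because every log record carries the trace id, which is what lets you jump from a failing span straight to its logs.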

Practical tips

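Put a correlation ID or trace ID into every log line so logs can be joined to the request they belong to.
Avoid high-cardinality labels in metrics (user IDs, request IDs). In most metrics backends each distinct label combination becomes its own time series, which raises cost and slows queries.
Sample traces in high-traffic systems instead of recording every request.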
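Two of these tips come down to simple arithmetic, sketched below. `series_count` gives an upper bound on the number of time series for a set of labels, and `keep_trace` is one possible deterministic sampling rule that hashes the trace id so every service makes the same keep-or-drop decision. Both function names and the label counts are hypothetical, and real tracing libraries ship their own samplers.

```python
import hashlib
import math


def series_count(distinct_values_per_label: dict) -> int:
    """Upper bound on time series: the product of distinct values per label."""
    return math.prod(distinct_values_per_label.values())


def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Keep roughly `sample_rate` of traces, decided only by the trace id."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # uniform in [0, 2**64)
    return bucket < int(sample_rate * 2**64)


# Hypothetical label sets for one metric.
low = {"method": 5, "status": 6, "endpoint": 20}
high = {**low, "user_id": 100_000}

print(series_count(low))   # 600
print(series_count(high))  # 60000000

kept = sum(keep_trace(f"trace-{i}", 0.1) for i in range(10_000))
print(kept)  # roughly 1000 of 10000
```

Adding a single `user_id` label multiplies the series count by the number of users, which is why per-user detail usually belongs in logs or traces rather than in metric labels.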

Common pitfalls

Logging too much at info level, which drives up ingestion and storage costs.
Emitting unstructured logs, which are hard to query. Prefer structured fields (for example JSON).
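A minimal sketch of avoiding both pitfalls with Python's standard `logging` module: a JSON formatter makes each record queryable by field, and the logger level drops info-level noise. The `JsonFormatter` class, the `build_logger` helper, and the `trace_id` / `order_id` fields are assumptions for illustration, not part of any library.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Fields passed via `extra=` become attributes on the record.
            "trace_id": getattr(record, "trace_id", None),
            "order_id": getattr(record, "order_id", None),
        }
        return json.dumps(payload)


def build_logger(name: str, level: int, stream) -> logging.Logger:
    """Create a logger that writes structured lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate output via the root logger
    return logger


buffer = io.StringIO()  # stands in for stdout or a log shipper
log = build_logger("checkout", logging.WARNING, buffer)

log.info("cart loaded", extra={"trace_id": "t1"})  # dropped: below WARNING
log.error("payment declined", extra={"trace_id": "t1", "order_id": "o-42"})

lines = buffer.getvalue().splitlines()
print(lines[0])
```

Because every line is a JSON object with a `trace_id` field, a log query can filter on that field directly instead of pattern-matching free text.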