© 2026 LetsGit.IT. All rights reserved.


What is an SLO and what is an error budget?

Tags
#slo #error-budget #reliability

Answer

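An SLO (Service Level Objective) is a target for an SLI (Service Level Indicator) measured over a time window, e.g., 99.9% availability over 30 days. The error budget is the allowed “room for failure” (100% - SLO): for a 99.9% target it is 0.1%, or about 43 minutes of downtime in a 30-day window. Teams use it to balance shipping features against improving reliability: while budget remains they can usually afford release risk, and once it is spent reliability work typically takes priority.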
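
To make the arithmetic concrete, here is a minimal sketch that converts an SLO target into an error budget for a time window. The function name `errorBudget` and its parameters are hypothetical, and it assumes a time-based availability SLI (downtime minutes), which is only one of several ways an SLI can be defined.

```javascript
// Hypothetical helper: converts an SLO target into an error budget.
// Assumption: the SLI is time-based availability over a rolling window.
function errorBudget(sloPercent, windowDays) {
  if (sloPercent < 0 || sloPercent > 100) {
    throw new RangeError("sloPercent must be between 0 and 100");
  }
  const budgetPercent = 100 - sloPercent;      // error budget = 100% - SLO
  const windowMinutes = windowDays * 24 * 60;  // length of the window in minutes
  return {
    budgetPercent,
    allowedDowntimeMinutes: (budgetPercent / 100) * windowMinutes,
  };
}

const budget = errorBudget(99.9, 30);
console.log(budget.allowedDowntimeMinutes.toFixed(1)); // prints "43.2"
```

Adding a "nine" shrinks the budget tenfold: under the same assumptions, 99.99% over 30 days leaves roughly 4.3 minutes.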

Advanced answer

Deep dive

Expanding on the short answer, here is what usually matters in practice:

  • Scaling: what scales horizontally vs vertically, where bottlenecks appear.
  • Reliability: retries/circuit breakers/idempotency, observability (logs/metrics/traces).
  • Evolution: keep changes cheap (boundaries, contracts, tests).
  • Explain the "why", not just the "what" (intuition + consequences).
  • Trade-offs: what you gain/lose (time, memory, complexity, risk).
  • Edge cases: empty inputs, large inputs, invalid inputs, concurrency.
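
The trade-off and edge-case points above can be tied back to the error budget with a small sketch. It assumes a request-based SLI (failed requests out of total requests); the function name `budgetStatus`, the decision labels, and the rule "switch to reliability work once the budget is fully consumed" are illustrative assumptions, not a standard policy.

```javascript
// Hypothetical helper: how much of the error budget has been consumed,
// assuming a request-based SLI over the SLO window.
function budgetStatus(totalRequests, failedRequests, sloPercent) {
  // Failures the SLO tolerates for this amount of traffic.
  const allowedFailures = (totalRequests * (100 - sloPercent)) / 100;

  // Edge case: no traffic or a 100% SLO means there is no budget to divide by.
  let consumed;
  if (allowedFailures > 0) {
    consumed = failedRequests / allowedFailures;
  } else {
    consumed = failedRequests > 0 ? Infinity : 0;
  }

  // Assumed policy: once the budget is spent, reliability work comes first.
  const decision = consumed >= 1
    ? "prioritize reliability work"
    : "keep shipping features";
  return { allowedFailures, consumed, decision };
}

const report = budgetStatus(1000000, 250, 99.9);
console.log(report.consumed.toFixed(2), report.decision);
// prints "0.25 keep shipping features"
```

Real policies are usually more gradual (for example, slowing releases before the budget is fully spent), so the single threshold here is only a starting point for discussion.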

Examples

A tiny example (an explanation template):

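// Example: explain "What is an SLO and what is an error budget?" and its trade-off
function explain() {
  // Start from the core idea, then state the trade-off it enables.
  return [
    "An SLO (Service Level Objective) is a target for an SLI (e.g., 99.9% availability).",
    "The error budget is the allowed room for failure (100% - SLO).",
    "Trade-off: spend the budget on shipping features, or save it by improving reliability.",
  ].join(" ");
}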

Common pitfalls

  • Too generic: no concrete trade-offs or examples.
  • Mixing average-case and worst-case (e.g., complexity).
  • Ignoring constraints: memory, concurrency, network/disk costs.

Interview follow-ups

  • When would you choose an alternative and why?
  • What production issues show up and how do you diagnose them?
  • How would you test edge cases?
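
For the follow-up on diagnosing production issues, one common way to reason about an incident is the burn rate: the observed error rate divided by the error rate the SLO allows. The sketch below is illustrative only; `burnRate` and `hoursToExhaustion` are hypothetical names, and the second function assumes the full budget was available at the start and that the error rate stays constant.

```javascript
// Burn rate: how fast the error budget is being spent.
// 1.0 means the budget lasts exactly one SLO window; above 1.0 it runs out early.
function burnRate(observedErrorRate, sloPercent) {
  const allowedErrorRate = (100 - sloPercent) / 100;
  if (allowedErrorRate <= 0) {
    throw new RangeError("an SLO of 100% leaves no error budget to burn");
  }
  return observedErrorRate / allowedErrorRate;
}

// Assumption: full budget at the start and a constant burn rate.
function hoursToExhaustion(rate, windowDays) {
  return rate > 0 ? (windowDays * 24) / rate : Infinity;
}

const rate = burnRate(0.01, 99.9); // 1% of requests failing against a 99.9% SLO
console.log(rate.toFixed(1));      // prints "10.0"
console.log(hoursToExhaustion(10, 30)); // prints 72
```

A burn rate of 10 against a 30-day window means the budget would be gone in about three days, which is the kind of concrete consequence that helps justify paging someone or pausing a rollout.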
