DevOps

What is DevOps beyond tools, and how do you measure success?

Difficulty: medium | Tags: devops, culture, dora, +1

Answer

DevOps is a culture and set of practices that align development and operations around fast, reliable delivery with shared ownership. Success is measured by outcomes rather than tool adoption: the four DORA metrics (deployment frequency, lead time for changes, change failure rate, and mean time to restore, MTTR), plus user-impact metrics.
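
As a rough illustration of how these four metrics can be derived from deployment records, here is a minimal Python sketch. The `Deployment` record, its field names, and the sample data are hypothetical; real numbers would come from your CI/CD and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record shape, chosen for this example only.
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Summarize the four delivery metrics over a reporting period."""
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deploys_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times) if lead_times else None,
        "change_failure_rate": len(failures) / len(deployments) if deployments else 0.0,
        "mean_time_to_restore": (
            sum(restore_times, timedelta()) / len(restore_times) if restore_times else None
        ),
    }


# Made-up sample data: four deployments over two days, one of which failed.
t0 = datetime(2024, 1, 1, 9, 0)
deploys = [
    Deployment(t0, t0 + timedelta(hours=2)),
    Deployment(t0, t0 + timedelta(hours=4), failed=True,
               restored_at=t0 + timedelta(hours=5)),
    Deployment(t0, t0 + timedelta(hours=6)),
    Deployment(t0, t0 + timedelta(hours=8)),
]
metrics = dora_metrics(deploys, period_days=2)
```

Using the median for lead time is a design choice here: a single slow change does not distort it the way it would distort a mean.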

Describe the typical stages of a CI pipeline and common failure points.

Difficulty: easy | Tags: ci, pipeline, build, +1

Answer

A CI pipeline usually checks out code, builds, runs unit/integration tests, performs static analysis, packages artifacts, and publishes them. Failures often come from flaky tests, missing dependencies, environment drift, secrets/config issues, or non-deterministic builds.
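
The stage ordering and fail-fast behaviour described above can be sketched in a few lines of Python. This is a toy model, not how any particular CI system is implemented; the stage names and the lambda steps are placeholders for real build and test commands.

```python
from typing import Callable

# A stage is a name plus a step that returns True on success.
Stage = tuple[str, Callable[[], bool]]


def run_pipeline(stages: list[Stage]) -> tuple[bool, list[str]]:
    """Run stages in order and stop at the first failure (fail fast)."""
    log: list[str] = []
    for name, step in stages:
        try:
            ok = step()
        except Exception as exc:  # e.g. a missing dependency or bad config
            log.append(f"{name}: error ({exc})")
            return False, log
        log.append(f"{name}: {'ok' if ok else 'failed'}")
        if not ok:
            return False, log
    return True, log


# Placeholder steps; the unit-test stage is made to fail on purpose.
stages = [
    ("checkout", lambda: True),
    ("build", lambda: True),
    ("unit-tests", lambda: False),
    ("package", lambda: True),
]
passed, log = run_pipeline(stages)
```

Because the run stops at the failing stage, `package` never executes, so a broken build cannot publish an artifact.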

Continuous delivery vs continuous deployment — what’s the difference?

Difficulty: easy | Tags: cd, release, deployment

Answer

Continuous delivery keeps every change releasable and usually requires a manual approval to go to production. Continuous deployment automatically ships to production once the pipeline passes.
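
The difference comes down to one gate, which the following small Python sketch makes explicit. The mode names and the `approved` flag are illustrative assumptions, not terms from any specific tool.

```python
def should_release(pipeline_passed: bool, mode: str, approved: bool = False) -> bool:
    """Decide whether a change goes to production under each model."""
    if not pipeline_passed:
        return False  # neither model ships a change that failed the pipeline
    if mode == "continuous_deployment":
        return True  # a green pipeline is the only gate
    if mode == "continuous_delivery":
        return approved  # releasable, but a human decides when to ship
    raise ValueError(f"unknown mode: {mode}")
```

For example, `should_release(True, "continuous_delivery")` stays `False` until someone approves, while the same call with `"continuous_deployment"` returns `True` immediately.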

Rolling vs blue/green vs canary deployments — what are the tradeoffs?

Difficulty: medium | Tags: deployment, rolling, blue-green, +1

Answer

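Rolling updates replace instances gradually and need little extra capacity, but old and new versions serve traffic side by side during the rollout, so a bad release can reach some users before it is caught. Blue/green keeps two full environments and switches traffic all at once; rollback is fast, but you pay for duplicate capacity. Canary releases send a small percentage of traffic to the new version first to validate metrics, which reduces risk but requires strong monitoring.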
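
To show why canaries depend on monitoring, here is a hedged Python sketch of a promotion check that compares the canary's error rate with the baseline's. The thresholds (`max_ratio`, `min_requests`, the 0.1% floor) are arbitrary example values, and a real analysis would usually look at latency and other signals as well.

```python
def canary_verdict(
    baseline_errors: int,
    baseline_requests: int,
    canary_errors: int,
    canary_requests: int,
    max_ratio: float = 1.5,   # assumed tolerance: canary may be up to 1.5x worse
    min_requests: int = 100,  # assumed minimum sample before judging
) -> str:
    """Return 'wait', 'promote', or 'rollback' for a canary release."""
    if canary_requests < min_requests:
        return "wait"  # not enough traffic to judge yet
    baseline_rate = baseline_errors / baseline_requests if baseline_requests else 0.0
    canary_rate = canary_errors / canary_requests
    # Small absolute floor so a near-zero baseline does not fail every canary.
    allowed = max(baseline_rate * max_ratio, 0.001)
    return "promote" if canary_rate <= allowed else "rollback"
```

With a baseline of 10 errors in 10,000 requests, a canary with 20 errors in 1,000 requests is rolled back, while one with 1 error in 1,000 is promoted.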

What is GitOps and how does it change environment management?

Difficulty: medium | Tags: gitops, iac, automation

Answer

GitOps treats Git as the source of truth for desired state. Changes happen via pull requests and are reconciled to the target environment automatically, giving strong auditability and consistency.
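
The reconciliation idea can be sketched as a loop that diffs desired state against actual state and applies the difference. In this simplified Python model both states are plain dictionaries; the resource names and specs are invented, and a real GitOps controller would read the desired state from a Git repository and talk to a cluster API instead.

```python
import copy


def plan(desired: dict, actual: dict) -> list[tuple[str, str]]:
    """List the actions needed to make `actual` match `desired`."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))  # drift: not declared in Git
    return actions


def reconcile(desired: dict, actual: dict) -> list[tuple[str, str]]:
    """Apply the plan in place and return the actions that were taken."""
    actions = plan(desired, actual)
    for op, name in actions:
        if op == "delete":
            del actual[name]
        else:
            actual[name] = copy.deepcopy(desired[name])
    return actions


# Hypothetical states: Git declares two services, the cluster has drifted.
git_state = {
    "web": {"image": "web:1.2", "replicas": 3},
    "worker": {"image": "worker:2.0", "replicas": 1},
}
cluster = {
    "web": {"image": "web:1.1", "replicas": 3},
    "legacy": {"image": "legacy:0.9", "replicas": 1},
}
first_pass = reconcile(git_state, cluster)
second_pass = reconcile(git_state, cluster)
```

The first pass updates `web`, creates `worker`, and deletes `legacy`; the second pass finds nothing to do, which is the converged state a reconciler aims for.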

Infrastructure as Code: why does idempotency matter and how do you validate changes safely?

Difficulty: medium | Tags: iac, idempotency, terraform, +1

Answer

Idempotency means applying the same config repeatedly yields the same state, enabling safe, repeatable provisioning. Validate changes with linting, plan/diff, policy checks, and a staging environment before production.
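
A small Python contrast may help: one function blindly performs an action, the other converges on a desired state. The firewall-rule strings are made up for the example.

```python
def append_rule(rules: list[str], rule: str) -> None:
    """Non-idempotent: every run adds another copy of the rule."""
    rules.append(rule)


def ensure_rule(rules: list[str], rule: str) -> bool:
    """Idempotent: make sure the rule is present once. Returns True if changed."""
    if rule in rules:
        return False  # already in the desired state, nothing to do
    rules.append(rule)
    return True


naive: list[str] = []
append_rule(naive, "allow tcp/443")
append_rule(naive, "allow tcp/443")   # re-running duplicates the rule

firewall: list[str] = []
first_run = ensure_rule(firewall, "allow tcp/443")
second_run = ensure_rule(firewall, "allow tcp/443")  # safe to re-run
```

The boolean returned by `ensure_rule` plays the same role as a plan or diff: it reports whether a run would change anything, which is what makes a change reviewable before it is applied.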

Configuration vs secrets — how should you manage them in DevOps?

Difficulty: easy | Tags: secrets, config, security

Answer

Configuration is non-sensitive and can be versioned. Secrets should live in a secret manager/KMS, be injected at runtime, rotated, and accessed with least privilege.
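
One way to picture runtime injection is an application that reads its secret from the environment at startup and never prints it. The sketch below assumes the secret manager or orchestrator has already placed the value in an environment variable; the variable names `DB_PASSWORD` and `LOG_LEVEL` are hypothetical.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    log_level: str                        # plain config: safe to version and log
    db_password: str = field(repr=False)  # secret: excluded from repr and logs


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read config and secrets injected at runtime; fail fast if a secret is missing."""
    password = env.get("DB_PASSWORD")
    if not password:
        raise RuntimeError("DB_PASSWORD is not set; it should be injected at runtime")
    return Settings(log_level=env.get("LOG_LEVEL", "info"), db_password=password)


# A fake environment stands in for what the platform would inject.
settings = load_settings({"DB_PASSWORD": "s3cr3t"})
```

Here `repr(settings)` shows only `log_level`, so an accidental `print(settings)` or log line does not leak the password.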

What are best practices for secure and small Docker images?

Difficulty: medium | Tags: docker, containers, security, +1

Answer

Use minimal base images, multi-stage builds, pin versions, remove build dependencies, use .dockerignore, run as non-root, and scan for vulnerabilities.

Kubernetes: when do you use Deployment vs StatefulSet vs DaemonSet?

Difficulty: medium | Tags: kubernetes, deployment, statefulset, +1

Answer

Deployment is for stateless workloads, StatefulSet for stateful apps needing stable identity/storage, and DaemonSet for running one pod per node (e.g., log agents).

Liveness vs readiness vs startup probes — what can go wrong if they’re misused?

Difficulty: medium | Tags: kubernetes, health-checks, probes

Answer

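Readiness gates traffic, liveness restarts unhealthy containers, and a startup probe holds off the other two until a slow-starting app has finished initializing. Misconfigured probes can cause restart loops (for example, a liveness probe that times out during startup or fails whenever a dependency is down) or route traffic before the app is ready.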
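
From the application's side, the three probes can answer different questions. The Python sketch below models that with a hypothetical `HealthState` class whose methods return the HTTP status each probe endpoint would serve; it is not tied to any web framework, and the attribute names are assumptions.

```python
class HealthState:
    """Tracks what each probe endpoint should report (illustrative only)."""

    def __init__(self) -> None:
        self.started = False          # initialization finished?
        self.dependencies_ok = False  # e.g. database reachable?
        self.deadlocked = False       # process stuck beyond recovery?

    def startup(self) -> int:
        return 200 if self.started else 503

    def liveness(self) -> int:
        # Deliberately ignores dependencies: restarting this container
        # would not fix a database outage and could cause a restart loop.
        return 503 if self.deadlocked else 200

    def readiness(self) -> int:
        # Only accept traffic once started and able to serve requests.
        return 200 if self.started and self.dependencies_ok else 503


state = HealthState()
state.started = True           # app is up...
state.dependencies_ok = False  # ...but its database is unreachable
```

In this state liveness reports 200 and readiness reports 503, so the pod is taken out of rotation without being restarted.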

Logs vs metrics vs traces — how do they complement each other?

Difficulty: easy | Tags: observability, logs, metrics, +1

Answer

Metrics show trends and health, logs provide event details, and traces follow a request across services. Together they help detect, diagnose, and explain incidents.

How do you design alerts to reduce noise and focus on user impact?

Difficulty: hard | Tags: alerting, slo, oncall

Answer

Alert on symptoms tied to SLOs, use burn-rate/multi-window alerts, deduplicate, route to owners, and ensure every alert is actionable with a clear runbook.
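
A multi-window burn-rate check can be sketched as follows. Burn rate is the observed error ratio divided by the error ratio the SLO allows; the default threshold of 14.4 is a commonly cited paging value (roughly 2% of a 30-day budget consumed in one hour) and is used here as an assumption, not a recommendation for every service.

```python
def burn_rate(error_ratio: float, slo: float) -> float:
    """How fast the error budget is being consumed (1.0 = exactly on budget)."""
    return error_ratio / (1 - slo)


def should_page(
    long_window_error_ratio: float,   # e.g. error ratio over the last hour
    short_window_error_ratio: float,  # e.g. error ratio over the last 5 minutes
    slo: float,
    threshold: float = 14.4,          # assumed paging threshold
) -> bool:
    """Page only if both windows burn fast: the problem is significant AND ongoing."""
    return (
        burn_rate(long_window_error_ratio, slo) >= threshold
        and burn_rate(short_window_error_ratio, slo) >= threshold
    )
```

Requiring both windows is what cuts noise: the long window filters out brief blips, and the short window stops the alert from firing long after the issue has recovered.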

What should a good incident response runbook include, and how do postmortems drive change?

Difficulty: medium | Tags: incident, runbook, postmortem

Answer

A runbook includes detection steps, triage, mitigation/rollback, roles, escalation paths, and comms. Postmortems capture root cause and create owned action items to prevent recurrence.

Explain SLI, SLO, SLA, and error budgets, and how they influence releases.

Difficulty: hard | Tags: sli, slo, sla, +1

Answer

An SLI is the measured indicator, an SLO is the internal target for it, and an SLA is the contractual promise to customers. The error budget is the allowed unreliability (1 - SLO); if it’s burned, you slow releases and focus on reliability work.
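
The arithmetic is simple enough to show directly. This Python sketch converts an availability SLO into a budget of "bad minutes" for a period and applies a two-state release policy; the 30-day period and the policy wording are assumptions for illustration.

```python
def error_budget_minutes(slo: float, period_days: int = 30) -> float:
    """Allowed minutes of unavailability: (1 - SLO) * minutes in the period."""
    return (1 - slo) * period_days * 24 * 60


def release_policy(slo: float, bad_minutes: float, period_days: int = 30) -> str:
    """Slow down feature releases once the budget for the period is spent."""
    remaining = error_budget_minutes(slo, period_days) - bad_minutes
    if remaining <= 0:
        return "freeze feature releases, prioritize reliability work"
    return "release normally"


budget = error_budget_minutes(0.999)  # about 43.2 minutes over 30 days
```

A 99.9% SLO over 30 days therefore leaves roughly 43 minutes of downtime to spend; 50 bad minutes exhausts it, while 10 leaves room to keep shipping.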

Give three practical ways to control cloud costs without hurting reliability.

Difficulty: medium | Tags: cost, autoscaling, right-sizing, +1

Answer

Right-size instances based on metrics, use autoscaling for variable load, and buy reserved/committed capacity for steady workloads. Add storage lifecycle policies and caching where it makes sense.
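
Right-sizing from metrics can be sketched as sizing for a high percentile of observed CPU use plus headroom, rather than for the average or the peak. The function below is a hypothetical heuristic: the 95th percentile and the 60% target utilization are example values, and real decisions should also consider memory, I/O, and burst behaviour.

```python
import math


def recommend_vcpus(
    cpu_samples: list[float],         # utilization samples as fractions, 0.0 to 1.0
    current_vcpus: int,
    target_utilization: float = 0.6,  # assumed headroom: aim for 60% at p95
    percentile: float = 0.95,
) -> int:
    """Suggest a vCPU count so that p95 load sits at the target utilization."""
    if not cpu_samples:
        raise ValueError("need at least one utilization sample")
    ordered = sorted(cpu_samples)
    # Nearest-rank percentile, clamped to the last element.
    index = min(len(ordered) - 1, math.ceil(percentile * len(ordered)) - 1)
    p_load = ordered[index]
    needed = p_load * current_vcpus / target_utilization
    return max(1, math.ceil(needed))


# Made-up samples from an 8-vCPU instance that never exceeds 22% CPU.
samples = [0.10, 0.12, 0.15, 0.11, 0.20, 0.18, 0.14, 0.13, 0.16, 0.22]
suggested = recommend_vcpus(samples, current_vcpus=8)
```

For these samples the heuristic suggests 3 vCPUs instead of 8, while the headroom in `target_utilization` keeps the reliability margin the question asks for.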