Cloud

Recruitment and knowledge question base. Filter, search and test your knowledge.

What is a microservice?

easy, microservices, architecture, distributed-systems

Answer

A microservice is a small service that implements one business capability and can be deployed independently. It usually owns its data and talks to other services via APIs/events, so it can be scaled and released separately.

IaaS vs PaaS vs SaaS?

easy, cloud-computing, service-model, iaas, +2

Open question

Answer

IaaS provides raw infrastructure like VMs, networking and storage; you manage the OS, runtime and app. PaaS provides a managed platform/runtime where you deploy code and the provider handles OS, scaling and patches. SaaS is a complete application delivered to end users.

Docker vs Virtual Machine?

medium, docker, virtualization, containerization, +1

Open question

Answer

Containers (Docker) virtualize at the OS level: they share the host kernel, start quickly, and are lightweight. Virtual machines virtualize hardware: each VM runs its own guest OS/kernel, is heavier and slower to start, but provides stronger isolation.

Kubernetes Basic Concepts?

hard, kubernetes, orchestration, container

Open question

Answer

Kubernetes is a container orchestration system. Core concepts include a cluster (control plane + worker nodes), Pods as the smallest deployable unit running containers, Deployments/StatefulSets for declaring the desired replica state, Services/Ingress for networking, ConfigMaps/Secrets for configuration, and Namespaces for isolation.

Benefits of Serverless?

medium, serverless, cloud, architecture, +1

Open question

Answer

Serverless lets you run code without managing servers: the provider handles provisioning, scaling and patching. You pay per execution, can scale to zero, and deploy quickly. Trade‑offs include cold starts, execution limits and vendor lock‑in.

IaaS vs PaaS vs SaaS — what’s the difference?

easy, iaas, paas, saas, +1

Open question

Answer

IaaS gives you infrastructure (VMs, networks), PaaS gives you a managed runtime/platform (deploy code, provider runs it), and SaaS is a ready-to-use application (you just use it).

Region vs Availability Zone: what’s the difference?

easy, cloud, regions, availability-zone, +1

Open question

Answer

A region is a geographic area (e.g., `eu-central-1`) that contains multiple Availability Zones. An AZ is a physically separate data center (or group of them) within a region. Spreading across AZs improves availability; spreading across regions improves geo‑redundancy.

Horizontal vs vertical scaling: what’s the difference?

easy, cloud, scaling, horizontal, +1

Open question

Answer

Horizontal scaling adds more instances (scale out), improving redundancy and capacity. Vertical scaling makes a single instance bigger (scale up). Horizontal is usually more resilient; vertical can be simpler but has limits.

Blue/green vs canary deployment — what’s the difference?

medium, deployment, blue-green, canary, +1

Open question

Answer

Blue/green switches all traffic from old to new at once (with quick rollback). Canary rolls out to a small % first and gradually increases, reducing risk by observing metrics before full rollout.
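
The canary idea above can be sketched as a percentage-based router. This is a minimal illustration, not any particular load balancer's API: the function name `pick_version` and the request-ID hashing scheme are assumptions made for the example. In this model a blue/green switch corresponds to moving the percentage from 0 to 100 in one step.

```python
import hashlib


def pick_version(request_id: str, canary_percent: int) -> str:
    """Route a request to 'canary' or 'stable' (hypothetical helper).

    The request ID is hashed into a bucket 0..99 so the same ID always
    lands on the same version while the percentage stays unchanged.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


# Gradual rollout: raise the percentage step by step while watching metrics.
for percent in (1, 10, 50, 100):
    sample = [pick_version(f"req-{i}", percent) for i in range(1000)]
    print(percent, sample.count("canary"))
```

Hashing a stable identifier (rather than picking randomly per request) keeps a given user or request on one version, which makes canary metrics easier to interpret.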

L4 vs L7 load balancer — what’s the difference?

medium, load-balancer, l4, l7, +1

Open question

Answer

An L4 load balancer works at the transport layer (TCP/UDP) and routes connections without understanding HTTP. An L7 load balancer understands application protocols (HTTP), so it can route by path/headers, terminate TLS, and apply more advanced rules.
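
To make the L7 part concrete, here is a small sketch of path-based routing. The rule table, pool names, and the `choose_backend` function are all hypothetical; real load balancers express this in their own configuration formats.

```python
def choose_backend(path: str, rules: dict, default_backend: str) -> str:
    """Return the backend for an HTTP path; the longest matching prefix wins."""
    for prefix, backend in sorted(rules.items(), key=lambda kv: len(kv[0]), reverse=True):
        if path.startswith(prefix):
            return backend
    return default_backend


# Hypothetical routing table: URL prefix -> backend pool name.
RULES = {
    "/api/": "api-pool",
    "/api/admin/": "admin-pool",
    "/static/": "static-pool",
}

print(choose_backend("/api/admin/users", RULES, "web-pool"))
print(choose_backend("/index.html", RULES, "web-pool"))
```

An L4 balancer could not make this decision, because it never parses the request path.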

Metrics vs logs vs traces — how are they different?

medium, observability, metrics, logs, +1

Open question

Answer

Metrics are numbers over time (CPU, latency), logs are discrete events/messages, and traces follow a single request across services (spans). Together they help you detect, diagnose, and understand incidents.

How do you design for high availability across failures (multi-AZ vs multi-region)?

hard, high-availability, multi-az, multi-region, +1

Open question

Answer

Multi-AZ protects you from a data center outage inside a region while keeping latency low and operations relatively simple. Multi-region can survive a full region outage but adds latency, data replication complexity, and higher costs.

At-least-once vs exactly-once delivery — why do we talk about idempotency?

hard, messaging, idempotency, retries

Open question

Answer

With at-least-once delivery, a message can be delivered more than once (retries), so consumers must be idempotent (processing duplicates safely). Exactly-once is hard/expensive in practice, so idempotency is a common solution.
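
A minimal sketch of an idempotent consumer, assuming each message carries a unique `id` field (an assumption for this example; the class and field names are hypothetical). It keeps the processed IDs in memory; a real system would typically persist them and commit them together with the side effect.

```python
class IdempotentConsumer:
    """Applies each message at most once, even if it is delivered repeatedly."""

    def __init__(self) -> None:
        self.processed_ids = set()  # in-memory dedupe store (illustration only)
        self.balance = 0

    def handle(self, message: dict) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        message_id = message["id"]
        if message_id in self.processed_ids:
            return False  # duplicate delivery: skip the side effect
        self.balance += message["amount"]
        self.processed_ids.add(message_id)
        return True


consumer = IdempotentConsumer()
payment = {"id": "m-1", "amount": 50}
consumer.handle(payment)
consumer.handle(payment)  # redelivered by the broker after a retry
print(consumer.balance)
```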

Secrets vs config — where should you store secrets in a cloud setup?

hard, secrets, kms, security

Open question

Answer

Store secrets in a secret manager (or encrypted KMS-backed store) and inject them at runtime (env/volume), not in git or plain config files. Rotate secrets and follow least privilege.

Give two practical ways to reduce cloud costs without hurting reliability.

hard, cost, right-sizing, autoscaling, +1

Open question

Answer

Right-size instances based on metrics and use autoscaling instead of overprovisioning. Add caching (CDN/app cache) and consider reserved/committed capacity for predictable workloads.

What is a CDN and what problem does it solve?

easy, cdn, caching, performance

Open question

Answer

A CDN caches static content (images, JS, CSS) close to users. It reduces latency and offloads traffic from your origin servers, improving speed and resilience.

What is autoscaling and what is a common pitfall?

medium, autoscaling, metrics, thrashing, +1

Open question

Answer

Autoscaling adjusts the number of instances based on demand (metrics like CPU, latency, queue depth). A common pitfall is thrashing: scaling up and down too aggressively because of noisy metrics. Use cooldowns and keep the scale-up and scale-down thresholds well apart.
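
One way to picture cooldowns and thresholds is the toy decision loop below. The class name, the default threshold values, and the one-instance step size are assumptions chosen for the sketch, not the behavior of any specific cloud autoscaler.

```python
from dataclasses import dataclass


@dataclass
class Autoscaler:
    scale_up_at: float = 0.75    # CPU above this adds an instance
    scale_down_at: float = 0.40  # CPU below this removes one (gap avoids flapping)
    cooldown_s: float = 300.0    # minimum seconds between two scaling actions
    min_instances: int = 1
    max_instances: int = 10
    last_change_at: float = float("-inf")

    def decide(self, instances: int, cpu: float, now: float) -> int:
        """Return the new instance count for the given CPU utilization (0..1)."""
        if now - self.last_change_at < self.cooldown_s:
            return instances  # still cooling down: ignore the metric
        if cpu > self.scale_up_at and instances < self.max_instances:
            target = instances + 1
        elif cpu < self.scale_down_at and instances > self.min_instances:
            target = instances - 1
        else:
            return instances
        self.last_change_at = now
        return target


scaler = Autoscaler()
print(scaler.decide(2, cpu=0.90, now=0))    # scales up
print(scaler.decide(3, cpu=0.10, now=60))   # inside cooldown, no change
print(scaler.decide(3, cpu=0.10, now=400))  # cooldown elapsed, scales down
```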

Infrastructure as Code (IaC) — why is it useful?

medium, iac, terraform, automation, +1

Open question

Answer

IaC defines infrastructure in code (Terraform/CloudFormation) so it’s versioned, reviewable, and reproducible. It reduces configuration drift and makes environments consistent and easier to automate.

IAM: what does “least privilege” mean and why does it matter?

hard, iam, security, least-privilege

Open question

Answer

Least privilege means giving only the minimum permissions needed to do the job (no more). It limits blast radius: if a key or service is compromised, the attacker can do less damage.

What is a service mesh and when is it worth using?

hard, service-mesh, mtls, sidecar, +1

Open question

Answer

A service mesh adds a dedicated layer (often sidecar proxies) for service-to-service traffic: mTLS, retries, timeouts, and observability. It’s worth it when you have many services and need consistent networking/security controls, but it adds operational complexity.

Object storage vs block storage — what’s the difference?

easy, storage, object-storage, block-storage, +1

Open question

Answer

Object storage stores files as objects (key + data + metadata) and is great for blobs (images, backups). Block storage provides a raw disk volume for a VM; it’s good for databases and filesystems where you need low-latency random access.

Health checks: what are they and why do load balancers need them?

medium, health-check, load-balancer, availability

Open question

Answer

A health check tests if an instance is ready to serve traffic. Load balancers use it to remove unhealthy instances from rotation, improving availability and reducing user-facing errors.
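
As a rough sketch of how a load balancer might use health check results, the tracker below removes an instance after a number of consecutive failures and restores it after a success. The class name and the "consecutive failures" rule are assumptions for illustration; real load balancers have their own thresholds and intervals.

```python
class HealthTracker:
    """Tracks consecutive failed health checks per instance."""

    def __init__(self, unhealthy_after: int = 3) -> None:
        self.unhealthy_after = unhealthy_after
        self.failures = {}  # instance name -> consecutive failed checks

    def record(self, instance: str, ok: bool) -> None:
        """Store one check result; a success resets the failure counter."""
        self.failures[instance] = 0 if ok else self.failures.get(instance, 0) + 1

    def in_rotation(self) -> list:
        """Instances that should still receive traffic."""
        return sorted(
            name for name, count in self.failures.items()
            if count < self.unhealthy_after
        )


tracker = HealthTracker(unhealthy_after=2)
tracker.record("app-1", ok=True)
tracker.record("app-2", ok=False)
tracker.record("app-2", ok=False)
print(tracker.in_rotation())
```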

RPO vs RTO — what do they mean?

medium, dr, rpo, rto, +1

Open question

Answer

RPO (Recovery Point Objective) is how much data you can afford to lose, expressed as a window of time before the failure. RTO (Recovery Time Objective) is how long recovery is allowed to take. They drive backup frequency, replication, and DR design.
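
A short worked example may help separate the two. The timestamps and the one-hour objectives below are made up for illustration, and `check_recovery` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def check_recovery(last_backup, failure, restored, rpo, rto) -> dict:
    """Compare one incident against the RPO and RTO targets."""
    data_loss = failure - last_backup  # work done since the last backup is lost
    downtime = restored - failure      # time until service is back
    return {
        "data_loss": data_loss,
        "downtime": downtime,
        "rpo_met": data_loss <= rpo,
        "rto_met": downtime <= rto,
    }


report = check_recovery(
    last_backup=datetime(2024, 1, 1, 12, 0),
    failure=datetime(2024, 1, 1, 12, 45),
    restored=datetime(2024, 1, 1, 14, 15),
    rpo=timedelta(hours=1),
    rto=timedelta(hours=1),
)
print(report)
```

In this made-up incident 45 minutes of data is lost, which fits a one-hour RPO, but the 90 minutes of downtime misses a one-hour RTO.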

Disaster recovery: backup/restore vs warm standby vs active-active — what’s the trade-off?

hard, disaster-recovery, failover, multi-region, +1

Open question

Answer

Backup/restore is cheapest but has higher RTO/RPO. Warm standby keeps a smaller live environment ready to scale during failover (better RTO). Active-active runs in multiple regions at once (best RTO, often best availability) but is the most complex and expensive.

Why separate environments/accounts for prod vs dev (and what do you gain)?

hard, environments, security, blast-radius, +1

Open question

Answer

Separation reduces blast radius and prevents accidents (e.g., deleting prod resources). It improves security and compliance, makes costs clearer, and lets you apply stricter policies/approvals in production.

What is a VPC (virtual private cloud) and why do you need it?

easy, cloud, networking, vpc, +1

Open question

Answer

A VPC is a private, isolated network in the cloud where you define IP ranges, subnets, routes, and firewall rules. You use it to control who can talk to what (e.g., keep databases private) and to connect securely to other networks (VPN/peering).
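
Defining IP ranges and subnets can be sketched with Python's standard `ipaddress` module. The CIDR block `10.0.0.0/16` and the subnet names are assumptions chosen for the example; no cloud API is called here.

```python
import ipaddress


def plan_subnets(vpc_cidr: str, new_prefix: int, names: list) -> dict:
    """Carve equally sized, non-overlapping subnets out of a VPC range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # generator of child networks
    return dict(zip(names, blocks))


# Hypothetical layout: one public subnet and two private ones.
plan = plan_subnets("10.0.0.0/16", 24, ["public-a", "private-app-a", "private-db-a"])
for name, network in plan.items():
    print(name, network, network.num_addresses)

# Check which subnet a given address belongs to.
print(ipaddress.ip_address("10.0.2.15") in plan["private-db-a"])
```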

Public vs private subnet: what is the difference (in practice)?

medium, cloud, networking, subnet, +1

Open question

Answer

A public subnet has a route to an Internet Gateway, so instances can be reached from the internet (with correct firewall rules). A private subnet has no direct inbound internet route; it’s commonly used for app servers and databases. Often the load balancer is public, while app/DB stay private.

What is a NAT gateway and when do you need it?

medium, cloud, networking, nat, +1

Open question

Answer

A NAT gateway lets instances in a private subnet make outbound connections to the internet (updates, external APIs) while staying unreachable from inbound internet traffic. It’s a common pattern: private app servers + NAT for outbound, public load balancer for inbound.

Multi-tenant vs single-tenant SaaS: what's the trade-off?

hard, cloud, saas, multi-tenant, +1

Open question

Answer

Multi-tenant shares infrastructure between customers (cheaper and easier to scale), but isolation and noisy-neighbor risks are harder. Single-tenant gives stronger isolation and simpler “per customer” limits, but costs more and is operationally heavier. Many systems use a hybrid approach.

Rate limiting in the cloud: where can you enforce it and why?

hard, cloud, rate-limiting, waf, +1

Open question

Answer

Rate limiting protects your system from abuse and traffic spikes (often returning HTTP 429). You can enforce it at the edge (CDN/WAF), API gateway/load balancer, and in the app itself. Earlier enforcement saves resources, but the app still needs safeguards because not all traffic comes through one entry point.
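
For the in-app safeguard, a token bucket is one common technique (chosen here as an example; the answer above does not prescribe an algorithm). The sketch takes the current time as a parameter so it stays deterministic; the class and function names are hypothetical.

```python
class TokenBucket:
    """Allows bursts up to `capacity` and a steady `rate` of requests per second."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated_at = now

    def allow(self, now: float) -> bool:
        """Refill tokens for the elapsed time, then try to spend one."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = max(self.updated_at, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def handle_request(bucket: TokenBucket, now: float) -> int:
    """Return an HTTP status: 200 if allowed, 429 if rate limited."""
    return 200 if bucket.allow(now) else 429


bucket = TokenBucket(rate=1.0, capacity=2.0)
print([handle_request(bucket, now=0.0) for _ in range(3)])
print(handle_request(bucket, now=1.0))
```

In practice there would be one bucket per client key (API key, IP, tenant), and a multi-instance service would need shared state for the counters.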

What is a container registry and why do teams use it?

easy, cloud, containers, registry, +1

Open question

Answer

A container registry stores and serves container images (like a “Git repo for images”). Teams use it to version images, scan them for vulnerabilities, control access, and deploy the same tested image to different environments.

Kubernetes Service vs Ingress vs LoadBalancer: what does each do?

medium, kubernetes, networking, ingress, +2

Open question

Answer

A Service gives stable networking to pods (stable name/IP) and load-balances inside the cluster. A LoadBalancer Service typically provisions a cloud L4 load balancer to expose the Service externally. Ingress is usually L7 HTTP routing (host/path rules, TLS) in front of Services, managed by an Ingress Controller.

Secrets rotation: how do you rotate credentials without downtime?

medium, cloud, security, secrets, +1

Open question

Answer

Use overlapping validity: create a new secret version, deploy apps that can use the new secret, then revoke the old one. Prefer short-lived credentials where possible. Make sure apps reload secrets safely (restart/sidecar/reload hook) and monitor failures during the rollout.
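
The overlapping-validity step can be illustrated with a verifier that accepts signatures from any currently active key. This is a sketch using HMAC signing as the example credential type; the key values and function names are hypothetical, and real keys would come from a secret manager.

```python
import hashlib
import hmac


def sign(key: bytes, payload: bytes) -> str:
    """Sign a payload with HMAC-SHA256."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(active_keys: list, payload: bytes, signature: str) -> bool:
    """Accept the signature if any currently active key produced it."""
    return any(
        hmac.compare_digest(sign(key, payload), signature)
        for key in active_keys
    )


old_key, new_key = b"old-key", b"new-key"  # placeholder values for the sketch
payload = b"order=42"
old_signature = sign(old_key, payload)

# 1) Overlap window: both keys are valid, so old clients keep working.
print(verify([new_key, old_key], payload, old_signature))
# 2) After every client has moved to the new key, revoke the old one.
print(verify([new_key], payload, old_signature))
```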

Object storage consistency: why can you sometimes read stale data and how do you design for it?

hard, cloud, storage, consistency, +1

Open question

Answer

Depending on the provider and operation (especially overwrites and listings), object storage can behave like an eventually consistent system, so you may not see the newest state immediately. Design for it by avoiding overwrites (use unique keys/versioning), using retries with backoff, and not relying on immediate “list shows everything” semantics.
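
Two of these design ideas, unique keys and retries with backoff, can be sketched as follows. The helper names are hypothetical, and `read` and `is_fresh` stand in for whatever storage client and freshness check an application actually uses.

```python
import hashlib
import time


def content_key(prefix: str, data: bytes) -> str:
    """Build a content-addressed key so new data never overwrites old objects."""
    return f"{prefix}/{hashlib.sha256(data).hexdigest()}"


def read_with_retry(read, is_fresh, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call `read` until `is_fresh(value)` holds, doubling the delay each time."""
    delay = base_delay
    for attempt in range(attempts):
        value = read()
        if is_fresh(value):
            return value
        if attempt < attempts - 1:
            sleep(delay)
            delay *= 2
    raise TimeoutError("object did not become visible in time")


# Simulated store that returns nothing on the first two reads.
responses = iter([None, None, b"report-v2"])
waits = []
value = read_with_retry(
    read=lambda: next(responses),
    is_fresh=lambda v: v is not None,
    sleep=waits.append,  # record the delays instead of really sleeping
)
print(value, waits)
```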

Kubernetes autoscaling: HPA vs Cluster Autoscaler - what does each scale?

hard, kubernetes, autoscaling, hpa, +1

Open question

Answer

HPA (Horizontal Pod Autoscaler) scales the number of pods based on metrics (CPU, memory, custom). Cluster Autoscaler scales the number of nodes in the cluster when pods can’t be scheduled due to lack of resources. They often work together: HPA adds pods, Cluster Autoscaler adds nodes if needed.
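
The interplay can be sketched numerically. The replica formula below follows the one in the Kubernetes HPA documentation (ceil of current replicas times current metric over target metric); the tolerance and stabilization behavior of the real controller are left out. The node calculation is a deliberately simplified assumption (identical pods and nodes, CPU only), not how Cluster Autoscaler actually simulates scheduling.

```python
import math


def hpa_desired_replicas(current_replicas, current_metric, target_metric,
                         min_replicas=1, max_replicas=10):
    """ceil(currentReplicas * currentMetric / targetMetric), clamped to bounds."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))


def nodes_needed(pods: int, pod_cpu_m: int, node_cpu_m: int) -> int:
    """Toy estimate: how many identical nodes fit `pods` by CPU request (millicores)."""
    pods_per_node = node_cpu_m // pod_cpu_m
    if pods_per_node == 0:
        raise ValueError("a single pod does not fit on a node")
    return -(-pods // pods_per_node)  # ceiling division


# 4 pods at 90% average CPU with a 60% target -> HPA asks for 6 pods.
replicas = hpa_desired_replicas(4, current_metric=90, target_metric=60)
# 6 pods requesting 500m each on nodes with 2000m allocatable -> 2 nodes.
print(replicas, nodes_needed(replicas, pod_cpu_m=500, node_cpu_m=2000))
```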

Public vs private subnet: what’s the difference and why use a NAT?

medium, cloud, networking, subnet, +1

Open question

Answer

Public subnets can route to the internet (via an Internet Gateway). Private subnets have no direct inbound internet access. A NAT gateway lets instances in private subnets initiate outbound connections (e.g., to fetch updates) without being publicly reachable.

IAM users vs roles: what’s the difference and how does least privilege apply?

medium, cloud, iam, security, +1

Open question

Answer

Users are identities for people or long‑lived credentials. Roles are assumed by services or users for temporary access. Least privilege means granting only the minimal permissions needed, ideally via role‑based, time‑bound access.

What is a CDN and when should you use it?

easy, cloud, cdn, performance, +1

Open question

Answer

A CDN caches content at edge locations close to users, reducing latency and offloading origin servers. Use it for static assets, large files, and global audiences to improve performance and resilience.

Blue/green vs canary deployments: what’s the difference?

medium, cloud, deployment, blue-green, +1

Open question

Answer

Blue/green keeps two full environments and switches traffic from old to new in one step (fast rollback). Canary rolls out to a small percentage first and gradually increases, reducing risk but requiring monitoring and staged rollout logic.

Stateless vs stateful services in the cloud: why does it matter?

medium, cloud, stateless, stateful, +1

Open question

Answer

Stateless services don’t keep user/session state in memory, which makes them easy to scale and replace. Stateful services keep local state, so scaling requires sticky sessions, shared storage, or careful replication. Stateless designs are generally more cloud‑friendly.

RTO vs RPO: what do these disaster‑recovery metrics mean?

medium, cloud, disaster-recovery, rto, +1

Open question

Answer

RTO (Recovery Time Objective) is how quickly you must restore service after an outage. RPO (Recovery Point Objective) is how much data loss is acceptable (time between last good backup and failure).

Observability: how do metrics, logs, and traces differ?

medium, cloud, observability, metrics, +2

Open question

Answer

Metrics are numeric time‑series (e.g., latency, error rate). Logs are detailed events with context. Traces link events across services to show end‑to‑end request flow. Together they help detect, diagnose, and explain incidents.

On‑demand vs reserved vs spot instances: what are the trade‑offs?

medium, cloud, pricing, on-demand, +2

Open question

Answer

On‑demand is flexible but most expensive. Reserved (or savings plans) are cheaper for steady workloads but require commitment. Spot is the cheapest but can be interrupted, so it’s best for fault‑tolerant or batch jobs.
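
The on-demand versus reserved trade-off comes down to utilization, which a little arithmetic can show. The hourly rates below are invented for the example and are not real provider prices; the sketch also assumes reserved capacity is billed for every hour of the month whether it is used or not. Spot pricing is not modeled because it varies over time.

```python
HOURS_IN_MONTH = 730  # common approximation for an average month


def break_even_utilization(on_demand_rate: float, reserved_rate: float) -> float:
    """Fraction of the month above which reserved capacity becomes cheaper."""
    return reserved_rate / on_demand_rate


def cheaper_option(hours_used: float, on_demand_rate: float, reserved_rate: float) -> str:
    """Compare paying per used hour against paying for the whole month."""
    on_demand_cost = on_demand_rate * hours_used
    reserved_cost = reserved_rate * HOURS_IN_MONTH
    return "on-demand" if on_demand_cost < reserved_cost else "reserved"


# Hypothetical rates: 0.10 per hour on-demand, 0.06 per hour reserved.
print(break_even_utilization(0.10, 0.06))
print(cheaper_option(300, 0.10, 0.06))
print(cheaper_option(600, 0.10, 0.06))
```

With these made-up rates, an instance that runs for more than roughly 60% of the month is cheaper on a reservation.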