Databases
Recruitment and knowledge question base.
medium · hibernate · orm · performance · +1
Answer
The N+1 problem appears when Hibernate loads N parent entities with one query, and then lazily loads each parent’s children with an additional query, resulting in N extra selects. Fix it with fetch joins/entity graphs, batch fetching, or adjusting fetch strategy.
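A minimal sketch of the fetch-join fix in JPQL (Author and its books collection are hypothetical entity names):

```sql
-- N+1 pattern: one query for the parents...
--   SELECT a FROM Author a
-- ...followed by one lazy query per author for a.books (N extra selects).

-- Fetch join: a single query loads parents and children together
SELECT DISTINCT a
FROM Author a
JOIN FETCH a.books
```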
medium · transaction · acid · database-theory
Answer
ACID describes transactional guarantees: Atomicity (all‑or‑nothing), Consistency (invariants preserved), Isolation (concurrent transactions behave as if serialized), and Durability (committed data survives crashes).
hard · indexing · b-tree · hash · +1
Answer
B‑tree indexes keep keys ordered, so they support equality lookups, range scans, prefix queries and ORDER BY efficiently. Hash indexes map keys to buckets for fast equality lookups, but they don’t support ranges or ordering.
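In PostgreSQL, for example, the index type can be chosen explicitly (users is a hypothetical table):

```sql
-- B-tree (the default): supports =, <, >, BETWEEN, ORDER BY, prefix LIKE
CREATE INDEX idx_users_email_btree ON users (email);

-- Hash: equality lookups only, no range scans or ordering
CREATE INDEX idx_users_email_hash ON users USING HASH (email);
```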
hard · normalization · database-design · theory
Answer
Normalization reduces redundancy. 1NF requires atomic columns and no repeating groups. 2NF is 1NF plus every non‑key attribute depends on the whole key. 3NF is 2NF plus no transitive dependencies between non‑key attributes.
medium · sql · nosql · comparison · +1
Answer
SQL databases are relational with fixed schemas, joins and strong ACID transactions. NoSQL databases are non‑relational (document, key‑value, column, graph), often with flexible schemas and easier horizontal scaling, sometimes trading strict consistency for availability/performance.
easy · primary-key · unique · index · +1
Answer
A primary key uniquely identifies a row (and can be referenced by FKs). A unique constraint also enforces uniqueness, but isn’t “the” row identity. An index is a data structure that speeds up reads; it may be unique or not.
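The three concepts side by side, as a sketch (users is a hypothetical table):

```sql
CREATE TABLE users (
    id    BIGINT PRIMARY KEY,  -- row identity: unique, implies NOT NULL
    email TEXT UNIQUE,         -- uniqueness enforced, but not "the" identity
    name  TEXT
);

-- A plain (non-unique) index: purely a read-performance structure
CREATE INDEX idx_users_name ON users (name);
```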
easy · transaction · acid · consistency
Answer
A transaction groups multiple operations into one unit of work: either all succeed or all are rolled back. It protects data consistency, especially with concurrent users and failures.
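The classic illustration is a money transfer, sketched here against a hypothetical accounts table:

```sql
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;  -- both updates take effect together; on failure or ROLLBACK, neither does
```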
easy · index · performance · tradeoffs
Answer
An index speeds up reads (filtering/sorting) by avoiding full table scans. The trade-off is slower writes (INSERT/UPDATE/DELETE) and extra storage because the index must be maintained.
medium · sql · join · inner-join · +1
Answer
INNER JOIN returns only rows that match in both tables. LEFT JOIN returns all rows from the left table and fills missing matches from the right side with NULLs.
SELECT u.id, p.id
FROM users u
LEFT JOIN posts p ON p.user_id = u.id;
medium · normalization · denormalization · schema-design
Answer
Normalization reduces redundancy and update anomalies by splitting data into related tables. Denormalization duplicates some data to speed up reads and reduce joins, at the cost of harder writes and consistency management.
medium · transactions · isolation · concurrency
Answer
Isolation defines what anomalies are allowed when transactions run concurrently (dirty/non-repeatable reads, phantoms). Higher isolation gives stronger correctness but may reduce concurrency and performance.
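Most databases let you pick the level per transaction; a sketch against a hypothetical accounts table:

```sql
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- Repeated reads of the same row now see the same value
-- even if another transaction commits a change in between.
SELECT balance FROM accounts WHERE id = 1;
SELECT balance FROM accounts WHERE id = 1;
COMMIT;
```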
hard · n-plus-one · orm · performance
Answer
It’s when you load N parent rows and then run one extra query per parent (N more queries). Avoid it with joins, batching, prefetching (eager loading), or separate “IN (...)” queries.
hard · locking · optimistic · pessimistic · +1
Answer
Optimistic locking assumes conflicts are rare: you update with a version/timestamp check and retry on conflict. Pessimistic locking locks rows upfront (e.g., `SELECT ... FOR UPDATE`) to prevent others from changing them.
SELECT *
FROM accounts
WHERE id = 1
FOR UPDATE;
hard · replication · sharding · scaling
Answer
Replication copies the same data to multiple nodes (better read scale and availability). Sharding splits data across nodes (better write/size scale), but makes queries and transactions more complex.
hard · explain · query-plan · performance
Answer
Start with the most expensive nodes (cost/time), the scan types (Seq Scan vs Index Scan), and whether row estimates match actual rows. Then check joins/sorts and whether indexes are used as expected; fix with better queries, indexes, or fresher statistics.
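In PostgreSQL, for example (orders is a hypothetical table):

```sql
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;
-- In the output, compare: Seq Scan vs Index Scan, estimated rows vs actual rows,
-- and which node dominates total time.
```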
easy · foreign-key · constraints · integrity
Answer
A foreign key is a constraint that links a column to a referenced primary/unique key in another table. It enforces referential integrity: you can’t reference a row that doesn’t exist (and it can control delete/update behavior).
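A sketch with hypothetical users and orders tables, including delete behavior:

```sql
CREATE TABLE orders (
    id      BIGINT PRIMARY KEY,
    user_id BIGINT NOT NULL REFERENCES users (id) ON DELETE CASCADE
);
-- Inserting an order with a non-existent user_id now fails;
-- deleting a user also removes that user's orders (CASCADE).
```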
medium · sql · group-by · having · +1
Answer
GROUP BY groups rows for aggregation. HAVING filters groups after aggregation (while WHERE filters rows before grouping). Example: “only groups with count > 10”.
SELECT country, COUNT(*) AS cnt
FROM users
GROUP BY country
HAVING COUNT(*) > 10;
medium · index · covering-index · performance
Answer
A covering index contains all columns needed by a query, so the database can answer it using only the index (no table/heap lookup). It can significantly speed up reads, at the cost of larger indexes and slower writes.
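In PostgreSQL this can be written with INCLUDE (orders is a hypothetical table):

```sql
-- Postgres syntax: INCLUDE adds non-key columns so the query below
-- can be answered from the index alone (an index-only scan)
CREATE INDEX idx_orders_user ON orders (user_id) INCLUDE (status, total);

SELECT status, total FROM orders WHERE user_id = 42;
```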
hard · deadlock · locking · transactions · +1
Answer
A deadlock is when transactions wait on each other’s locks in a cycle, so none can proceed. Reduce it by keeping transactions short, locking rows in a consistent order, and retrying on deadlock errors.
hard · sql-injection · security · prepared-statements
Answer
SQL injection is when untrusted input changes the meaning of your SQL (e.g., by concatenating strings). Prevent it with parameterized queries/prepared statements, proper escaping by the driver/ORM, and least-privilege DB accounts.
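The same idea at the SQL level, using PostgreSQL's PREPARE/EXECUTE (users is a hypothetical table):

```sql
-- Vulnerable pattern (in app code): input concatenated into the statement
--   "SELECT * FROM users WHERE name = '" + input + "'"

-- Safe: the value is bound as a parameter and is never parsed as SQL
PREPARE find_user (text) AS
    SELECT * FROM users WHERE name = $1;

EXECUTE find_user('Robert');  -- even input like '; DROP TABLE users;-- stays a plain string
```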
easy · index · composite-index · performance
Answer
A composite index indexes multiple columns together (e.g., (org_id, email)). It helps when queries filter/sort by the leftmost prefix of those columns, reducing scans and speeding up lookups.
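The leftmost-prefix rule in practice (users is a hypothetical table):

```sql
CREATE INDEX idx_users_org_email ON users (org_id, email);

-- Can use the index (leftmost prefix):
SELECT * FROM users WHERE org_id = 7 AND email = 'a@example.com';
SELECT * FROM users WHERE org_id = 7;

-- Cannot use it efficiently (skips the leading column):
SELECT * FROM users WHERE email = 'a@example.com';
```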
medium · sql · window-functions · row_number
Answer
A window function computes a value over a “window” of rows related to the current row without collapsing rows like GROUP BY. Use cases: ranking (ROW_NUMBER), running totals, “top N per group”.
SELECT user_id,
       amount,
       ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at DESC) AS rn
FROM payments;
medium · constraints · triggers · integrity
Answer
Constraints (PK, FK, UNIQUE, CHECK) are declarative rules enforced by the DB engine. Triggers are custom code that runs on events. Prefer constraints for data integrity when possible because they’re simpler, predictable, and optimized.
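A sketch of declarative constraints on a hypothetical payments table:

```sql
-- Rules the engine enforces on every write, no custom code needed
CREATE TABLE payments (
    id     BIGINT PRIMARY KEY,
    amount NUMERIC NOT NULL CHECK (amount > 0),
    status TEXT NOT NULL CHECK (status IN ('pending', 'paid', 'failed'))
);
```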
hard · idempotency · unique-constraint · upsert · +1
Answer
Use a unique constraint on an idempotency key (or natural key) and upsert/insert-with-conflict-handling. If the same request arrives twice, the second write becomes a no-op or updates the same row instead of creating a duplicate.
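A Postgres-flavored sketch (payments and its idempotency_key column are hypothetical):

```sql
CREATE UNIQUE INDEX uq_payments_idem ON payments (idempotency_key);

-- A retried request with the same key becomes a no-op instead of a duplicate
INSERT INTO payments (idempotency_key, amount)
VALUES ('req-123', 100)
ON CONFLICT (idempotency_key) DO NOTHING;
```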
hard · soft-delete · data-retention · auditing
Answer
Soft delete marks a row as deleted (e.g., `deleted_at`) so you can restore/audit, but it complicates queries and indexes (you must filter out deleted rows). Hard delete removes data and simplifies queries, but you lose history unless you archive elsewhere.
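A sketch of the soft-delete pattern on a hypothetical users table, including the query/index consequence:

```sql
-- Soft delete: keep the row, mark it
UPDATE users SET deleted_at = now() WHERE id = 1;

-- Every read must now exclude deleted rows; a partial index (Postgres) keeps this cheap
CREATE INDEX idx_users_active ON users (email) WHERE deleted_at IS NULL;
SELECT * FROM users WHERE email = 'a@example.com' AND deleted_at IS NULL;
```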
easy · sql · best-practices · performance · +1
Answer
`SELECT *` fetches more data than you need (more I/O and memory) and couples code to schema changes (adding a column can change results, payload size, or expose sensitive fields). Selecting only needed columns is clearer and can enable better index-only plans.
medium · sql · null · three-valued-logic · +1
Answer
NULL means “unknown”, so comparisons like `col = NULL` evaluate to UNKNOWN, not true/false (3-valued logic). Use `IS NULL` / `IS NOT NULL`, and in some DBs `IS DISTINCT FROM` to compare safely with NULLs.
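The difference in practice (users with a nullable nickname column is hypothetical):

```sql
SELECT * FROM users WHERE nickname = NULL;                 -- always 0 rows: the predicate is UNKNOWN
SELECT * FROM users WHERE nickname IS NULL;                -- correct NULL test
SELECT * FROM users WHERE nickname IS DISTINCT FROM 'bob'; -- NULL-safe comparison (where supported)
```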
medium · scaling · partitioning · sharding · +1
Answer
Partitioning splits one logical table into smaller parts, usually within one database system (helps manage data and can speed up queries via partition pruning). Sharding splits data across multiple database nodes/instances to scale out. Partitioning is often simpler; sharding adds distributed complexity.
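Declarative partitioning in PostgreSQL, as a sketch (events is a hypothetical table):

```sql
-- One logical table, many physical parts
CREATE TABLE events (
    id         BIGINT,
    created_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2024 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
-- Queries filtering on created_at can prune partitions they don't need.
```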
hard · views · materialized-view · performance · +1
Answer
A materialized view stores the result of a query physically, so reads can be much faster than recomputing the query each time. You use it for expensive aggregations or reporting, accepting the trade-off: refresh cost and (often) slightly stale data.
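A PostgreSQL-flavored sketch (payments is a hypothetical table):

```sql
CREATE MATERIALIZED VIEW daily_revenue AS
SELECT date_trunc('day', created_at) AS day, SUM(amount) AS total
FROM payments
GROUP BY 1;

-- Reads hit the precomputed result; data is stale until refreshed:
REFRESH MATERIALIZED VIEW daily_revenue;
```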
hard · performance · indexes · write-amplification · +1
Answer
Write amplification means one logical write causes multiple physical writes: the table row plus every affected index (and often WAL/redo logs). More indexes usually speed up reads but make inserts/updates/deletes slower and increase storage and maintenance cost.
easy · indexes · selectivity · performance · +1
Answer
Selectivity is how well a column filters rows (how many rows match a predicate). A highly selective index (few matches) is more useful, because the DB can avoid scanning many rows. Low-selectivity columns (like boolean flags) often don’t benefit much from a standalone index.
medium · optimizer · statistics · cardinality · +1
Answer
The optimizer picks a plan based on estimated row counts (cardinality). If estimates are wrong (stale stats, skewed data, correlated columns), it can choose the wrong join order or algorithm. Updating statistics (e.g., ANALYZE) and using appropriate indexes helps the optimizer estimate better.
medium · sql · like · indexes · +1
Answer
A leading wildcard (`%term`) often prevents using a normal B-tree index, so the DB may scan many rows. Alternatives include full-text search indexes, trigram indexes (where supported), or changing queries to use prefix search (`term%`) when possible.
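One common workaround in PostgreSQL is the pg_trgm extension (products is a hypothetical table):

```sql
-- A leading wildcard defeats a normal B-tree index:
SELECT * FROM products WHERE name LIKE '%phone%';

-- Postgres pg_trgm: a trigram GIN index that can serve such queries
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_products_name_trgm ON products USING GIN (name gin_trgm_ops);
```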
hard · migrations · schema-change · zero-downtime · +1
Answer
Use an expand/contract approach: add a new column, backfill data in batches, write to both (or keep them in sync), switch reads to the new column, then remove the old one. This avoids long blocking locks and lets you roll out safely.
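A Postgres-flavored sketch of the expand phase (users and email_normalized are hypothetical):

```sql
-- Expand: add the new column without rewriting the table
ALTER TABLE users ADD COLUMN email_normalized TEXT;

-- Backfill in small batches to avoid long locks (repeat until 0 rows updated)
UPDATE users
SET email_normalized = lower(email)
WHERE id IN (
    SELECT id FROM users WHERE email_normalized IS NULL LIMIT 1000
);

-- Contract (only after the app reads and writes the new column):
-- ALTER TABLE users DROP COLUMN email;
```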
hard · transactions · locks · mvcc · +2
Answer
Long transactions can hold locks for a long time, block other queries, and increase contention. In MVCC databases they can also prevent cleanup of old row versions, leading to bloat. They can also increase replication lag and make failures harder to recover from.
easy · database · constraints · primary-key · +1
Answer
A primary key uniquely identifies a row, implies NOT NULL, and a table has exactly one. A unique constraint also enforces uniqueness, but a table can have several, and they may allow NULLs depending on the DB.
medium · database · foreign-key · integrity · +1
Answer
Foreign keys enforce referential integrity (no orphan rows) and can simplify reasoning. The trade‑off is extra write overhead and sometimes more complex migrations. In high‑throughput systems, teams sometimes validate in the app instead of the DB.
medium · database · indexes · composite-index · +1
Answer
Most databases use the left‑most prefix of a composite index. That means an index on (A, B, C) can speed up queries on A or A,B, but not on B alone. Choose the order based on common filters and selectivity.
medium · database · indexes · covering-index · +1
Answer
A covering index contains all columns needed by a query, so the DB can answer it using only the index without fetching table rows. This reduces I/O and can be much faster, especially on large tables.
medium · database · transactions · locks · +1
Answer
A deadlock happens when two transactions wait on each other’s locks. Databases detect it and abort (rollback) one transaction so the other can continue. Apps should retry aborted transactions.
hard · database · transactions · isolation · +1
Answer
Read Committed prevents dirty reads but allows non‑repeatable reads. Repeatable Read guarantees the same rows won’t change during the transaction but can allow phantom rows. Serializable is the strictest; it behaves as if transactions run one‑by‑one, preventing phantoms but reducing concurrency.
hard · database · partitioning · sharding · +1
Answer
Partitioning splits a table into parts within a single database instance (often for management/performance). Sharding splits data across multiple database servers for horizontal scaling. Sharding adds complexity around routing and cross‑shard queries.
medium · database · denormalization · performance · +1
Answer
Denormalization duplicates data to speed up reads and reduce joins. It can improve performance for read‑heavy workloads, but increases storage, risks inconsistency, and makes writes/migrations more complex.
easy · database · oltp · olap · +1
Answer
OLTP systems handle many small, short transactions (operational workloads). OLAP systems run fewer, heavy analytical queries over large datasets (reporting/BI). Schema design, indexing, and storage often differ.
easy · database · transactions · autocommit · +1
Answer
With autocommit, each statement is its own transaction. Explicit transactions group multiple statements into one atomic unit, which matters for consistency and performance (fewer commits/round‑trips). Use explicit transactions for multi‑step changes that must succeed or fail together.