Databases · medium

Why can the optimizer choose a bad query plan and how do statistics help?

Answer

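A cost-based optimizer picks the plan with the lowest estimated cost, and that cost is driven largely by estimated row counts (cardinality). When the estimates are wrong (stale statistics, skewed data, correlated columns treated as independent), it can choose the wrong join order, join algorithm, or access path, for example a nested loop over what turns out to be millions of rows. Refreshing statistics (e.g., ANALYZE) gives the optimizer current row counts and value distributions, so its estimates improve. Appropriate indexes do not improve the estimates themselves, but they give the optimizer cheaper access paths to choose from.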
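To make the effect of statistics concrete, here is a minimal sketch that compares the optimizer's estimate with the real row count before and after refreshing statistics. It assumes PostgreSQL; the `orders` table and its data are hypothetical and exist only for illustration. Background autovacuum may analyze the table on its own, so the "before" plan is not guaranteed to show a bad estimate.

```sql
-- Hypothetical table for illustration (PostgreSQL syntax assumed)
CREATE TABLE orders (
    id     bigint PRIMARY KEY,
    status text   NOT NULL
);

-- Skewed data: 1,000,000 rows, of which 1 in 1,000 is 'pending'
INSERT INTO orders (id, status)
SELECT g,
       CASE WHEN g % 1000 = 0 THEN 'pending' ELSE 'shipped' END
FROM generate_series(1, 1000000) AS g;

-- Before statistics are collected, the planner relies on default assumptions.
-- The "rows=" figure in the plan is its estimate.
EXPLAIN
SELECT * FROM orders WHERE status = 'pending';

-- Collect row counts and per-column value distributions
ANALYZE orders;

-- EXPLAIN ANALYZE executes the query and prints estimated and actual rows
-- side by side, which is the quickest way to spot a misestimate.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE status = 'pending';
```

A large gap between estimated and actual rows on a plan node is usually the first thing to look for when a plan seems wrong.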

Advanced answer

Deep dive

Expanding on the short answer — what usually matters in practice:

Context (tags): optimizer, statistics, cardinality, performance
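Data model and access patterns: identify the dominant queries (read/write ratio, sorting, pagination), since those are the plans worth checking first.
Indexes: when they help vs hurt. An index gives the optimizer a cheaper access path for selective predicates, at the cost of write amplification and memory.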
Consistency & transactions: what’s guaranteed and what can bite you.
Explain the "why", not just the "what" (intuition + consequences).
Trade-offs: what you gain/lose (time, memory, complexity, risk).
Edge cases: empty inputs, large inputs, invalid inputs, concurrency.
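Correlated columns, mentioned in the short answer, are a case where per-column statistics alone are not enough. The sketch below assumes PostgreSQL 10 or later, where `CREATE STATISTICS` can record functional dependencies between columns. The `addresses` table, its data, and the statistics name are hypothetical.

```sql
-- Hypothetical table in which zip fully determines city
CREATE TABLE addresses (
    id   bigint PRIMARY KEY,
    city text   NOT NULL,
    zip  text   NOT NULL
);

-- 100,000 rows; city and zip are derived from the same value,
-- so they are perfectly correlated
INSERT INTO addresses (id, city, zip)
SELECT g, 'city_' || (g % 100), 'zip_' || (g % 100)
FROM generate_series(1, 100000) AS g;

ANALYZE addresses;

-- With per-column statistics only, the planner treats the two predicates
-- as independent and multiplies their selectivities, so it tends to
-- estimate far fewer rows than the 1,000 that actually match.
EXPLAIN
SELECT * FROM addresses WHERE city = 'city_7' AND zip = 'zip_7';

-- Extended statistics: tell the planner about the dependency
CREATE STATISTICS addresses_city_zip (dependencies)
    ON city, zip FROM addresses;

-- Extended statistics are populated by the next ANALYZE
ANALYZE addresses;

-- Re-check the estimate for the same query
EXPLAIN
SELECT * FROM addresses WHERE city = 'city_7' AND zip = 'zip_7';
```

The trade-off is extra work during `ANALYZE` and at planning time, so extended statistics are worth adding only for column combinations that appear together in real predicates.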

Examples

A tiny example (query shape):

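-- Example: index + query shape
-- Assumes an index on users(email); with it, the optimizer can use an
-- index lookup for this equality predicate instead of scanning the table.
SELECT *
FROM users
WHERE email = 'user@example.com'  -- placeholder address
LIMIT 1;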
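To check which plan the optimizer actually picks for this query shape, a sketch along these lines can help. It assumes PostgreSQL and an existing `users` table with an `email` column; the index name `users_email_idx` and the address are placeholders.

```sql
-- Hypothetical index name; IF NOT EXISTS makes the statement safe to re-run
CREATE INDEX IF NOT EXISTS users_email_idx ON users (email);

-- Refresh statistics so the planner knows how selective email is
ANALYZE users;

-- Show the chosen plan without executing the query.
-- On a large table an index scan is the expected choice; on a very small
-- table a sequential scan can legitimately be cheaper.
EXPLAIN
SELECT *
FROM users
WHERE email = 'user@example.com'
LIMIT 1;
```

If the plan still shows a sequential scan on a large table, comparing estimated and actual rows with `EXPLAIN ANALYZE` is a reasonable next step.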

Common pitfalls

Too generic: no concrete trade-offs or examples.
Mixing average-case and worst-case (e.g., complexity).
Ignoring constraints: memory, concurrency, network/disk costs.

Interview follow-ups

When would you choose an alternative and why?
What production issues show up and how do you diagnose them?
How would you test edge cases?
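For the diagnosis question above, one possible starting point is to check how stale the statistics are. This sketch assumes PostgreSQL and uses its `pg_stat_user_tables` and `pg_stats` views; the `orders` table and `status` column in the second query are hypothetical stand-ins for whatever table is under suspicion.

```sql
-- Tables with the most row changes since statistics were last collected
SELECT relname,
       n_live_tup,
       n_mod_since_analyze,
       last_analyze,
       last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_mod_since_analyze DESC
LIMIT 10;

-- What the planner currently believes about one column's distribution
-- ('orders' and 'status' are placeholder names)
SELECT attname,
       n_distinct,
       most_common_vals,
       most_common_freqs
FROM pg_stats
WHERE tablename = 'orders'
  AND attname   = 'status';
```

A table with many modifications since its last analyze, or a most-common-values list that no longer matches the real data, is a likely source of bad estimates.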
