With many instances, an in-memory counter limits only the instance that holds it, so overall traffic can exceed the limit: N instances that each allow L requests can together admit up to N × L. You usually need a shared store (e.g., Redis) or to enforce limits at the gateway. Hard parts: correctness under concurrency, time windows, clock drift, and avoiding hot keys.
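To make the gap concrete, here is a minimal sketch that simulates three instances inside one process. The names (`LocalLimiter`, `SharedLimiter`, `countAdmitted`) are hypothetical, and a plain object stands in for a shared store such as Redis; it only illustrates the counting problem, not a production limiter.

```javascript
// All names here are hypothetical; instances are simulated in one process.
const LIMIT = 5; // requests allowed per window for the whole service

// Per-instance limiter: each instance counts only its own requests.
class LocalLimiter {
  constructor(limit) {
    this.limit = limit;
    this.count = 0;
  }
  allow() {
    if (this.count >= this.limit) return false;
    this.count += 1;
    return true;
  }
}

// Shared limiter: every instance reads and updates the same counter.
// The plain object is an assumption standing in for a shared store.
// Check-then-increment is safe here only because this demo is
// single-threaded; a real shared store needs an atomic operation.
class SharedLimiter {
  constructor(limit, store) {
    this.limit = limit;
    this.store = store; // { count: number }, shared by all instances
  }
  allow() {
    if (this.store.count >= this.limit) return false;
    this.store.count += 1;
    return true;
  }
}

// Send `total` requests round-robin across the instances; count admits.
function countAdmitted(limiters, total) {
  let admitted = 0;
  for (let i = 0; i < total; i++) {
    if (limiters[i % limiters.length].allow()) admitted += 1;
  }
  return admitted;
}

const local = [1, 2, 3].map(() => new LocalLimiter(LIMIT));
const store = { count: 0 };
const shared = [1, 2, 3].map(() => new SharedLimiter(LIMIT, store));

const localAdmitted = countAdmitted(local, 30);   // 15: 3 instances x 5 each
const sharedAdmitted = countAdmitted(shared, 30); // 5: one global budget
console.log({ localAdmitted, sharedAdmitted });
```

With local counters the service admits three times the intended limit; with the shared counter it admits exactly the limit, at the cost of one store access per request.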
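Expanding on the short answer, what usually matters in practice: the check-and-increment on the shared counter must be atomic, or two concurrent requests can both pass a check that only one should pass; instances must agree on where each time window starts and ends, which local clocks that drift cannot guarantee; and a single very busy key can concentrate load on one node of the shared store.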
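One common way to handle concurrency, time windows, and clock drift together is a fixed-window counter that increments first and compares afterwards, with the window index derived from the store's clock instead of each instance's own. The sketch below assumes a hypothetical in-process `FakeSharedStore` in place of a real store; with Redis, the atomic step would typically be the `INCR` command, and old window keys would be given an expiry.

```javascript
// In-process stand-in for a shared store. All names are hypothetical.
class FakeSharedStore {
  constructor(nowMs) {
    this.nowMs = nowMs;        // the store's own clock, in milliseconds
    this.counters = new Map(); // key -> count (never expired in this demo)
  }
  now() {
    return this.nowMs;
  }
  advance(ms) {
    this.nowMs += ms;
  }
  // Increment and return the new value as one step. A real shared store
  // must make this atomic across all callers.
  incr(key) {
    const next = (this.counters.get(key) || 0) + 1;
    this.counters.set(key, next);
    return next;
  }
}

// Returns an allow(clientId) function, as one service instance would hold.
function makeFixedWindowLimiter(store, limit, windowMs) {
  return function allow(clientId) {
    // Use the store's clock so every instance agrees on the window,
    // even if the instances' local clocks have drifted apart.
    const windowIndex = Math.floor(store.now() / windowMs);
    const key = `rl:${clientId}:${windowIndex}`;
    // Increment first, then compare: there is no gap between a separate
    // read and write for a concurrent request to slip through.
    return store.incr(key) <= limit;
  };
}

const fakeStore = new FakeSharedStore(0);
const instanceA = makeFixedWindowLimiter(fakeStore, 3, 1000);
const instanceB = makeFixedWindowLimiter(fakeStore, 3, 1000);

// Four requests for the same client, spread over two instances.
const firstWindow = [instanceA("u1"), instanceB("u1"), instanceA("u1"), instanceB("u1")];
fakeStore.advance(1000); // move into the next window
const nextWindow = instanceB("u1");
console.log(firstWindow, nextWindow); // [ true, true, true, false ] true
```

This sketch does not address hot keys: every request for a given client still lands on a single key in the store.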
A tiny example (an explanation template):
// Example: discuss trade-offs for "distributed-rate-limiting:-why-is-it-harder-than"
function explain() {
  // Start from the core idea:
  return "With many instances, an in-memory counter limits only one instance, so overall traffic can exceed the limit.";
}