Engineering Notes

Short technical thoughts on reliability engineering and backend design decisions.

Why idempotency matters

Retries are normal in real networks. If a create endpoint is not idempotent, users pay with duplicates.
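
One common way to get this property is an idempotency key sent by the client. The sketch below is a toy in-memory version; OrderService, create_order, and the key format are illustrative names, and a real service would store keys durably.

```python
import uuid


class OrderService:
    """Toy in-memory service; a real one would persist keys durably."""

    def __init__(self):
        self._orders = {}   # order_id -> payload
        self._by_key = {}   # idempotency key -> order_id

    def create_order(self, idempotency_key, payload):
        # A retried request carries the same key, so return the first result
        # instead of creating a second order.
        if idempotency_key in self._by_key:
            return self._by_key[idempotency_key]
        order_id = str(uuid.uuid4())
        self._orders[order_id] = payload
        self._by_key[idempotency_key] = order_id
        return order_id

    def order_count(self):
        return len(self._orders)


service = OrderService()
first = service.create_order("key-123", {"sku": "A1", "qty": 2})
retry = service.create_order("key-123", {"sku": "A1", "qty": 2})
```

Both calls return the same order id and only one order exists, which is what a retrying client needs.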

Application-level locks cut down on races, but database constraints are still the final safety net for correctness.
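
A minimal sketch of the constraint acting as the last line of defense, using SQLite from the standard library. The users table and the register helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
)


def register(email):
    """Return True if the row was inserted, False if the email already exists."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a duplicate, even if an
        # application lock upstream failed to prevent the second attempt.
        return False


results = [register("a@example.com"), register("a@example.com")]
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The second call is rejected by the database itself, so correctness does not depend on every caller taking the lock.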

For cache reads, fail-open can preserve availability. For critical keyspaces, fail-closed protects integrity.
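
The split can be expressed as a per-keyspace decision in the read path. This is a sketch under assumptions: CacheUnavailable, BrokenCache, and the prefixes in FAIL_CLOSED_PREFIXES are hypothetical names chosen for illustration.

```python
class CacheUnavailable(Exception):
    """Raised by the (hypothetical) cache client when the cache is down."""


class BrokenCache:
    """Stand-in for a cache client during an outage."""

    def get(self, key):
        raise CacheUnavailable("cache is down")


# Hypothetical keyspaces where serving without the cache is not acceptable.
FAIL_CLOSED_PREFIXES = ("session:", "lock:")


def read(cache, key, load_from_source):
    try:
        value = cache.get(key)
        if value is not None:
            return value
    except CacheUnavailable:
        if key.startswith(FAIL_CLOSED_PREFIXES):
            raise  # fail-closed: surface the error for critical keys
        # fail-open: fall through to the source of truth
    return load_from_source(key)
```

Ordinary keys still resolve from the source of truth during an outage, while critical keys raise so the caller can refuse the operation.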

Public callbacks must verify signature, source, freshness, and replay keys before any state transition.
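
A hedged sketch of three of those checks (signature, freshness, replay) using HMAC-SHA256. The secret, the message layout, the 300-second window, and the in-memory nonce set are all assumptions; source verification (for example, an IP allowlist or mutual TLS) is left out and would run alongside these.

```python
import hashlib
import hmac
import time

SECRET = b"shared-secret"   # hypothetical; load from a secret store in practice
MAX_AGE_SECONDS = 300       # assumed freshness window
_seen_nonces = set()        # a real service would use a shared store with TTL


def sign(timestamp, nonce, body):
    message = f"{timestamp}.{nonce}.".encode() + body
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def verify_callback(timestamp, nonce, body, signature, now=None):
    """Return True only if the callback is authentic, fresh, and not replayed."""
    now = time.time() if now is None else now
    expected = sign(timestamp, nonce, body)
    if not hmac.compare_digest(expected, signature):
        return False  # bad signature
    if abs(now - timestamp) > MAX_AGE_SECONDS:
        return False  # stale or far-future timestamp
    if nonce in _seen_nonces:
        return False  # replayed request
    _seen_nonces.add(nonce)
    return True
```

Only after verify_callback returns True should the handler touch state.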

A cache outage should degrade performance, not correctness. Always preserve a reliable source-of-truth path.

Version bump invalidation avoids expensive wildcard deletes and reduces cache-stampede pressure.
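
A minimal sketch of the idea: the namespace version is part of every key, so bumping it makes old entries unreachable without scanning for them. VersionedCache is an illustrative in-memory stand-in; in a real cache the orphaned entries would age out through TTL or eviction.

```python
class VersionedCache:
    def __init__(self):
        self._store = {}      # full key -> value
        self._versions = {}   # namespace -> current version number

    def _key(self, namespace, key):
        version = self._versions.get(namespace, 0)
        return f"{namespace}:v{version}:{key}"

    def get(self, namespace, key):
        return self._store.get(self._key(namespace, key))

    def set(self, namespace, key, value):
        self._store[self._key(namespace, key)] = value

    def invalidate(self, namespace):
        # One counter increment replaces a wildcard delete over the namespace.
        self._versions[namespace] = self._versions.get(namespace, 0) + 1
```

Invalidation is a single write regardless of how many keys the namespace holds.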

Rate limiting that is too coarse blocks healthy traffic; limiting that is too weak lets request storms through.
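
One way to avoid the coarse case is a token bucket per client rather than one global limit. The sketch below passes the clock in explicitly so it is easy to reason about; the rate of 1 request per second and burst of 2 are arbitrary example values.

```python
class TokenBucket:
    def __init__(self, rate_per_sec, capacity, now=0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = now

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at the burst capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


buckets = {}  # client id -> TokenBucket


def allow_request(client_id, now):
    bucket = buckets.get(client_id)
    if bucket is None:
        bucket = buckets[client_id] = TokenBucket(1.0, 2, now)
    return bucket.allow(now)
```

A noisy client exhausts only its own bucket, so a quiet client arriving at the same moment is still served.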

Incorrect transaction boundaries can commit partial states. Rollback conditions must match real failure cases.
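
A small illustration with SQLite: both updates sit inside one transaction, and the rollback is triggered by the failure that can actually occur here, a CHECK constraint violation. The accounts table and transfer helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(src, dst, amount):
    """Move funds atomically; return False and change nothing on overdraft."""
    try:
        with conn:  # one transaction: commit both updates or neither
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

If each UPDATE were committed on its own, a failed debit would leave the credit in place, which is the partial state the note warns about.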

Bounded queues, overflow buffers, and adaptive batch sizing keep ingestion stable during traffic spikes.
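
A compact sketch of how the three pieces can fit together. Ingestor, its sizes, and the fill-ratio rule for batch sizing are illustrative assumptions, not a prescribed design.

```python
from collections import deque


class Ingestor:
    def __init__(self, queue_size, overflow_size):
        self.queue = deque()
        self.overflow = deque()
        self.queue_size = queue_size
        self.overflow_size = overflow_size
        self.dropped = 0

    def offer(self, item):
        """Accept an item if there is room; otherwise count it as dropped."""
        if len(self.queue) < self.queue_size:
            self.queue.append(item)
            return True
        if len(self.overflow) < self.overflow_size:
            self.overflow.append(item)
            return True
        self.dropped += 1
        return False

    def next_batch(self, min_batch=1, max_batch=2):
        # Adaptive sizing: the fuller the main queue, the larger the batch.
        fill = len(self.queue) / self.queue_size
        size = max(min_batch, int(max_batch * fill))
        batch = [self.queue.popleft() for _ in range(min(size, len(self.queue)))]
        # Drain the overflow buffer back into the main queue, keeping order.
        while self.overflow and len(self.queue) < self.queue_size:
            self.queue.append(self.overflow.popleft())
        return batch
```

Memory use is capped by queue_size plus overflow_size, and anything beyond that is rejected explicitly instead of growing without bound.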

Reserve quota before bulk sends and release unused amounts later to avoid oversubscription under concurrency.
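
The reserve-then-release pattern might look like the sketch below. Quota and send_bulk are hypothetical names, and the in-process lock stands in for whatever atomic counter a distributed setup would use.

```python
import threading


class Quota:
    def __init__(self, limit):
        self._limit = limit
        self._reserved = 0
        self._lock = threading.Lock()

    def reserve(self, amount):
        # Check and increment under one lock so concurrent callers
        # cannot both claim the same remaining capacity.
        with self._lock:
            if self._reserved + amount > self._limit:
                return False
            self._reserved += amount
            return True

    def release(self, amount):
        with self._lock:
            self._reserved = max(0, self._reserved - amount)

    @property
    def available(self):
        with self._lock:
            return self._limit - self._reserved


def send_bulk(quota, recipients, send):
    """Reserve for the whole batch up front, then hand back what was not used."""
    needed = len(recipients)
    if not quota.reserve(needed):
        return 0
    sent = 0
    try:
        for recipient in recipients:
            if send(recipient):
                sent += 1
    finally:
        quota.release(needed - sent)
    return sent
```

Because the reservation happens before any send, two concurrent batches cannot jointly exceed the limit, and failed sends do not consume quota permanently.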

Long-running services should persist critical runtime state so recovery after a crash is predictable.
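
A minimal sketch of crash-safe persistence to a local file: write to a temporary file, flush it to disk, then atomically swap it into place. The JSON format and the save_state and load_state helpers are assumptions for illustration.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Write state so a crash leaves either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_state(path, default):
    """Return the persisted state, or the default on first start."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default
```

On restart the service calls load_state and resumes from the last fully written snapshot, never from a half-written file.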

If clients can set internal identity headers, microservice auth breaks. Strip client-supplied copies at the gateway and let only the gateway write the trusted values.
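
A sketch of that gateway step. The header names X-User-Id and X-User-Roles and the shape of authenticated_user are hypothetical; the point is that inbound copies are dropped before trusted values are written.

```python
# Hypothetical internal identity headers, compared case-insensitively.
TRUSTED_HEADERS = {"x-user-id", "x-user-roles"}


def gateway_headers(client_headers, authenticated_user):
    """Build the header set forwarded to internal services."""
    # Drop anything the client sent under a trusted name.
    forwarded = {
        name: value
        for name, value in client_headers.items()
        if name.lower() not in TRUSTED_HEADERS
    }
    # Write trusted values only from the gateway's own authentication result.
    if authenticated_user is not None:
        forwarded["X-User-Id"] = authenticated_user["id"]
        forwarded["X-User-Roles"] = ",".join(authenticated_user["roles"])
    return forwarded
```

A spoofed identity header from the client never reaches a downstream service, whether or not the request is authenticated.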

Registration and user-order APIs have different risk and traffic profiles; separate policies prevent both abuse gaps and accidental throttling.
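
Separate policies can be as simple as a table keyed by route. The route names, limits, and the fixed-window counter below are example assumptions; the numbers are not recommendations.

```python
from collections import defaultdict

# route -> (max requests, window in seconds); illustrative values only
POLICIES = {
    "register": (3, 3600),     # rare, abuse-prone: strict, keyed by IP
    "user_orders": (100, 60),  # frequent, authenticated: generous, keyed by user
}

_counters = defaultdict(int)  # (route, identity, window index) -> count


def allow(route, identity, now):
    limit, window = POLICIES[route]
    bucket = (route, identity, int(now // window))
    if _counters[bucket] >= limit:
        return False
    _counters[bucket] += 1
    return True
```

With one shared policy, the registration limit would throttle normal order traffic, or the order limit would leave registration open to abuse.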

When retries are exhausted or a circuit breaker is open, fail with a deterministic degraded response so callers get predictable behavior.
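
A minimal circuit breaker sketch that always returns the same fallback when it cannot serve a real result. The thresholds, the explicit clock argument, and the DEGRADED payload are assumptions for illustration.

```python
class CircuitBreaker:
    def __init__(self, failure_threshold, reset_after):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback, now):
        if self.opened_at is not None and now - self.opened_at < self.reset_after:
            return fallback  # open: skip the backend entirely
        # Closed, or half-open after the cool-down: try the backend once.
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = now
            return fallback
        self.failures = 0
        self.opened_at = None
        return result


# Hypothetical degraded payload: same shape every time, clearly marked.
DEGRADED = {"status": "degraded", "items": []}
```

Callers see either a real result or the identical DEGRADED payload, never a mix of timeouts and ad hoc errors.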