Architecture Patterns
Arranging the building blocks into a system that lasts
You now have every component: servers, databases, caches, load balancers, replicas. A pile of parts isn't an architecture, though. The last skill is arrangement: how to pull these pieces apart so teams can move independently, keep them from blocking each other, and still hand the outside world one coherent system. That's harder than it sounds.
This last group is about structure at the largest scale: when to split a monolith into microservices, how message queues let services stop waiting on each other, how rate limiting protects you from overload, and how an API gateway ties it all behind one door.
Covered here: microservices vs the monolith, message queues for async decoupling, rate limiting to survive abuse, and the API gateway as a single entry point. Every concept ends with a QA testing lens: how a tester would probe or break it.
Microservices — split the monolith (when it earns it)
As an app grows, a single codebase (a monolith) gets harder to maintain, deploy, and scale. Microservices break it into small, independent services — each owning one business capability and its own database.
- Monolith: One deploy, one database, easy local development and joins. Simple, until the codebase and the team both get too big to move quickly.
- Microservices: Each service deploys, scales, and fails independently, and different teams can own different services in different languages.
The upside is independence: the payments service getting hammered? Scale only it. A bug in inventory? Fix and ship without touching anything else. The downside is equally real. The trade-off in brief:
- Pro: Independent deploys, scaling, and team ownership
- Pro: A fault can be contained to one service
- Con: Distributed-systems complexity, because network calls replace function calls
- Con: No shared database means no easy joins; consistency across services is hard
- Con: Operational overhead from monitoring and debugging dozens of services
Default to a monolith. Move to microservices only when you have a concrete problem they solve — usually team scaling or wildly different scaling needs per component. Microservices for 200 users is the textbook anti-pattern.
"Microservices are more scalable than monoliths." A well-built monolith behind a load balancer scales horizontally just fine — run more copies. What microservices actually buy is independent scaling (scale only the payments service) and team scaling (twenty teams deploying without stepping on each other). If you don't have those two problems, you're paying the distributed-systems tax for benefits you can't use.
QA Lens Microservices shift the hard bugs into the gaps between services. Unit tests per service pass while the system is still broken end-to-end — so invest in contract tests (does Orders still send what Payments expects?) and a few full integration flows. Test the partial-failure case: Payments succeeds but Inventory is down — does the order end up in a coherent state, or stuck halfway?
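To make the contract-test idea concrete, here is a minimal sketch in Python. The contract fields (`order_id`, `amount_cents`, `currency`) and the `build_order_payload` serializer are hypothetical stand-ins for whatever Orders really sends; real projects often use a dedicated contract-testing tool rather than a hand-rolled check like this.

```python
# Hypothetical contract: the fields Payments expects on an order event.
PAYMENTS_CONTRACT = {"order_id": str, "amount_cents": int, "currency": str}


def build_order_payload(order_id: str, amount_cents: int, currency: str) -> dict:
    """Stand-in for the Orders service's real serializer (assumed shape)."""
    return {"order_id": order_id, "amount_cents": amount_cents, "currency": currency}


def contract_violations(payload: dict, contract: dict) -> list[str]:
    """Return human-readable problems; an empty list means the payload conforms."""
    problems = []
    for field, expected_type in contract.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    return problems


if __name__ == "__main__":
    # The producer's current output satisfies the consumer's expectations.
    print(contract_violations(build_order_payload("o-123", 4999, "USD"), PAYMENTS_CONTRACT))
    # A renamed field is exactly the kind of drift per-service unit tests miss.
    print(contract_violations({"order_id": "o-123", "amount": 4999, "currency": "USD"}, PAYMENTS_CONTRACT))
```

Run in the Orders build, a check like this fails the moment a field is renamed or retyped, before the broken payload ever reaches Payments.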
Message Queues — stop waiting on each other
Synchronous calls couple services tightly: if service A calls service B and B is slow or down, A suffers too. A message queue inserts a buffer between them — A drops a message and moves on; B processes it whenever it's ready.
When a user places an order, the Order Service shouldn't block while an email sends, inventory updates, and analytics records. It drops one message on the queue and returns instantly; each consumer works at its own pace. If the email service is down for a minute, messages wait in the queue and get processed when it recovers.
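The order flow above can be sketched with Python's standard `queue` module standing in for a real broker. This is only an in-process illustration: the function names and the event shape are assumptions, and a production system would publish to Kafka, RabbitMQ, or SQS instead.

```python
import queue
import threading


def place_order(events: queue.Queue, order_id: str) -> str:
    """Publish one event and return immediately, without waiting on any consumer."""
    events.put({"type": "order_placed", "order_id": order_id})
    return "order placed"


def email_consumer(events: queue.Queue, sent: list) -> None:
    """Drain events at the consumer's own pace until a None sentinel arrives."""
    while True:
        event = events.get()
        if event is None:
            return
        sent.append(event["order_id"])


def run_demo() -> list:
    events = queue.Queue()
    sent = []
    # Orders are accepted while the email consumer is still "down".
    place_order(events, "o-1")
    place_order(events, "o-2")
    # The consumer comes up later and works through the backlog in order.
    worker = threading.Thread(target=email_consumer, args=(events, sent))
    worker.start()
    events.put(None)  # sentinel so the demo terminates
    worker.join()
    return sent


if __name__ == "__main__":
    print(run_demo())  # ['o-1', 'o-2']
```

The key property is that `place_order` never touches the email logic: its latency depends only on the cost of enqueueing a message.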
The usual suspects: Kafka (high-throughput event streaming), RabbitMQ (a classic message broker), and SQS (managed by AWS).
QA Lens Async means your test can't just assert on the immediate response — the work finishes later. Test the eventual outcome, then the failure modes the queue exposes: a poison message that crashes the consumer on every retry, duplicate delivery (most queues are at-least-once, so consumers must be idempotent), and a consumer falling behind until the backlog grows unbounded.
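Two of those failure modes, duplicate delivery and poison messages, can be handled with a small amount of consumer-side logic. The sketch below assumes each message carries a unique `id` and uses a retry budget of three; both are illustrative choices, not a property of any particular broker.

```python
MAX_ATTEMPTS = 3  # assumed retry budget before a message is parked


def consume(messages, handler, processed_ids, dead_letters):
    """Handle each message once, and park messages that keep failing."""
    for msg in messages:
        if msg["id"] in processed_ids:
            continue  # duplicate delivery: already handled, skip silently
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                handler(msg)
            except Exception:
                if attempt == MAX_ATTEMPTS:
                    dead_letters.append(msg)  # poison message: stop retrying
            else:
                processed_ids.add(msg["id"])
                break


sent = []


def send_email(msg):
    sent.append(msg["to"])  # raises KeyError when "to" is missing


deliveries = [
    {"id": "m1", "to": "a@example.com"},
    {"id": "m1", "to": "a@example.com"},  # redelivered duplicate
    {"id": "m2"},                          # poison: fails on every attempt
    {"id": "m3", "to": "b@example.com"},
]
processed, dead = set(), []
consume(deliveries, send_email, processed, dead)
print(sent)  # ['a@example.com', 'b@example.com']
print(dead)  # [{'id': 'm2'}]
```

Note that the poison message does not block `m3`: it is set aside in a dead-letter list for inspection. In a real system the processed-ID set would need to live in durable storage, and recording the ID should be atomic with the side effect where possible.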
Rate Limiting — protect yourself from too much love
Whether your API is public or internal, you need a cap on how many requests a client can make in a window. Rate limiting prevents abuse, shields expensive operations, and enforces fair usage.
It usually lives at the API gateway, backed by a fast store like Redis to track per-client counts. A few algorithms you should be able to name:
| Algorithm | How it works |
|---|---|
| Token Bucket | Tokens refill at a fixed rate; each request spends one. Allows bursts. |
| Sliding Window | Count requests in a rolling time window. Smooth and accurate. |
| Fixed Window | Count per discrete interval (e.g. per minute). Simple, but a client can burst up to twice the limit across a window boundary. |
Why it matters: it blunts request floods and brute-force attempts (large volumetric DDoS attacks still need network-level defenses), protects costly queries, enforces usage tiers (free users get 100 req/min, paid get 10,000), and stops one misbehaving client from ruining everyone's day.
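As an illustration of the token bucket row, here is a minimal single-process sketch. The class name, parameters, and injectable `clock` are assumptions made for the example; a production limiter would keep this state in a shared store rather than in local memory.

```python
import time


class TokenBucket:
    """Tokens refill at a fixed rate; each allowed request spends one."""

    def __init__(self, capacity: int, refill_per_sec: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock          # injectable so tests can control time
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill for the time elapsed, never exceeding the bucket's capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    fake_now = [0.0]
    bucket = TokenBucket(capacity=3, refill_per_sec=1.0, clock=lambda: fake_now[0])
    print([bucket.allow() for _ in range(4)])  # [True, True, True, False]
    fake_now[0] = 2.0                          # two seconds pass, two tokens return
    print([bucket.allow() for _ in range(3)])  # [True, True, False]
```

The capacity sets how large a burst is tolerated, while the refill rate sets the sustained throughput; tuning them separately is what distinguishes this from a plain counter.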
QA Lens Test both sides of the limit: the request at N succeeds, the one at N+1 gets a clean 429 (with a Retry-After header, ideally), not a 500 or a hang. In a horizontally scaled setup, verify the counter is shared across instances; a per-server limiter silently grants N × servers requests, defeating the whole purpose.
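The shared-counter check can be simulated without any infrastructure. In this sketch, `InMemorySharedStore` is a hypothetical stand-in for a store such as Redis, and `AppServer` applies a fixed-window limit through whichever store it is handed; the names and the limit of 5 are illustrative only.

```python
class InMemorySharedStore:
    """Stand-in for a shared store such as Redis; only an increment is modelled."""

    def __init__(self) -> None:
        self.counts: dict[str, int] = {}

    def incr(self, key: str) -> int:
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]


class AppServer:
    """One app instance enforcing a fixed-window limit via the store it is given."""

    def __init__(self, store: InMemorySharedStore, limit: int, window_secs: int = 60) -> None:
        self.store = store
        self.limit = limit
        self.window_secs = window_secs

    def handle(self, client_id: str, now: float) -> tuple[int, dict[str, str]]:
        window = int(now // self.window_secs)
        count = self.store.incr(f"{client_id}:{window}")
        if count > self.limit:
            retry_after = self.window_secs - int(now % self.window_secs)
            return 429, {"Retry-After": str(retry_after)}
        return 200, {}


def count_allowed(servers: list[AppServer], requests: int) -> int:
    """Round-robin requests from one client across servers; count the 200s."""
    allowed = 0
    for i in range(requests):
        status, _ = servers[i % len(servers)].handle("client-a", now=0.0)
        if status == 200:
            allowed += 1
    return allowed


shared = InMemorySharedStore()
print(count_allowed([AppServer(shared, limit=5) for _ in range(2)], 20))                 # 5
print(count_allowed([AppServer(InMemorySharedStore(), limit=5) for _ in range(2)], 20))  # 10
```

With one shared store the client gets exactly the limit; with a store per server it gets the limit multiplied by the number of servers, which is the bug the test is meant to expose.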
API Gateway — one front door for everything
Without a gateway, every client must know the address of every service, and every service must re-implement auth, rate limiting, and logging. An API gateway is a single entry point that handles those cross-cutting concerns once and routes each request to the right service.
The gateway centralizes what would otherwise be duplicated everywhere, and simplifies the client: instead of juggling twenty service addresses, it talks to one URL and the gateway routes by path. Beyond routing, gateways often add:
- Authentication and rate limiting in one place
- Request/response transformation between protocols or formats
- Response aggregation: combine several services into one reply
- Caching and circuit breaking for failing services
Common options: Kong, AWS API Gateway, and Nginx.
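Path-based routing with authentication at the edge can be sketched in a few lines. The route table, service names, and token set below are hypothetical, and this is not how any of the products above is configured; it only shows the order of operations a gateway applies.

```python
# Hypothetical route table: path prefix -> upstream service name.
ROUTES = {
    "/orders": "orders-service",
    "/payments": "payments-service",
}
VALID_TOKENS = {"secret-token"}  # assumed stand-in for real token validation


def route(path: str, token):
    """Return (status, upstream). Auth runs before routing, for every path."""
    if token not in VALID_TOKENS:
        return 401, None
    for prefix, upstream in ROUTES.items():
        # Match the prefix exactly or as a path segment, so "/ordersX" does not match.
        if path == prefix or path.startswith(prefix + "/"):
            return 200, upstream
    return 404, None


if __name__ == "__main__":
    print(route("/orders/42", "secret-token"))  # (200, 'orders-service')
    print(route("/orders/42", None))            # (401, None)
    print(route("/unknown", "secret-token"))    # (404, None)
```

Because the auth check sits above the routing loop, no route can be added that skips it, which is the property a tester should verify for every path.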
A reverse proxy forwards and balances traffic. An API gateway is a reverse proxy plus application-aware concerns — auth, rate limiting, aggregation, transformation. Every gateway is a reverse proxy; not every reverse proxy is a gateway.
QA Lens The gateway is a single point that touches every request — so it's a single point of failure too. Test that auth is enforced at the edge for every route (one unprotected path leaks the whole backend). Verify the gateway itself is highly available, and that circuit breaking works: when one downstream service dies, the gateway fails fast instead of letting requests pile up and drag everything down.
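The fail-fast behaviour can be illustrated with a small circuit breaker. This is a simplified sketch under assumed defaults (three consecutive failures to open, a 30-second cool-down, one trial call afterwards); real gateways expose their own configuration for this.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without contacting the service."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_secs=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_secs = reset_after_secs
        self.clock = clock        # injectable so tests can control time
        self.failures = 0
        self.opened_at = None     # None means the circuit is closed

    def call(self, fn, *args):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_secs:
                raise CircuitOpenError("downstream unavailable, failing fast")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A test for this wraps a downstream call that always raises, confirms that the call after the threshold raises `CircuitOpenError` without invoking the downstream function, then advances the clock and confirms that a successful trial call closes the circuit again.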
The Whole Picture
These concepts aren't a checklist to bolt on; they're a vocabulary for reasoning about trade-offs. A mature design borrows from every group.
Test Yourself
Answer from memory first, then check against the answer below each question.
Q1. Every unit test in every microservice passes, yet placing an order fails in production. Where do the bugs live, and what kind of test catches them?
In the gaps between services — Orders sending a payload Payments no longer expects, or a partial failure leaving an order half-completed. Contract tests pin the inter-service payloads; a few end-to-end integration flows catch the rest. Per-service tests can't see either.
Q2. An order triggers an email, an inventory update, and an analytics event. Why should none of these block the order response — and what makes that safe?
The user is waiting on "order placed," not on the email. The Order Service publishes one message to a queue and returns; each consumer processes at its own pace, and messages wait out a consumer outage. The safety requirement: most queues deliver at least once, so every consumer must be idempotent or a redelivered message will send a duplicate confirmation email.
Q3. Your rate limiter allows 100 req/min per client, and you run 8 app servers. A client is getting 800 requests through. What happened?
Each server is counting independently — the limit is per-instance, not shared. The counter must live in a shared store (Redis) so all instances see the same count. This is also why rate limiting usually lives at the gateway, where there's one choke point.
Q4. What does an API gateway do that a plain reverse proxy doesn't — and what risk does centralizing all that create?
A reverse proxy forwards and balances traffic; a gateway adds application-aware concerns — auth, rate limiting, request transformation, response aggregation. The risk: it touches every request, so it's a single point of failure (make it highly available) and a single point of bypass — one route that skips auth at the gateway exposes the whole backend.
Quick Revision
- Microservices: Independent services, own databases. Default to a monolith; split only when it earns it.
- Message Queues: Async decoupling + buffering + reliability. Consumers must be idempotent. Kafka, RabbitMQ, SQS.
- Rate Limiting: Token bucket / sliding window at the gateway. Shared counter across instances. Returns 429.
- API Gateway: One entry point: auth, routing, rate limiting, aggregation. A reverse proxy that knows your app.
"Clients hit one API gateway for auth and rate limiting. It routes to services that talk async over a queue, so a slow email never blocks an order. I'd keep it a modular monolith until team or scaling pressure justifies real microservices." That's senior-level restraint.
Where to Next
You've walked every core concept. Time to use them all at once:
- Capstone: Design a URL Shortener — apply the full toolkit to one real design, end to end
- The Cheat Sheet — every concept on one printable page
- What is System Design? — the trade-off mindset behind every choice