Architecture Patterns

Arranging the building blocks into a system that lasts

You now have every component: servers, databases, caches, load balancers, replicas. A pile of parts isn't an architecture, though. The last skill is arrangement: how to pull these pieces apart so teams can move independently, keep them from blocking each other, and still hand the outside world one coherent system. That's harder than it sounds.

This last group is about structure at the largest scale: when to split a monolith into microservices, how message queues let services stop waiting on each other, how rate limiting protects you from overload, and how an API gateway ties it all behind one door.

What this page covers

Microservices vs the monolith, message queues for async decoupling, rate limiting to survive abuse, and the API gateway as a single entry point. Every concept ends with a QA testing lens: how a tester would probe or break it.

Microservices — split the monolith (when it earns it)

As an app grows, a single codebase (a monolith) gets harder to maintain, deploy, and scale. Microservices break it into small, independent services — each owning one business capability and its own database.

Monolith — one codebase, one database

One deploy, one database, easy local development and joins. Simple — until the codebase and the team both get too big to move quickly.

The upside of microservices is independence. Is the payments service getting hammered? Scale only it. A bug in inventory? Fix and ship without touching anything else. The downsides are equally real:

Pro: independent deploys, scaling, and team ownership
Pro: a fault can be contained to one service
Con: distributed-systems complexity, because network calls replace function calls
Con: no shared database means no easy joins; consistency across services is hard
Con: operational overhead from monitoring and debugging dozens of services

Don't start here

Default to a monolith. Move to microservices only when you have a concrete problem they solve — usually team scaling or wildly different scaling needs per component. Microservices for 200 users is the textbook anti-pattern.

Common misconception

"Microservices are more scalable than monoliths." A well-built monolith behind a load balancer scales horizontally just fine — run more copies. What microservices actually buy is independent scaling (scale only the payments service) and team scaling (twenty teams deploying without stepping on each other). If you don't have those two problems, you're paying the distributed-systems tax for benefits you can't use.

QA Lens: Microservices shift the hard bugs into the gaps between services. Unit tests per service pass while the system is still broken end-to-end, so invest in contract tests (does Orders still send what Payments expects?) and a few full integration flows. Test the partial-failure case: Payments succeeds but Inventory is down. Does the order end up in a coherent state, or stuck halfway?
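
A contract test can be as small as a shared description of the fields the consumer needs, checked against what the producer actually builds. The sketch below is illustrative only: `PAYMENT_CONTRACT`, `build_payment_request`, and every field name are hypothetical, not taken from any real service.

```python
# Hypothetical contract: the fields (and types) the Payments service expects.
PAYMENT_CONTRACT = {"order_id": str, "amount_cents": int, "currency": str}


def build_payment_request(order: dict) -> dict:
    """Producer side (Orders): build the payload that would be sent to Payments."""
    return {
        "order_id": order["id"],
        "amount_cents": order["total_cents"],
        "currency": order.get("currency", "USD"),
    }


def contract_violations(payload: dict, contract: dict) -> list[str]:
    """Return readable violations; an empty list means the payload conforms."""
    problems = []
    for field, expected_type in contract.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(f"wrong type for {field}: {type(payload[field]).__name__}")
    return problems
```

Run in the Orders build, a check like this fails as soon as a renamed or retyped field would break Payments, without needing both services up at once.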

Message Queues — stop waiting on each other

Synchronous calls couple services tightly: if service A calls service B and B is slow or down, A suffers too. A message queue inserts a buffer between them — A drops a message and moves on; B processes it whenever it's ready.

A queue decouples producer from consumers

The producer publishes once and returns instantly; each consumer drains at its own pace, even after an outage.

When a user places an order, the Order Service shouldn't block while an email sends, inventory updates, and analytics records. It drops one message on the queue and returns instantly; each consumer works at its own pace. If the email service is down for a minute, messages wait in the queue and get processed when it recovers.
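
The flow above can be sketched in-process with Python's standard `queue` and `threading` modules standing in for a real broker. This is a simplification under stated assumptions: `place_order` and `run_consumer` are hypothetical names, there is one consumer rather than three, and an in-memory queue does not survive a process crash the way a broker's durable queue would.

```python
import queue
import threading


def place_order(order_id: str, outbox: queue.Queue) -> str:
    """Producer: publish one event and return without waiting for any consumer."""
    outbox.put({"type": "order_placed", "order_id": order_id})
    return "order placed"


def run_consumer(inbox: queue.Queue, handled: list) -> None:
    """Consumer: drain messages at its own pace until a None sentinel arrives."""
    while True:
        message = inbox.get()
        if message is None:
            break
        handled.append(message["order_id"])  # stand-in for sending the email


email_queue = queue.Queue()

# The email consumer is "down": orders are still accepted immediately.
statuses = [place_order(f"o-{i}", email_queue) for i in range(3)]

# The consumer recovers and works through the backlog in order.
handled = []
worker = threading.Thread(target=run_consumer, args=(email_queue, handled))
worker.start()
email_queue.put(None)  # sentinel: stop after the backlog is drained
worker.join()
```

The producer never learns whether the consumer was running, which is the decoupling the queue exists to provide.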

What a queue buys you

Decoupling

Producers and consumers don’t need to know about each other.

Buffering

Absorb traffic spikes — the queue soaks up bursts.

Reliability

Messages persist even if a consumer crashes mid-work.

Scalability

Add more consumers to drain the queue faster.

The usual suspects: Kafka (high-throughput event streaming), RabbitMQ (a classic message broker), and SQS (managed by AWS).

QA Lens: Async means your test can't just assert on the immediate response, because the work finishes later. Test the eventual outcome, then the failure modes the queue exposes: a poison message that crashes the consumer on every retry, duplicate delivery (most queues are at-least-once, so consumers must be idempotent), and a consumer falling behind until the backlog grows unbounded.
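
Two of those failure modes, duplicate delivery and poison messages, can be handled in the consumer itself. The sketch below is one possible shape, not a prescribed design: `IdempotentConsumer` is a hypothetical class, the message `id` field is an assumption, and the in-memory set would need to be a durable store in a real system.

```python
class IdempotentConsumer:
    """Processes each message id at most once and parks messages that keep failing."""

    def __init__(self, handler, max_attempts: int = 3):
        self.handler = handler
        self.max_attempts = max_attempts
        self.seen_ids = set()    # ids already handled (or given up on)
        self.attempts = {}       # failure count per message id
        self.dead_letters = []   # poison messages set aside for inspection

    def receive(self, message: dict) -> str:
        msg_id = message["id"]
        if msg_id in self.seen_ids:
            return "duplicate"   # redelivery: skip the side effect
        try:
            self.handler(message)
        except Exception:
            self.attempts[msg_id] = self.attempts.get(msg_id, 0) + 1
            if self.attempts[msg_id] >= self.max_attempts:
                self.dead_letters.append(message)
                self.seen_ids.add(msg_id)  # stop retrying this message
                return "dead-lettered"
            return "retry"
        self.seen_ids.add(msg_id)
        return "processed"
```

A test can then deliver the same message twice and assert the side effect happened once, or deliver a malformed message repeatedly and assert it ends up in `dead_letters` instead of blocking the queue.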

Rate Limiting — protect yourself from too much love

Whether your API is public or internal, you need a cap on how many requests a client can make in a window. Rate limiting prevents abuse, shields expensive operations, and enforces fair usage.

Under the limit, or rejected

Over-limit requests get a clean 429 (Too Many Requests) and never reach the backend at all.

It usually lives at the API gateway, backed by a fast store like Redis to track per-client counts. A few algorithms you should be able to name:

Algorithm	How it works
Token Bucket	Tokens refill at a fixed rate; each request spends one. Allows bursts.
Sliding Window	Count requests in a rolling time window. Smooth and accurate.
Fixed Window	Count per discrete interval (per minute). Simple, but bursty at edges.
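
As an illustration of the first row, here is a minimal single-process token bucket. It is a sketch under assumptions: the class and parameter names are invented for this example, and the injectable `clock` exists only so the behaviour can be tested without sleeping.

```python
import time


class TokenBucket:
    """Tokens refill continuously at a fixed rate; each request spends one."""

    def __init__(self, capacity: int, refill_per_sec: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = float(capacity)  # start full, so an initial burst is allowed
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Add the tokens earned since the last call, capped at the bucket size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The capacity sets how large a burst is tolerated, while the refill rate sets the sustained request rate; a real deployment would keep one bucket per client.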

Why it matters: it blunts DDoS attacks, protects costly queries, enforces usage tiers (free users get 100 req/min, paid get 10,000), and stops one misbehaving client from ruining everyone's day.

QA Lens: Test both sides of the limit: the request at N succeeds, and the one at N+1 gets a clean 429 (ideally with a Retry-After header), not a 500 or a hang. In a horizontally scaled setup, verify the counter is shared across instances; a per-server limiter silently grants N × servers requests, defeating the whole purpose.
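
The per-server pitfall is easy to reproduce in a few lines. In this sketch a plain dict stands in for the shared store; `FixedWindowLimiter` and `admitted` are hypothetical names, and the read-then-write on the dict is not atomic, so a real shared store would need an atomic increment (Redis `INCR`, for example).

```python
class FixedWindowLimiter:
    """Fixed-window counter; `store` maps (client, window index) to a count."""

    def __init__(self, limit: int, window_sec: int, store: dict):
        self.limit = limit
        self.window_sec = window_sec
        self.store = store  # pass the same dict to every instance to share counts

    def allow(self, client: str, now: float) -> bool:
        key = (client, int(now // self.window_sec))
        count = self.store.get(key, 0)
        if count >= self.limit:
            return False
        self.store[key] = count + 1
        return True


def admitted(limiters: list, requests: int) -> int:
    """Spread `requests` calls from one client round-robin across the servers."""
    return sum(
        limiters[i % len(limiters)].allow("client-a", now=0.0) for i in range(requests)
    )


# Eight servers, each with a private counter: the client gets 8 x 100 through.
per_server = [FixedWindowLimiter(100, 60, {}) for _ in range(8)]
leaked = admitted(per_server, 1000)

# Eight servers sharing one counter: the client is held to 100.
shared_store = {}
shared = [FixedWindowLimiter(100, 60, shared_store) for _ in range(8)]
capped = admitted(shared, 1000)
```

Here `leaked` comes out at 800 and `capped` at 100, which is the difference a test against a scaled-out deployment should be looking for.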

API Gateway — one front door for everything

Without a gateway, every client must know the address of every service, and every service must re-implement auth, rate limiting, and logging. An API gateway is a single entry point that handles those cross-cutting concerns once and routes each request to the right service.

One front door for every client

Every client talks to one address; auth, rate limiting, and routing happen once instead of in every service.

The gateway centralizes what would otherwise be duplicated everywhere, and simplifies the client: instead of juggling twenty service addresses, it talks to one URL and the gateway routes by path. Beyond routing, gateways often add:

Authentication and rate limiting in one place
Request/response transformation between protocols or formats
Response aggregation — combine several services into one reply
Caching and circuit breaking for failing services
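
The core of that job, checking auth once and then routing by path, fits in a short function. This is a toy sketch, not how any particular gateway product is configured: `ROUTES`, `VALID_TOKENS`, the upstream addresses, and `handle` are all hypothetical.

```python
from typing import Optional

# Hypothetical path prefixes and upstream addresses.
ROUTES = {
    "/orders": "http://orders.internal",
    "/payments": "http://payments.internal",
}
VALID_TOKENS = {"secret-token"}  # stand-in for a real auth check


def handle(path: str, token: Optional[str]) -> tuple[int, str]:
    """Return (status, upstream URL or error text) for an incoming request."""
    if token not in VALID_TOKENS:
        return 401, "unauthorized"  # auth is enforced once, before any routing
    for prefix, upstream in ROUTES.items():
        # Match the prefix exactly or as a full path segment, so that
        # "/ordersX" is not mistaken for "/orders".
        if path == prefix or path.startswith(prefix + "/"):
            return 200, upstream + path
    return 404, "no route"
```

Because the auth check sits ahead of the routing loop, no route can be added to the table without inheriting it, which is the property the gateway is meant to guarantee.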

Common options: Kong, AWS API Gateway, and Nginx.

Gateway vs reverse proxy

A reverse proxy forwards and balances traffic. An API gateway is a reverse proxy plus application-aware concerns — auth, rate limiting, aggregation, transformation. Every gateway is a reverse proxy; not every reverse proxy is a gateway.

QA Lens: The gateway is a single point that touches every request, so it's a single point of failure too. Test that auth is enforced at the edge for every route (one unprotected path leaks the whole backend). Verify the gateway itself is highly available, and that circuit breaking works: when one downstream service dies, the gateway fails fast instead of letting requests pile up and drag everything down.
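
Failing fast is what a circuit breaker does: after enough consecutive failures it stops calling the downstream service for a while. The sketch below is a minimal version with assumed names and thresholds (`CircuitBreaker`, `failure_threshold`, `reset_after_sec`); production libraries add more states and metrics.

```python
import time


class CircuitBreaker:
    """Opens after repeated failures, then rejects calls until a cool-down passes."""

    def __init__(self, failure_threshold=3, reset_after_sec=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_sec = reset_after_sec
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_sec:
                # Open: reject immediately without touching the downstream service.
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # cool-down over: let one trial call through
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit fully
        return result
```

A test for the gateway can kill a downstream service, confirm the first few requests fail slowly, and then confirm later requests are rejected immediately instead of queuing behind a timeout.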

The Whole Picture

These concepts aren't a checklist to bolt on — they're a vocabulary for reasoning about trade-offs. A mature design borrows from every group:

Front door: gateway + LB
Decouple: queues
Split: services
Store + scale: data layer
Protect: limits + idempotency

Each step adds complexity only where real pressure earns it.

Test Yourself

Answer from memory first, then expand to check.

Q1. Every unit test in every microservice passes, yet placing an order fails in production. Where do the bugs live, and what kind of test catches them?

In the gaps between services — Orders sending a payload Payments no longer expects, or a partial failure leaving an order half-completed. Contract tests pin the inter-service payloads; a few end-to-end integration flows catch the rest. Per-service tests can't see either.

Q2. An order triggers an email, an inventory update, and an analytics event. Why should none of these block the order response — and what makes that safe?

The user is waiting on "order placed," not on the email. The Order Service publishes one message to a queue and returns; each consumer processes at its own pace, and messages wait out a consumer outage. The safety requirement: most queues deliver at least once, so every consumer must be idempotent or you'll send three confirmation emails.

Q3. Your rate limiter allows 100 req/min per client, and you run 8 app servers. A client is getting 800 requests through. What happened?

Each server is counting independently — the limit is per-instance, not shared. The counter must live in a shared store (Redis) so all instances see the same count. This is also why rate limiting usually lives at the gateway, where there's one choke point.

Q4. What does an API gateway do that a plain reverse proxy doesn't — and what risk does centralizing all that create?

A reverse proxy forwards and balances traffic; a gateway adds application-aware concerns — auth, rate limiting, request transformation, response aggregation. The risk: it touches every request, so it's a single point of failure (make it highly available) and a single point of bypass — one route that skips auth at the gateway exposes the whole backend.

Quick Revision

Microservices

Independent services, own databases. Default to a monolith; split only when it earns it.

Message Queues

Async decoupling + buffering + reliability. Consumers must be idempotent. Kafka, RabbitMQ, SQS.

Rate Limiting

Token bucket / sliding window at the gateway. Shared counter across instances. Returns 429.

API Gateway

One entry point: auth, routing, rate limiting, aggregation. A reverse proxy that knows your app.

In an interview

"Clients hit one API gateway for auth and rate limiting. It routes to services that talk async over a queue, so a slow email never blocks an order. I'd keep it a modular monolith until team or scaling pressure justifies real microservices." That's senior-level restraint.

Where to Next

You've walked every core concept. Time to use them all at once:

Capstone: Design a URL Shortener — apply the full toolkit to one real design, end to end
The Cheat Sheet — every concept on one printable page
What is System Design? — the trade-off mindset behind every choice
