
APIs & Communication

The contracts that let two programs trust each other

A frontend engineer and a backend engineer have never spoken. They sit in different time zones, ship on different schedules, and write in different languages. Yet their code has to interlock perfectly on launch day. The only thing holding them together is a contract — an API.

The networking layer got bytes from one machine to another. This layer decides what those bytes mean. We'll walk the four styles you'll actually argue about in a design review (REST, GraphQL, WebSockets, Webhooks) and, just as important, when to reach for each.

What this page covers

APIs as a contract, then the four styles you'll defend in any design review: REST, GraphQL, WebSockets, and Webhooks. Every concept ends with a QA testing lens: how a tester would probe or break it.


APIs — the menu, not the kitchen

An API (Application Programming Interface) is a contract between two pieces of software. It declares what you can ask for, how to ask, and what comes back — and deliberately hides everything else.

A request through the API layer
[Diagram: a JSON request flows Client App → API Layer (endpoints + auth) → Validation → Business Logic → Database, and a JSON response returns along the same path.]
The client only ever sees the contract at the API layer; validation, logic, and storage stay hidden.
The restaurant analogy

You don't walk into the kitchen and cook. You read the menu (the API docs), place an order (the request), and your food arrives (the response). You neither know nor care whether the kitchen uses a gas range or induction — that's an implementation detail behind the contract. A good API is a good menu: complete, unambiguous, and stable enough that you can reorder tomorrow and get the same dish.

The contract is what makes large systems buildable by many teams at once. Change the kitchen all you like; as long as the menu holds, every diner is fine.
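To make "change the kitchen, keep the menu" concrete, here is a minimal sketch. Every name in it (`UserStore`, `InMemoryStore`, `TupleStore`, `get_user`) is hypothetical and invented for illustration; the point is only that two different storage implementations sit behind one unchanged response shape.

```python
from typing import Optional, Protocol


class UserStore(Protocol):
    """The hidden 'kitchen': any storage that can look up a user by id."""

    def find(self, user_id: int) -> Optional[dict]: ...


class InMemoryStore:
    """Kitchen A: users kept in a dict keyed by id."""

    def __init__(self, rows: dict) -> None:
        self._rows = rows

    def find(self, user_id: int) -> Optional[dict]:
        return self._rows.get(user_id)


class TupleStore:
    """Kitchen B: the same data kept as (id, name) tuples."""

    def __init__(self, rows: list) -> None:
        self._rows = rows

    def find(self, user_id: int) -> Optional[dict]:
        for row_id, name in self._rows:
            if row_id == user_id:
                return {"id": row_id, "name": name}
        return None


def get_user(store: UserStore, user_id: int) -> tuple:
    """The 'menu': always answers with (status, body) in the same shape."""
    user = store.find(user_id)
    if user is None:
        return 404, {"error": "user not found"}
    # Only the documented fields leave the API layer; internals stay hidden.
    return 200, {"id": user["id"], "name": user["name"]}
```

Swapping `InMemoryStore` for `TupleStore` changes nothing the caller can observe, which is the property that lets teams work behind the contract independently.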

QA Lens If the API is a contract, then the contract — not the implementation — is what you test. Generate tests from the spec (OpenAPI/Swagger) so documentation and behavior can't drift apart: a field documented as required but accepted as missing is a bug even though nothing crashes. And probe what the menu doesn't offer — undocumented endpoints and verbs that "work anyway" are unowned attack surface.
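A rough sketch of what "test the contract" can look like. `USER_CREATE_SPEC` is a hand-written, hypothetical stand-in for a schema you would normally load from an OpenAPI document, and `contract_violations` is an illustrative helper, not part of any real tool. A real project would more likely use an existing schema validator.

```python
# Hypothetical fragment standing in for a schema taken from an OpenAPI spec.
USER_CREATE_SPEC = {
    "required": ["email", "name"],
    "properties": {"email": str, "name": str, "age": int},
}


def contract_violations(payload: dict, spec: dict) -> list:
    """Return every way `payload` departs from `spec` (empty list = conforms)."""
    problems = []
    for field in spec["required"]:
        if field not in payload:
            problems.append(f"missing required field: {field}")
    for field, value in payload.items():
        expected = spec["properties"].get(field)
        if expected is None:
            # Fields the menu does not list are exactly what a tester should probe.
            problems.append(f"undocumented field: {field}")
        elif not isinstance(value, expected):
            problems.append(f"wrong type for {field}: expected {expected.__name__}")
    return problems
```

A contract test then sends each violating payload to the real endpoint and fails if the API accepts it: that is how "documented as required but accepted as missing" gets caught.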


REST — resources and verbs

REST (Representational State Transfer) is the default API style of the web. Its core trick: treat everything as a resource with a URL, and use standard HTTP verbs to act on it. It maps almost one-to-one onto CRUD.

You want to…         HTTP method    Example
List a collection    GET            GET /users
Read one item        GET            GET /users/123
Create               POST           POST /users
Replace / update     PUT / PATCH    PUT /users/123
Remove               DELETE         DELETE /users/123

The three principles worth memorizing: REST is stateless (every request carries what the server needs, so there is no per-client session on the server), cacheable (responses say whether they can be reused), and offers a uniform interface (the same verbs and predictable URL patterns for every resource). That predictability is REST's superpower, and the root of its one real weakness.
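The verb-to-CRUD mapping in the table can be sketched as a toy in-memory handler. This is not a real HTTP server: `UsersResource` and its `handle` method are hypothetical names, and the status codes chosen (201 on create, 204 on delete, 405 for an unsupported verb) are one reasonable convention rather than the only one.

```python
from typing import Optional


class UsersResource:
    """Toy in-memory /users collection showing how verbs map onto CRUD."""

    def __init__(self) -> None:
        self._users = {}
        self._next_id = 1

    def handle(self, method: str, path: str, body: Optional[dict] = None) -> tuple:
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "users" or len(parts) > 2:
            return 404, None
        if len(parts) == 1:  # the collection: /users
            if method == "GET":
                return 200, list(self._users.values())
            if method == "POST":
                user = {**(body or {}), "id": self._next_id}
                self._users[user["id"]] = user
                self._next_id += 1
                return 201, user
            return 405, None
        if not parts[1].isdigit():
            return 404, None
        user_id = int(parts[1])  # a single item: /users/<id>
        if user_id not in self._users:
            return 404, None
        if method == "GET":
            return 200, self._users[user_id]
        if method == "PUT":
            # PUT replaces the whole representation, keeping the id from the URL.
            self._users[user_id] = {**(body or {}), "id": user_id}
            return 200, self._users[user_id]
        if method == "DELETE":
            del self._users[user_id]
            return 204, None
        return 405, None
```

Note that the handler keeps no per-client session: everything it needs arrives in the method, path, and body, which is the statelessness described above.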

REST's pain point

If a screen needs a user and their posts and their followers, that's often three round trips to three endpoints. On a slow mobile network, three sequential requests are three chances to feel slow. This is exactly the gap GraphQL was built to close.

QA Lens REST's predictability makes it a joy to test, but verify the verbs mean what they say. GET must be safe (it never mutates state). PUT and DELETE must be idempotent (repeat the call and the server ends up in the same state, even if the status code differs). A DELETE that errors on the second call because the row is "already gone" is a bug: it should return 204 or 404, not 500.
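The idempotency check can be sketched as a small test helper. All three functions are hypothetical and operate on a plain dict rather than a real API; `buggy_delete_user` is included as the anti-example the QA lens describes.

```python
def delete_user(users: dict, user_id: int) -> int:
    """Idempotent DELETE: a missing row is the desired end state, not an error."""
    if user_id in users:
        del users[user_id]
        return 204
    return 404  # 204 would also be acceptable; 500 would not


def buggy_delete_user(users: dict, user_id: int) -> int:
    """Anti-example: treats 'already gone' as a server failure."""
    try:
        del users[user_id]
        return 204
    except KeyError:
        return 500


def is_idempotent_delete(delete, users: dict, user_id: int) -> bool:
    """Call `delete` twice; require no 5xx and no state change after the first call."""
    first = delete(users, user_id)
    state_after_first = dict(users)
    second = delete(users, user_id)
    return first < 500 and second < 500 and users == state_after_first
```

Against a real service the same idea applies: issue the request twice, compare the resource state after each call, and treat any 5xx on the repeat as a defect.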


GraphQL — ask for exactly what you need

GraphQL, born at Facebook, flips the model: instead of many fixed endpoints, there's one endpoint and the client describes the exact shape of data it wants. The server resolves it and returns precisely that — no more, no less.

GraphQL: one query, many resolvers
[Diagram: the Client sends a single query to the GraphQL Server, which fans out to the Users Service, Posts Service, and Comments Service and returns exactly the requested fields.]
One client query fans out to three services, yet only the requested fields travel back over the wire.

Where REST might need /users/123 and /users/123/posts, GraphQL folds both into one query. That kills over-fetching (downloading fields you ignore) and under-fetching (needing more calls). But nothing is free:
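The "exact shape" idea can be imitated in a few lines. To be clear about the assumptions: this is not GraphQL syntax and not a GraphQL library. `select` is a hypothetical helper that trims an already-loaded record down to a requested shape, whereas a real GraphQL server parses a query and calls resolvers. `USER` is made-up sample data.

```python
# Made-up record standing in for what /users/123 plus /users/123/posts would return.
USER = {
    "id": 123,
    "name": "Ada",
    "email": "ada@example.com",
    "posts": [
        {"id": 1, "title": "Hello", "body": "first post"},
        {"id": 2, "title": "Contracts", "body": "second post"},
    ],
}


def select(data, selection: dict):
    """Return only the fields named in `selection`.

    `selection` maps a field name to True (a leaf) or to a nested selection.
    Lists are handled by applying the same selection to every item.
    """
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub in selection.items():
        value = data[field]
        result[field] = value if sub is True else select(value, sub)
    return result
```

Asking for `{"name": True, "posts": {"title": True}}` yields the user's name and post titles in one response: no `email`, no post `body`, and no second request.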

REST vs GraphQL: pick per use case, not per religion

REST (resource-oriented)
  • Dead simple, universally understood
  • HTTP caching works out of the box
  • Easy to monitor per-endpoint
  • Can over-/under-fetch for rich screens
  Best for: The safe default for most services and public APIs.

GraphQL (query-oriented)
  • One request, exactly the fields you need
  • Great for varied clients (web, mobile, TV)
  • Caching is harder, since every query differs
  • Resolvers add server complexity; deep nesting can hurt
  Best for: Flexible, client-driven data like social feeds.

For most system design problems, REST is the safer default. Reaching for GraphQL signals breadth — just be ready to explain the caching and resolver costs you're taking on.

Common misconception

"GraphQL is REST v2 — newer, so better." GraphQL isn't an upgrade; it's a different trade-off. It moves complexity from the client (multiple calls, over-fetching) to the server (resolvers, query-depth limits, bespoke caching). If your API is straightforward CRUD with one or two client types, GraphQL adds cost and removes free HTTP caching — for nothing in return.

QA Lens GraphQL widens the test surface in a sneaky way: the schema is the contract, so a renamed field breaks clients even though the endpoint URL never changed. Test the schema itself, watch for N+1 query explosions behind innocent-looking nested fields, and cap query depth so a single malicious request can't ask for friends-of-friends-of-friends forever.
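A depth cap is simple to sketch. The version below assumes the query has already been parsed into a nested dict (field name mapped to True for a leaf, or to a nested dict); real GraphQL servers work on a parsed query AST instead, and `MAX_DEPTH`, `selection_depth`, and `check_depth` are hypothetical names.

```python
MAX_DEPTH = 3  # hypothetical limit; the right value depends on the schema


def selection_depth(selection: dict) -> int:
    """Depth of a nested selection: {"name": True} is 1, one nested level is 2."""
    nested = [selection_depth(sub) for sub in selection.values() if isinstance(sub, dict)]
    return 1 + max(nested, default=0)


def check_depth(selection: dict, limit: int = MAX_DEPTH) -> None:
    """Reject a query before any resolver runs if it nests deeper than `limit`."""
    depth = selection_depth(selection)
    if depth > limit:
        raise ValueError(f"query depth {depth} exceeds limit {limit}")
```

The test to write is the hostile one: submit friends-of-friends-of-friends one level past the limit and confirm the server refuses it up front instead of starting to resolve it.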


WebSockets — when the server needs to talk first

REST and GraphQL are both client-initiated: the client asks, the server answers, the line goes quiet. But chat apps, live scores, and collaborative editors need the server to push the instant something changes. That's WebSockets.

WebSocket: upgrade once, then talk freely
[Sequence diagram between Client and Server: HTTP upgrade request → 101 Switching Protocols → persistent, bidirectional channel → send message → new message pushed → live score update → connection stays open until closed.]
After the 101 upgrade the server can push at any time; plain HTTP would need the client to ask first.

A WebSocket starts life as an ordinary HTTP request, then upgrades into a persistent, two-way pipe. After that, either side can send at any time with no fresh-connection overhead — perfect for real-time features.
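One concrete piece of that upgrade, as a sketch: in the handshake defined by RFC 6455, the client sends a `Sec-WebSocket-Key` header alongside `Upgrade: websocket`, and the server proves it understood by replying `101 Switching Protocols` with a `Sec-WebSocket-Accept` value derived from that key. The function name `websocket_accept` is ours; the GUID and the SHA-1 plus Base64 recipe come from the RFC.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

With the sample key from the RFC, `websocket_accept("dGhlIHNhbXBsZSBub25jZQ==")` returns `"s3pPLMBiTxaQ9kYGzzhZRbK+xOo="`. In practice a WebSocket library does this for you; it is shown here only to demystify what "upgrade" means on the wire.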

The catch is state. Each open socket lives on a specific server, which must track it. That makes scaling harder and failures brutal:

  • True real-time, low-overhead, bidirectional
  • Ideal for chat, presence, live dashboards, multiplayer
  • Stateful: the server remembers every connection
  • Server dies → every socket on it drops at once

QA Lens Stateful connections demand chaos testing the happy path never reveals. Kill the server mid-session: does the client reconnect and resync missed messages, or silently lose them? Test flaky networks, duplicate deliveries, and out-of-order messages. "It works on localhost" means nothing here — the bugs live in the reconnection logic.
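One common way to survive duplicates, reordering, and reconnects is sketched below. It assumes something the text above does not state: that the server stamps every message with an increasing sequence number and can replay from a given number on reconnect. `OrderedInbox` and its methods are hypothetical names.

```python
class OrderedInbox:
    """Client-side buffer that delivers each message once, in sequence order."""

    def __init__(self) -> None:
        self.next_seq = 1     # the next sequence number we expect
        self._pending = {}    # early arrivals waiting for a gap to be filled
        self.delivered = []   # payloads handed to the application, in order

    def receive(self, seq: int, payload: str) -> None:
        if seq < self.next_seq or seq in self._pending:
            return  # duplicate delivery: already delivered or already buffered
        self._pending[seq] = payload
        # Flush every message that is now contiguous with what we have delivered.
        while self.next_seq in self._pending:
            self.delivered.append(self._pending.pop(self.next_seq))
            self.next_seq += 1

    def resume_from(self) -> int:
        """Sequence number to send on reconnect so the server can replay the gap."""
        return self.next_seq
```

The chaos tests then write themselves: deliver 1, 3, 1 and assert only message 1 reached the application; drop the connection, reconnect with `resume_from()`, and assert nothing was lost or doubled.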


Webhooks — "don't call us, we'll call you"

Polling ("anything new yet? ...now? ...now?") is wasteful. Webhooks invert it: you register a callback URL once, and the other system POSTs to it the moment an event happens. It's how Stripe tells you a payment cleared and how GitHub triggers your CI on push.

Webhook: Stripe calls you when something happens
[Sequence diagram between Your App and Stripe: register webhook URL → later, an event occurs... → POST /hook { payment.completed } → 200 OK (acknowledged) → another event... → POST /hook { payment.failed } → 200 OK.]
Your app registers once and then just listens; the 200 acknowledgment is what stops Stripe retrying.

The hard part isn't receiving the call — it's reliability. What if your server was down when the webhook fired? Production-grade webhook systems lean on three safeguards:

  • Retries: Re-deliver with backoff until the receiver returns 2xx, so a brief outage doesn't lose the event.
  • Idempotency: Each event carries a unique ID so a retried delivery is processed once, not twice.
  • Event log: Persist every received event so you can replay, audit, and debug after the fact.

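The three safeguards fit together as in this in-process simulation. It is a sketch under stated assumptions, not any provider's implementation: `WebhookReceiver`, `deliver`, the `fail_next` test hook, the event fields, and the doubling backoff schedule are all hypothetical, and no real HTTP or sleeping happens.

```python
class WebhookReceiver:
    """Receiver side: logs every delivery, processes each event ID only once."""

    def __init__(self) -> None:
        self.event_log = []          # every delivery we actually received
        self.processed_ids = set()   # idempotency: event IDs already handled
        self.balance = 0
        self.fail_next = 0           # test hook: how many upcoming calls should fail

    def handle(self, event: dict) -> int:
        if self.fail_next > 0:
            self.fail_next -= 1
            return 500               # simulated outage: nothing received or logged
        self.event_log.append(event)
        if event["id"] not in self.processed_ids:
            self.balance += event["amount"]
            self.processed_ids.add(event["id"])
        return 200                   # acknowledge, even for a duplicate


def deliver(receiver: WebhookReceiver, event: dict,
            max_attempts: int = 5, base_delay: float = 1.0) -> tuple:
    """Sender side: retry with exponential backoff until a 2xx comes back.

    Returns (delivered, delays), where delays lists the seconds waited
    after each failed attempt. A real sender would sleep for each delay.
    """
    delays = []
    for attempt in range(max_attempts):
        if 200 <= receiver.handle(event) < 300:
            return True, delays
        delays.append(base_delay * 2 ** attempt)
    return False, delays
```

If the receiver fails twice and then recovers, the sender waits 1.0 and 2.0 seconds and the event lands once. Delivering the same event again (for example, because the first 200 was lost in transit) adds a second log entry but leaves the balance unchanged.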
Webhooks vs WebSockets — don't confuse them

WebSockets keep a live channel open between a client and a server (chat). Webhooks are one-off server-to-server HTTP callbacks for events (Stripe → your backend). Different problems, similar-sounding names.

QA Lens Webhooks fail in the gaps you can't see locally. Force a retry by returning 500 once and confirm the event is delivered at least once but processed exactly once — exactly-once delivery is impossible over a network, which is why the idempotency key exists. Verify the signature header (an unsigned webhook endpoint is an open door for spoofed events). And always test the "we were down for 10 minutes" replay — that's the scenario that pages someone at 3am.
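Signature checking usually boils down to an HMAC over the raw request body with a shared secret. The sketch below is a generic HMAC-SHA256 scheme, assumed for illustration: real providers define their own header names and signing formats (often mixing in a timestamp), so follow the sender's documentation rather than this code. `sign` and `is_authentic` are hypothetical helper names.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """What the sender would compute and place in a signature header."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_authentic(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Recompute the signature over the raw body and compare in constant time."""
    expected = sign(secret, body)
    # compare_digest avoids leaking how much of the signature matched via timing.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))
```

The negative tests matter most here: a tampered body, a wrong secret, and a missing signature must all be rejected before the event is processed.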


Choosing a Style

1. Request/Reply? (default) → REST
2. Flexible shape? (rich clients) → GraphQL
3. Live two-way? (real-time) → WebSockets
4. Event callback? (async) → Webhooks

Work down the questions in order; REST is the default and every later style must earn its complexity.

Test Yourself

Answer from memory first, then read the answer to check.

Q1. Calling DELETE /users/123 twice returns 204 then 500. Which REST guarantee is broken, and what should the second call return?

Idempotency. Repeating a DELETE must leave the server in the same state without failing, so the second call should return 204 (or 404): "the row is already gone" is the successful outcome of a delete, not an error.

Q2. A mobile screen needs a user, their posts, and their followers. When does this justify GraphQL — and when doesn't it?

It justifies GraphQL when many different clients each need different field combinations and the round trips genuinely hurt (slow mobile networks). It doesn't when one backend-for-frontend endpoint or a composite REST endpoint would do — GraphQL's price is resolver complexity and losing plain HTTP caching.

Q3. Chat messages and Stripe payment confirmations are both "real-time events." Why does one use WebSockets and the other webhooks?

Chat is client ↔ server with a live, long-lived, two-way conversation — that's a WebSocket. A payment confirmation is a server → server one-off notification — that's a webhook (an HTTP callback with retries and an idempotency key). The receiver of a webhook doesn't hold a connection open; it just exposes a URL.

Q4. Your webhook endpoint went down for 10 minutes. What three mechanisms decide whether you lost money?

Retries (the sender re-delivers with backoff until you return 2xx), idempotency keys (so re-delivered events aren't processed twice), and an event log (so you can replay and audit what was received). Missing any one of the three turns a brief outage into lost or duplicated events.


Quick Revision

API

A contract: what you can ask, how, and what returns. The menu, not the kitchen.

REST

Resources + HTTP verbs. Stateless, cacheable, predictable. The safe default.

GraphQL

One query, exact fields. Kills over-fetching; harder to cache, resolver cost.

WebSockets

Persistent two-way channel for real-time. Stateful, so scaling is the hard part.

Webhooks

Server-to-server event callbacks. Needs retries, idempotency, and an event log.

In an interview

"It's read-heavy CRUD, so REST with HTTP caching. The live presence feature needs WebSockets, and we'll take payment confirmations from Stripe over a signed, idempotent webhook." One sentence, four correct tools.


Where to Next

Contracts move data around — but where does the data actually live?