What Makes an API "RESTful"?

REST (Representational State Transfer) is an architectural style, not a protocol. Roy Fielding defined it in his 2000 doctoral dissertation. The core idea: treat everything as a resource, identify resources with URLs, and manipulate them with a small, fixed set of HTTP methods.

A RESTful API organizes endpoints around nouns (resources), not verbs (actions). You do not create an endpoint called /createUser. You create a resource at /users and use the HTTP method POST to create a new one.

Resource-oriented design: Every entity in your system (user, order, bookmark, comment) is a resource with a unique URL. The HTTP method tells the server what to do with it. The URL tells the server which resource you mean.
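To make the noun-plus-method idea concrete, here is a minimal sketch of a dispatcher for a /users resource. The handle function, the in-memory USERS store, and the sample record are hypothetical names invented for illustration; a real service would use a web framework and a database.

```python
from typing import Optional, Tuple

# In-memory stand-in for a users table (hypothetical sample data).
USERS = {42: {"id": 42, "name": "Ada"}}


def handle(method: str, path: str, body: Optional[dict] = None) -> Tuple[int, Optional[dict]]:
    """Dispatch on (HTTP method, resource URL) and return (status, body)."""
    parts = path.strip("/").split("/")
    if parts[0] != "users" or len(parts) > 2:
        return 404, None

    if len(parts) == 1:  # collection URL: /users
        if method == "GET":
            return 200, {"data": list(USERS.values())}
        if method == "POST":
            new_id = max(USERS, default=0) + 1  # the server assigns the ID
            USERS[new_id] = {**(body or {}), "id": new_id}
            return 201, USERS[new_id]
        return 405, None

    # item URL: /users/<id>
    if not parts[1].isdigit() or int(parts[1]) not in USERS:
        return 404, None
    user_id = int(parts[1])
    if method == "GET":
        return 200, USERS[user_id]
    if method == "DELETE":
        del USERS[user_id]
        return 204, None
    return 405, None
```

Note that there is no /createUser or /deleteUser anywhere: the same two URLs cover every operation, and only the method changes.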

HTTP Methods, Idempotency, and Safety

HTTP defines a small set of methods. Each has specific semantics that clients and intermediaries (proxies, caches, load balancers) rely on.

Method   Purpose                        Idempotent?   Safe?   Request Body?
GET      Retrieve a resource            Yes           Yes     No
POST     Create a new resource          No            No      Yes
PUT      Replace a resource entirely    Yes           No      Yes
PATCH    Partially update a resource    No*           No      Yes
DELETE   Remove a resource              Yes           No      Optional

*PATCH is not guaranteed to be idempotent, though it can be implemented that way.

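Idempotent means calling the same request multiple times produces the same server state as calling it once. PUT /users/42 with the same body always results in the same user record, no matter how many times you call it. POST /users creates a new user each time, so it is not idempotent. Note that idempotency is about server state, not the response: a repeated DELETE may return 404 the second time, but the resource is gone either way.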

Safe means the method does not modify server state. GET is safe. DELETE is not.

Idempotency matters because networks are unreliable. If a client sends a PUT request and the connection drops before it receives the response, it can safely retry. The server ends up in the same state. If the same thing happens with a non-idempotent POST, the retry might create a duplicate resource.
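The retry scenario can be simulated in a few lines. This is a sketch under simple assumptions: an in-memory dictionary stands in for the server's database, and put_user, post_user, and send_with_retry are hypothetical helpers, not part of any library.

```python
import itertools

users = {}                      # server-side state: id -> record
_next_id = itertools.count(1)   # the server's ID generator for POST


def put_user(user_id, body):
    """PUT /users/<id>: replace (or create) the record at a client-chosen URL."""
    users[user_id] = dict(body)


def post_user(body):
    """POST /users: the server assigns a fresh ID on every call."""
    new_id = next(_next_id)
    users[new_id] = dict(body)
    return new_id


def send_with_retry(request, attempts=2):
    """Simulate a client that never sees the first response and sends again."""
    for _ in range(attempts):
        request()


# Retrying the PUT leaves exactly one record for user 42.
send_with_retry(lambda: put_user(42, {"name": "Ada"}))
put_count = len(users)

# Retrying the POST creates two separate records with the same content.
send_with_retry(lambda: post_user({"name": "Ada"}))
total_count = len(users)

print(put_count, total_count)  # 1 3
```

The PUT retry is harmless, while the POST retry leaves a duplicate behind.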

PUT vs. POST

The distinction is often misunderstood. POST means "create a new resource; the server assigns the ID." PUT means "place this resource at this exact URL." If the resource exists, PUT replaces it. If it does not, PUT creates it at the specified URL.

POST /bookmarks        → Server creates bookmark, assigns ID 789
PUT  /bookmarks/789    → Client specifies the exact resource to create or replace

Status Codes

HTTP status codes communicate what happened. Use them correctly. Clients, monitoring tools, and retry logic all depend on them.

Range   Meaning        Common Codes
2xx     Success        200 OK, 201 Created, 204 No Content
3xx     Redirection    301 Moved Permanently, 304 Not Modified
4xx     Client error   400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 409 Conflict, 429 Too Many Requests
5xx     Server error   500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable

A few rules: return 201 (not 200) when creating a resource. Return 204 when a DELETE succeeds and there is no body to return. Return 409 Conflict when the client tries to create something that already exists. Never return 200 with an error message in the body. That defeats the purpose of status codes.
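As an illustration of these rules, the sketch below returns a (status, body) pair from two hypothetical handlers, create_bookmark and delete_bookmark. Treating a repeated URL as "already exists" is an assumption made for the example; your own uniqueness rule may differ.

```python
from http import HTTPStatus

bookmarks = {}  # in-memory store: id -> bookmark record


def create_bookmark(url):
    """Return 201 on creation, 409 if this URL is already bookmarked."""
    if any(b["url"] == url for b in bookmarks.values()):
        # Signal the error with the status code, never 200 plus an error body.
        return HTTPStatus.CONFLICT, {"error": "bookmark already exists"}
    new_id = max(bookmarks, default=0) + 1
    bookmarks[new_id] = {"id": new_id, "url": url}
    return HTTPStatus.CREATED, bookmarks[new_id]


def delete_bookmark(bookmark_id):
    """Return 204 with no body on success, 404 if the ID is unknown."""
    if bookmarks.pop(bookmark_id, None) is None:
        return HTTPStatus.NOT_FOUND, {"error": "no such bookmark"}
    return HTTPStatus.NO_CONTENT, None
```

HTTPStatus members compare equal to their integer codes, so HTTPStatus.CREATED == 201 holds and the values can be passed anywhere a plain status code is expected.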

API Versioning

APIs evolve. Fields get added, renamed, or removed. You need a strategy for changing the API without breaking existing clients.

Three common approaches:

URL path versioning is the most common choice for public APIs because it is explicit and requires no special client configuration. Use it unless you have a specific reason not to.
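A minimal sketch of URL path versioning, assuming a /v1/... and /v2/... prefix scheme and a field that was renamed from name to full_name between versions. The record, the serializer functions, and the regular expression are all hypothetical choices made for illustration.

```python
import re

USER = {"id": 42, "full_name": "Ada Lovelace"}  # hypothetical stored record


def user_v1(user):
    # v1 clients still expect the old field name "name".
    return {"id": user["id"], "name": user["full_name"]}


def user_v2(user):
    return {"id": user["id"], "full_name": user["full_name"]}


SERIALIZERS = {"v1": user_v1, "v2": user_v2}
_VERSION_PREFIX = re.compile(r"^/(v\d+)(/.*)$")


def split_version(path):
    """Split '/v1/users/42' into ('v1', '/users/42')."""
    match = _VERSION_PREFIX.match(path)
    if match is None or match.group(1) not in SERIALIZERS:
        raise LookupError(f"unknown or missing API version in {path!r}")
    return match.group(1), match.group(2)


def get_user(path):
    """Serve the single sample record in the shape the requested version expects."""
    version, _resource_path = split_version(path)  # resource lookup omitted in this sketch
    return SERIALIZERS[version](USER)
```

Existing clients keep calling /v1/users/42 and see the old field name, while new clients opt in to /v2/users/42.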

Rate Limiting

Every public API needs rate limiting. Without it, a single misbehaving client (or attacker) can consume all your server resources.

Rate limiting caps the number of requests a client can make in a time window. Common implementations use a token bucket or sliding window algorithm. When a client exceeds the limit, the server returns 429 Too Many Requests with a Retry-After header.
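For orientation, here is one possible token bucket sketch. The TokenBucket class, its parameters, and the injectable clock are assumptions made for the example, not a reference implementation; a production limiter would also keep one bucket per API key or IP address and store it somewhere shared between servers.

```python
import math
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_second` tokens/s."""

    def __init__(self, capacity, refill_per_second, now=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._now = now                  # injectable clock, handy for testing
        self.tokens = float(capacity)    # start with a full bucket
        self.updated = now()

    def check(self):
        """Return (status_code, headers) for one incoming request."""
        current = self._now()
        elapsed = current - self.updated
        # Add the tokens earned since the last request, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = current

        if self.tokens >= 1:
            self.tokens -= 1
            return 200, {}

        # Seconds until one full token is available, rounded up to an integer.
        wait = (1 - self.tokens) / self.refill_per_second
        return 429, {"Retry-After": str(math.ceil(wait))}
```

With capacity=2 and refill_per_second=1, two back-to-back requests succeed, the third gets 429 with Retry-After: 1, and a request one second later succeeds again.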

Rate limits are typically defined per API key or per IP address, and expressed as requests per second or per minute. Session 2.9 covers rate limiting algorithms in depth.

Pagination: Cursor vs. Offset

Any endpoint that returns a list of resources needs pagination. Returning 10 million bookmarks in a single response is not an option. Two approaches dominate.

Offset-Based Pagination

GET /bookmarks?offset=20&limit=10

The server skips the first 20 records and returns the next 10. Simple to implement. The client just increments the offset by the page size.

The problem: offset pagination breaks with mutable data. If a new bookmark is inserted while the client is paginating, records shift. The client might see duplicates or skip items. It also performs poorly at large offsets because the database still has to scan and discard all skipped rows.
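The duplicate problem is easy to reproduce. In this sketch a Python list sorted newest-first stands in for the database table, and page_by_offset is a hypothetical helper that mimics skipping offset rows and returning the next limit rows.

```python
def page_by_offset(rows, offset, limit):
    """Skip `offset` rows and return the next `limit` rows."""
    return rows[offset:offset + limit]


# Five bookmarks, newest first.
bookmarks = [f"bookmark-{i}" for i in range(5, 0, -1)]

page1 = page_by_offset(bookmarks, offset=0, limit=2)

# A new bookmark arrives between the client's two requests.
bookmarks.insert(0, "bookmark-6")

page2 = page_by_offset(bookmarks, offset=2, limit=2)

print(page1)  # ['bookmark-5', 'bookmark-4']
print(page2)  # ['bookmark-4', 'bookmark-3']
```

Every row shifted down by one position, so bookmark-4 appears on both pages. A deletion between requests has the opposite effect and causes the client to skip a row.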

Cursor-Based Pagination

GET /bookmarks?cursor=eyJpZCI6MTAwfQ&limit=10

The cursor is an opaque token (usually an encoded record ID or timestamp) pointing to a specific position in the dataset. The server returns records after that position. The response includes a next_cursor for the client to use in the next request.

Cursor pagination is stable even when data changes between requests. It is also efficient because the database can use an index to jump directly to the cursor position instead of scanning from the beginning.
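One way to implement this is sketched below, assuming the cursor is a base64url-encoded JSON object holding the last seen record ID (which is what the sample cursor eyJpZCI6MTAwfQ decodes to). The helper names and the list-of-dicts stand-in for the database are hypothetical.

```python
import base64
import json


def encode_cursor(last_id):
    """Pack the last seen ID into an opaque, URL-safe token."""
    raw = json.dumps({"id": last_id}, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def decode_cursor(cursor):
    """Recover the ID from a token produced by encode_cursor."""
    padded = cursor + "=" * (-len(cursor) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))["id"]


def page_by_cursor(rows, cursor=None, limit=10):
    """Return up to `limit` rows after the cursor; `rows` is sorted by ascending id."""
    after_id = decode_cursor(cursor) if cursor is not None else None
    # A database would do roughly: WHERE id > ? ORDER BY id LIMIT (limit + 1),
    # which an index on id can satisfy without scanning earlier rows.
    matching = [r for r in rows if after_id is None or r["id"] > after_id]
    window = matching[:limit + 1]        # fetch one extra row to detect a next page
    page = window[:limit]
    has_more = len(window) > limit
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return {"data": page, "next_cursor": next_cursor}
```

Because the query filters on the ID itself rather than on a row position, rows inserted or deleted elsewhere in the dataset do not shift the page boundaries.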

sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: GET /bookmarks?limit=10
    S-->>C: 200 OK {data: [...10 items], next_cursor: "abc123"}
    C->>S: GET /bookmarks?cursor=abc123&limit=10
    S-->>C: 200 OK {data: [...10 items], next_cursor: "def456"}
    C->>S: GET /bookmarks?cursor=def456&limit=10
    S-->>C: 200 OK {data: [...3 items], next_cursor: null}
    Note over C,S: next_cursor is null, no more pages

The tradeoff is that cursor pagination does not support "jump to page 5." The client can only move forward (or backward, if you provide a previous cursor). For most API use cases, this is acceptable. For user-facing interfaces where page numbers matter, offset is sometimes still preferred despite its limitations.

Designing Good Endpoints

A few conventions that make APIs predictable and easy to use:

Further Reading

Assignment

Design the API for a bookmarking service. Users can save URLs with tags, list their bookmarks, delete bookmarks, and search by tag or keyword. For each operation, specify:

  1. The HTTP method and endpoint URL
  2. The request body (if any), as JSON
  3. The response body, as JSON
  4. The HTTP status code for success

Operations to design:

  • Create a bookmark: Save a URL with a title, description, and list of tags.
  • List bookmarks: Return the current user's bookmarks, paginated. Choose cursor or offset pagination and justify your choice.
  • Delete a bookmark: Remove a bookmark by its ID.
  • Search bookmarks: Find bookmarks matching a keyword or tag. Should this be a separate endpoint or a query parameter on the list endpoint? Explain your reasoning.

Bonus: How would you make the "create bookmark" operation idempotent? What would you use as the idempotency key?