Process-driven REST API design
REST is a heavily oversimplified and massively misunderstood (but very widely used) concept for designing APIs. Just look for any forum thread that features Roy Fielding (the “founder” of REST) and you will probably find him telling the author and other participants how they misunderstood what he intended REST to be. That is perhaps no surprise: his original dissertation is too long for many people to read, and the ideas are not trivial.
So instead of REST, we usually get something almost, but not quite, entirely unlike REST.
REST is often simplified to CRUD because, unlike REST, CRUD is simple and well-understood. And perhaps because they are both four-letter abbreviations that contain an R, who knows.
With CRUD, your operations are Create, Read, Update and Delete. These can be conveniently mapped to the four main HTTP verbs: POST, GET, PUT and DELETE. Now you just need something to perform these operations on, so let’s take your data entities from your database, convert them into URLs, call them “resources”, and presto, API! And it’s RESTful because there are HTTP verbs and resources, so we can move the Trello card to Done and go for lunch.
I don’t think this is proper API design, and it is definitely not RESTful. You didn’t create a REST API; you just created an ORM with only four operations and the latency of HTTP. There is no Representational State Transfer (REST) going on, not really. It’s merely a simple way of leveraging HTTP to access your data, so simple in fact that there are frameworks that can automatically generate such an API for you from your database entities.
I call this approach data-driven API design. What I want to describe, however, is the method I use for designing REST APIs, which is supported by a fair amount of literature and by good experiences in practice. I feel this is much closer to how REST was intended to be, but because of the confusion regarding the terminology, I tend to call it process-driven REST API design.
Before we discuss this method, however, let’s first consider the downsides of data-driven API design.
The pitfalls of data-driven API design
Consider a simple webshop. There are products and customers who can place orders for these products. When an order is placed, it is created as an order with one or more order items, with each order item representing a product.
Through the data-driven approach, you will get the following resources for retrieving product information:
GET /products/{uuid}
Response:
- uuid (string)
- name (string)
- price (float)
GET /products
Response:
- products (Product entities)
For placing the order, a purely data-driven approach would have the client create order items first and then an order that refers to them. But this cannot work with our database schema: order items cannot exist without being part of an order. So we introduce a new pre-order-items entity and get:
POST /pre-order-items
Request body:
- product uuid (string)
Response:
- pre-order-item
POST /orders
Request body:
- pre-order-item uuids (list of strings)
Response:
- order
Let’s now build a client application for this API. To perform the checkout, the client application will perform a POST on /pre-order-items for every product, and when that is done, perform a POST on /orders providing all pre-order-item UUIDs.
Yeah, don’t do this.
For placing a single order, you are now performing 1 + n HTTP requests, with n being the number of products. If a customer orders 10 products, that is 11 sequential requests; if each one takes around a second, it will take more than 10 seconds for the order to be placed. Furthermore, what do you do if creating one of the order items fails? How do you handle a rollback?
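To make the request count concrete, here is a minimal Python sketch of the client side of this checkout. The `FakeTransport` class and its `post` method are hypothetical stand-ins for an HTTP client: they only record calls and never touch a network. The field names in the request bodies are assumptions as well.

```python
from uuid import uuid4


class FakeTransport:
    """Hypothetical stand-in for an HTTP client; it only records calls."""

    def __init__(self):
        self.calls = []

    def post(self, path, body):
        self.calls.append((path, body))
        # Pretend the server created an entity and returned its UUID.
        return {"uuid": str(uuid4())}


def data_driven_checkout(transport, product_uuids):
    """Place an order the data-driven way: one POST per product, then one more."""
    pre_order_item_uuids = []
    for product_uuid in product_uuids:
        item = transport.post("/pre-order-items", {"product_uuid": product_uuid})
        pre_order_item_uuids.append(item["uuid"])
    # If a POST above had failed halfway, the pre-order items created so far
    # would remain on the server and the client would have to clean them up.
    return transport.post("/orders", {"pre_order_item_uuids": pre_order_item_uuids})


transport = FakeTransport()
data_driven_checkout(transport, [f"product-{i}" for i in range(10)])
request_count = len(transport.calls)  # 10 products lead to 1 + 10 requests
```

The requests cannot be sent as one batch either, because the final POST needs the UUIDs returned by all the earlier ones.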
Now a new requirement comes in. There should be shipping costs included, but only for orders under EUR 40. No worries, you can just inspect the shopping basket, look at the total, and if it is less than EUR 40, you display the shipping costs. Right?
No! Stop! Please don’t do this. You’ve just encoded business logic in your API client. If you have 30 clients, do you expect them all to implement this logic? And what if it changes, and the cutoff becomes EUR 50? Or worse, what if the shipping costs become dependent on the content of the shopping basket?
As I hope I’ve illustrated, a data-driven approach leads to performance issues (because it’s always going to be at best a very inefficient ORM), to business logic in your clients, and to inflexible processes in your API.
Let’s try something else.
Process-driven REST API design
Process-driven design advocates starting with the process, instead of with the data.
In fact, we basically ignore our data entities for designing our resources. Instead, we first design the processes that we want our API to support, and then we base our resources on the steps in the process.
The method of designing the process doesn’t matter much; you can use BPMN, a flowchart, use case descriptions or whatever you like. The goal is to find the states your API can be in. With this in hand, we can consider how best to represent the states themselves as well as the transfer between the states. REST: Representational State Transfer, remember?
For every step in the process, we consider 1) whether the step needs access to data on the server or not, and 2) whether the step contains business logic that is relevant for all client applications. If either of those criteria is true, we will design an API resource for the process step. If not, the process step is simply a client operation that the client will be responsible for.
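The two criteria can be written down as a small decision rule. The Python sketch below is only an illustration: the `ProcessStep` type and its field names are hypothetical, and the flags given to each webshop step are one possible reading of the walkthrough that follows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessStep:
    """Hypothetical description of one step in the designed process."""

    name: str
    needs_server_data: bool
    has_shared_business_logic: bool


def needs_resource(step: ProcessStep) -> bool:
    """A step gets an API resource if either criterion holds."""
    return step.needs_server_data or step.has_shared_business_logic


steps = [
    ProcessStep("search products", True, False),
    ProcessStep("add to basket", False, False),
    # Classified here by its business logic alone; whether it also needs
    # server data would not change the outcome of the rule.
    ProcessStep("view basket and costs", False, True),
    ProcessStep("place order", True, True),
]

resource_steps = [step.name for step in steps if needs_resource(step)]
client_only_steps = [step.name for step in steps if not needs_resource(step)]
```

Everything in `client_only_steps` stays in the client; everything in `resource_steps` becomes a candidate resource.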
Let’s apply this to our webshop:
Search products: We look for zero or more products based on some given criteria. This requires data on the server, so a resource is warranted.
GET /products?{criteria}
Response:
- products (Product entities)
Add to basket: This is a client operation as no data on the server is necessary. No resource is required.
View basket and costs: This is an interesting one. The purpose of this step is to, given a set of products, provide an overview of what you have to pay. This is clearly business logic and as such it is the responsibility of the API to provide a resource to facilitate this process step.
We could introduce a “bill” resource for this purpose:
GET /bill?{product-list}
Response:
- product total (float)
- shipping costs (float)
- total (float)
This resource represents the business logic of calculating and presenting the right totals and costs to the user. It has no database representation; in programming terms, it would be a function.
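As a rough idea of what such a function might look like on the server, here is a minimal Python sketch. Only the EUR 40 cutoff comes from the requirement above; the flat shipping fee, the `PRICES` table and the name `calculate_bill` are assumptions made for illustration. It also uses `Decimal` rather than the floats in the listing, to avoid rounding surprises with money; that is a choice of this sketch.

```python
from decimal import Decimal

# The EUR 40 cutoff comes from the requirement in the text.
FREE_SHIPPING_THRESHOLD = Decimal("40.00")
# Hypothetical flat fee; the text does not say what shipping costs.
SHIPPING_FEE = Decimal("4.95")

# Hypothetical price list keyed by product UUID, standing in for the database.
PRICES = {
    "p1": Decimal("12.50"),
    "p2": Decimal("20.00"),
    "p3": Decimal("45.00"),
}


def calculate_bill(product_uuids, prices=PRICES):
    """Compute the representation returned by GET /bill for a list of products."""
    product_total = sum((prices[u] for u in product_uuids), Decimal("0.00"))
    if product_total < FREE_SHIPPING_THRESHOLD:
        shipping_costs = SHIPPING_FEE
    else:
        shipping_costs = Decimal("0.00")
    return {
        "product_total": product_total,
        "shipping_costs": shipping_costs,
        "total": product_total + shipping_costs,
    }


small_bill = calculate_bill(["p1", "p2"])  # under the cutoff, so shipping applies
large_bill = calculate_bill(["p3"])        # over the cutoff, so shipping is free
```

If the cutoff moves to EUR 50, only `FREE_SHIPPING_THRESHOLD` changes; no client has to be touched, because clients only display the three numbers they receive.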
This is a resource that wouldn’t appear in a data-driven approach at all, but very organically appears when considering the process: of course you want to see an overview of the costs! That’s an integral part of every webshop, so of course our API should support it.
Place order: By providing one or more products, an order can be placed:
POST /orders
Request body:
- product uuids (list of strings)
Response:
- order
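Behind this single request, the server is free to create whatever order items its database needs. The Python sketch below shows one way that could look; the `Order` and `OrderItem` types, the `place_order` function and the in-memory `order_store` are all hypothetical stand-ins for real persistence.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class OrderItem:
    product_uuid: str


@dataclass
class Order:
    uuid: str
    items: list = field(default_factory=list)


class UnknownProduct(Exception):
    """Raised when a requested product UUID does not exist."""


def place_order(product_uuids, known_products, order_store):
    """Handle POST /orders: validate everything first, then store the order once."""
    if not product_uuids:
        raise ValueError("an order needs at least one product")
    missing = [p for p in product_uuids if p not in known_products]
    if missing:
        raise UnknownProduct(missing)
    # Order items are an internal detail: they are created together with the
    # order and never appear as a resource of their own.
    order = Order(uuid=str(uuid4()), items=[OrderItem(p) for p in product_uuids])
    order_store[order.uuid] = order
    return order


known_products = {"p1", "p2", "p3"}
order_store = {}
order = place_order(["p1", "p2"], known_products, order_store)
```

Because validation happens before anything is stored, a failing request leaves nothing behind, so the client never has to think about rollbacks.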
And that’s it. Three resources which directly model the processes we want to support.
Evaluating the result
The first observation is the complete absence of order items or their equivalent, while they seemed so important in the data-driven approach. Nowhere in the process are they relevant for the customer or the client application, so why would they enter the domain of our API? This makes the API significantly more straightforward.
Another thing of note is that we didn’t introduce a resource to access a single product (/products/{uuid}). While by itself that might not seem very earth-shattering, it is important that we made the conscious decision to leave it out. Designing an elegant and functional REST API requires thought and deliberation. The process-driven method stimulates this by making you think about every step, and helps you avoid unnecessary complexities.
The final interesting observation is that adding the requirement for the shipping costs does not change the designed process; the “view basket and costs” step still exists, it will merely be expanded with more data. This is actually one of the greatest properties of this approach. Because the processes your API has to support very rarely change fundamentally, APIs designed with this method are very good at adapting to changing requirements. They can easily grow to support new situations, while still allowing for plenty of flexibility, and without introducing unnecessary resources or being tied to your application internals.
Wrapping up
Process-driven REST API design leads to more flexible, simpler and clearer APIs which are closer to being truly RESTful. I have applied this method in multiple projects and while it takes more effort up front, it pays off significantly in the long run.
And of course, this is just a part of REST API design. In this article, I’ve largely glossed over the contents of POST and PUT requests, choosing the right HTTP verbs, the use of hypermedia and standardized media types, and plenty of other things that make up a good API.