Approaching pure REST: Learning to love HATEOAS

2010-06-21T16:14:18-07:00

We've been building up a REST API at work for a couple of months now, have an iPhone client, an Android client, and a browser-based client built on it, and are well on our way to using it for a number of other purposes. As far as client and API development are concerned, things are going pretty smoothly. So, when I read Michael Bleigh's article on how he thinks that building a pure REST API is too hard, and probably not worth the time, I was pretty surprised. I started wondering if maybe I'd misunderstood something, despite spending quite some time poring over Roy Fielding's dissertation and scads of other articles by a variety of authors. After some reflection, I've decided that I'm not missing anything, and it's a lot easier than people think to build a pure REST API, once you understand what one is, and have determined that REST is an appropriate architecture for your system.

Learn about REST from credible sources

There's a lot of information out there about REST, so naturally there's also a lot of inaccurate, incomplete, confusing, and misleading information out there. The key to learning about REST, as with everything else, is finding good sources of information. In my research, there were a handful of people who were instrumental in my understanding of the topic: Roy Fielding, Jim Webber, Ian Robinson, Mark Baker, Sam Ruby, Leonard Richardson, and Subbu Allamaraju. Specifically, the following are great sources of information on REST:

Obviously, the section on REST in Roy Fielding's dissertation
Roy's rant about true REST requiring hypermedia
The book "RESTful Web Services" by Sam Ruby and Leonard Richardson
The slides from Ian Robinson and Jim Webber's 2009 QCon tutorial on "REST in Practice"
The PDF and video of Ian Robinson's 2008 QCon talk on RESTful Enterprise Development with Atom and AtomPub, and his comment on Jim Webber's blog afterwards
The InfoQ interview with Mark Baker about his REST evangelism, what REST is, and why it's important
Subbu Allamaraju's article on practical use of media types, especially the comments that follow, including a link to a good summary of Fielding's ideas on media types

REST is a fairly broad topic to understand completely. Only after reading through the above sources, and then some, did I begin to fully grasp what it is, and why it's so important. Because this article builds directly on those sources, I strongly suggest that you at least skim through the links above before continuing on here.

HATEOAS: The "hard part" about REST

The most sparsely documented aspect of REST in Roy Fielding's dissertation is the "hypermedia as the engine of application state" (HATEOAS) constraint; it's also the aspect of REST that has the fewest practical examples. It's no surprise, then, that HATEOAS is the part of REST most often neglected or misinterpreted by API developers. There is one rather popular and ubiquitous hypertext-driven application out there to learn from, however: the World Wide Web.

A popular, yet misleading example

The World Wide Web is a great example of utilizing hypermedia; it's also a source of much confusion around the role of hypermedia in APIs. In his article, Michael Bleigh uses the same example that Fielding uses in a comment on his rant: web browsers and spiders don't distinguish between an online-banking resource and a wiki resource, and only need to be aware of the "links and forms", and "what semantics/actions" are implied by traversing those links. Fielding uses the example to illustrate how clients of REST services need only know about the standard media types used in a service response in order to make use of the service. The problem with this analogy is that browsers and spiders are extremely general consumers of hypermedia.

A web site can be thought of as a REST API, except the media types and link relations that are used are not structured enough to allow most applications to determine the meaning of individual links and forms. Instead, a web site typically relies on a human user to determine the semantics of the site from the natural language text, sounds, and visuals presented to them via HTML, JavaScript, Flash, JPEGs, or any other standard data formats at the web site author's disposal. That's fine if you're building a system designed to be used by human beings, but if you're designing one to be consumed by other software, then your system and the systems that use it are going to have to agree on something up-front in order to play nicely.

A more common example

It's an exciting, web-driven world we're building, and clients usually need to do quite a bit more than allow their users to browse or crawl the systems we build. The Twitter API, for example, has services that allow clients to update their status, or retweet one that already exists. Twitter's API is not RESTful, so the documentation for retweeting a status instructs developers to call the service by sending an HTTP POST or PUT request to http://api.twitter.com/1/statuses/retweet/[id].[format].

If the Twitter API were RESTful, clients would need to understand what it means to follow a link to retweet a status. The semantics of such a service are deeper than what Fielding talks about in his comment about browsers and crawlers. At the same time, I think that this deep level of understanding is a more common requirement for clients of web APIs. It's this need for clients to have deep semantic understanding of an API that has Michael Bleigh questioning whether it's even worth the effort to make an API like Twitter's hypertext-driven.
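
To make the contrast concrete, here is a minimal sketch in Python of what following such a link could look like. Everything in it is an assumption made up for illustration: the JSON layout, the `links` key, the `retweet` relation name, and the api.example.com URIs are hypothetical and are not part of Twitter's actual API.

```python
import json

# Hypothetical representation of a status, as a hypertext-driven API might
# return it. The "links" layout and the "retweet" relation are invented here.
STATUS_JSON = """
{
  "id": 12345,
  "text": "Learning to love HATEOAS",
  "links": [
    {"rel": "self", "method": "GET",
     "href": "http://api.example.com/statuses/12345"},
    {"rel": "retweet", "method": "POST",
     "href": "http://api.example.com/statuses/12345/retweets"}
  ]
}
"""


def find_link(representation, rel):
    """Return the first link with the given relation, or None if absent."""
    for link in representation.get("links", []):
        if link.get("rel") == rel:
            return link
    return None


status = json.loads(STATUS_JSON)
retweet = find_link(status, "retweet")
if retweet is not None:
    # The client knows what "retweet" means, but not how the URI is built.
    print(retweet["method"], retweet["href"])
else:
    print("retweeting is not available for this status")
```

The client in this sketch is coupled to the meaning of the `retweet` relation rather than to a URI pattern, so the server could move the resource without the client noticing.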

It's not that hard!

HATEOAS is not as difficult to adhere to as people think. If you've ever built an interactive website that includes links and forms in HTML pages it generates and returns to its users, then congratulations: You've conformed to HATEOAS! For an API to conform to HATEOAS, it must provide all valid operations, what they mean, and how to invoke them, in representations that it sends to clients. In order to provide this knowledge in-band, it must utilize standard media types and link relations. If the API can't use standard types and relations, then custom types must be defined. Regardless of what types and relations are used, the point is that clients should be bound to services they consume at a higher, more generalized level than a specific communication channel, URI pattern, and set of invocation rules. The only difference between building a web site and building a web API that conforms to HATEOAS is that the majority of media types and link relations that you'll need for a web site are already defined, whereas you'll most likely need to define some of your own for an API.

Michael Bleigh states in his article that REST requires "too much work from the [service] provider in defining and supporting custom media types with complex modeled relationships", but defining media types and link relations for a REST API simply takes the place of other forms of documentation that would otherwise have to be produced. Subbu Allamaraju has a pretty good article on documenting RESTful applications. Among other things, he highlights how you no longer need to specify details on constructing requests to specific services within a REST API. The hypermedia constraint of REST requires that all possible requests be constructable at runtime, and provided by the API itself; clients must know how to interpret the hypermedia controls, but then are only responsible for interpreting the semantics of specific links and structural elements of the data format. This allows for greater flexibility on the service side to make changes, and greater resilience on the client side to those changes.
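
As a rough illustration of requests being constructable at runtime, the Python sketch below builds an HTTP request purely from a form-like control found in a representation. The control's shape (`rel`, `method`, `href`, `fields`) and the `update-status` relation are assumptions invented for this example, not a format defined by any of the sources above. The request is built but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form control embedded in a representation. The key names
# and the api.example.com URI are assumptions for this sketch.
UPDATE_FORM = {
    "rel": "update-status",
    "method": "POST",
    "href": "http://api.example.com/statuses",
    "fields": ["status", "in_reply_to"],
}


def build_request(form, values):
    """Build (but do not send) a request from a form control and values."""
    unknown = set(values) - set(form["fields"])
    if unknown:
        raise ValueError(f"fields not offered by the server: {sorted(unknown)}")
    body = urlencode(values).encode("utf-8")
    return Request(
        form["href"],
        data=body,
        method=form["method"],
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_request(UPDATE_FORM, {"status": "Hello, hypermedia"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The method, target URI, and accepted fields all come from the server's response; the only thing the client supplies is an understanding of what `update-status` means and the values to submit.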

Michael states that it's too much work to define "complex modeled relationships", but defining link relations in a REST API's media type definitions is not any more complex than leaving those relationships undefined and requiring clients to figure them out on their own by poking around the documentation. Leaving them undefined just shifts the effort of figuring those relationships out, and working with them in a consistent way, from the (single) service developer to the (multitude of) client developers. If links are provided, but relationship types are not defined or specified, clients must base their behavior on specific links, thus making it harder to change those links. Having well-defined relationships also encourages consistency and sound design from the service developers, and improves the ease with which clients make sense of and build solutions against an API.

Benefits of HATEOAS

There are many short- and long-term benefits of HATEOAS. Many of the benefits that Roy Fielding talks about, such as supporting unanticipated use-cases, the ability of generalized clients to crawl a service, and reducing or eliminating coupling between a service and its clients, tend only to be fully realized after a service has been available to the public for a while. Craig McClanahan -- author of the Sun Cloud API -- suggested some short-term benefits of adhering to the HATEOAS constraint of REST. One of the benefits mentioned by McClanahan is the improved ability of a service to make changes without breaking clients. Subbu Allamaraju describes, in his previously-mentioned article, the simplification that REST lends to the documentation of services that are written within its constraints.

Another short-term benefit of HATEOAS is that it simplifies testing. I've worked on a number of projects where there was a QA team composed of non-technical people, and my current project is no exception. The QA team performs primarily manual testing of our applications, and leaves the automated testing to the developers. They do, however, have a few people on their team capable of writing automated tests. So, I wrote a simple client against our REST API with knowledge of only the hypermedia semantics, which translates the hypermedia controls of our API into HTML links and forms that can then be driven by testing tools like Selenium. This had the added benefit of allowing me to kick the tires, as it were, of the API early on and make sure that we were getting things right by writing a simple client against it. Since the client is bound only to the hypermedia semantics of the API, it's incredibly resilient to change. Also, having the QA team rely on an HTML client ensures that all aspects of the API are hypertext-driven; if they weren't, they couldn't be tested and would be kicked back to development.
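
The translation step can be sketched in a few lines of Python. This is not the client described above, only a guess at the general shape of such a tool; the control format (`rel`, `method`, `href`, `fields`) and the relation names are hypothetical.

```python
from html import escape

# Hypothetical hypermedia controls, as parsed from an API response.
CONTROLS = [
    {"rel": "timeline", "method": "GET", "href": "/statuses", "fields": []},
    {"rel": "update-status", "method": "POST", "href": "/statuses",
     "fields": ["status"]},
]


def render_control(control):
    """Render a control as an HTML link (plain GET) or a form (all else)."""
    rel = escape(control["rel"], quote=True)
    href = escape(control["href"], quote=True)
    if control["method"] == "GET" and not control["fields"]:
        return f'<a rel="{rel}" href="{href}">{rel}</a>'
    inputs = "".join(
        f'<input type="text" name="{escape(name, quote=True)}">'
        for name in control["fields"]
    )
    # HTML forms only support GET and POST; this sketch maps any other
    # method to POST rather than tunnelling it.
    method = "get" if control["method"] == "GET" else "post"
    return (
        f'<form id="{rel}" action="{href}" method="{method}">'
        f'{inputs}<button type="submit">{rel}</button></form>'
    )


def render_page(controls):
    """Render every control on its own line."""
    return "\n".join(render_control(c) for c in controls)


print(render_page(CONTROLS))
```

Because the renderer only looks at the controls the server sends, any operation that is missing from a response simply never shows up in the page, which is what makes gaps in the hypertext visible to testers.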

It's also possible that defining the media types and link relations used by an API in a standardized, generalized fashion, as required by REST, encourages the developers of an application to think about the consistency and structural clarity of their services; I definitely feel like this is the case when I work on or with REST APIs. This higher level of thought about the API may come from the fact that, once the hypermedia controls have been defined, the technical details are pretty much out of the way and developers are left with determining the structure of the system they're building; it's classic separation of concerns, which has yielded great results for conscientious developers for decades. In addition to encouraging better API design, defining the hypermedia controls and link relations in a consistent, standardized fashion also improves the client developers' ability to make assumptions about the API, thus improving their productivity.

A solution built on top of a true REST API can be bound to the relations and semantics in a media type, whereas a solution built on top of a partially-RESTful API, as they are typically built, is bound to each individual service in the API and the (often undocumented and implied) relationships between those services. Michael Bleigh suggests in his article that it's "too much work for clients and library authors to perform complex aggregation and re-formulation of data" when they're built on a true REST API. Given the advantages of REST mentioned so far, it should be the architecture of choice for client or library developers who are concerned with building systems that are resilient to or tolerant of the common changes and challenges of a web-based system. Developers who would like the flexibility to bind to the semantic subset of an API that is appropriate to their client or library (general hypermedia, or deep semantic understanding) would also probably prefer a REST API.

Of course, there are also the often-mentioned reasons for using REST: scalability and maintainability. You can't work on a web application these days without having to build an API to drive iPhone, Android, mobile web, or some other client. Twitter's API has quite a few clients already (318 as of 6/21/2010). With so many clients likely to be built against a modern web API, it's important that one be written in a way that is as scalable and maintainable as possible, for the sake of both the clients and the system providing the services. The advantage of exposing services over the web is that the web, whose architecture REST describes, is already designed to be massively scalable and maintainable, and the better a web API is at playing by the rules of the web, the more it can take advantage of those properties.
