carterpage edited this page Aug 9, 2011 · 7 revisions

Yoga's REST Selectors

Inspired by LinkedIn’s JavaOne presentation on building flexible REST interfaces, Yoga is a framework for supporting REST-like URI requests with field selectors.

Yoga selectors allow you to:

  1. Combine multiple queries into a single "relational" query
  2. Make expensive data optional

You can make any existing REST API built in Java (for now) significantly faster and more powerful.

Yoga combines multiple queries into a single "relational" query

Let's look at the basic data model in the Yoga demo, which represents a simple music-based social network application, consisting of users, artists, albums, and songs:

[Figure: Yoga Demo]

Let's say I want to write an iPhone application that auto-generates buddy playlists by intersecting songs on my devices against songs by my friends' favorite artists.

Using a traditional RESTful approach, I would make five sets of queries (actually many more individual queries) to retrieve the data I'm interested in. (I could also write a custom method to handle this requirement, e.g. /friendSongs/1.json, but more on that later.) The sequence of steps is diagrammed here:

[Five figures: Yoga Demo, one per step]

Or in HTTP calls, it looks something like this:

GET /user/1.json (Get user)

GET /user/2.json (Get detailed friend entities)
GET /user/3.json
...

GET /artist/1.json (Get favorite artists)
GET /artist/2.json
...

GET /album/1.json (Get albums for artists)
GET /album/2.json
...

GET /song/1.json (Get songs for albums)
GET /song/2.json
...

Yoga lets you compact all these calls into a single query:

[Figure: Yoga Demo]

Or:

GET /user/1.json?selector=:(friends:(favoriteArtists:(albums:(songs))))
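
As a rough illustration, a client could assemble a nested selector like the one above from a list of field names. This is only a sketch: `SelectorBuilder` and `nest` are hypothetical names, not part of Yoga, and only the `:(a:(b))` syntax is taken from the example request.

```java
import java.util.List;

/** Hypothetical client-side helper (not part of Yoga) that builds a nested selector. */
final class SelectorBuilder {
    private SelectorBuilder() {}

    /**
     * Nests each field inside the previous one, so that
     * ["friends", "favoriteArtists"] becomes ":(friends:(favoriteArtists))".
     * An empty path yields an empty selector.
     */
    static String nest(List<String> path) {
        StringBuilder sb = new StringBuilder();
        // Open one level per field.
        for (String field : path) {
            sb.append(":(").append(field);
        }
        // Close every level that was opened.
        for (int i = 0; i < path.size(); i++) {
            sb.append(')');
        }
        return sb.toString();
    }
}
```

Calling `SelectorBuilder.nest` with `friends`, `favoriteArtists`, `albums`, `songs` produces the selector value used in the request above. A real client would probably also need to URL-encode the value, depending on the HTTP library in use.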

WHY?

  1. Faster: In a typical system, more time is spent establishing each network connection and moving the data across the network than formulating the response on the server. Fewer requests == more speed. On mobile or slow connections, the gain is even larger.
  2. Fewer sockets: Each connection requires a dedicated socket from the server, which is a finite resource. Climbing through the object graph one call at a time increases the total cost of serving the request. Fewer sockets == lower cost. If a user has 10 friends, each with 3 favorite artists, who have each recorded five 10-song albums, then retrieving every entity with its own RESTful request would take 1,500 requests for the songs alone (1,691 in total, assuming no overlap between friends' favorites). How do you get around that? Write custom code, or extend the API with Yoga, which requires one request.
  3. Simpler client code: One request with one response is much easier to navigate in your code than multiple nested requests. It could use a little further prettying, but you can see in the code for our demo that the interaction piece of the Yoga client code is half as long as that of the traditional approach. Simpler code == Happier client developers.
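
The request count in point 2 can be checked with a short calculation. The sketch below assumes one GET per entity and no overlap between friends' favorite artists. `RequestCount` is a hypothetical name used only for this illustration. Under those assumptions the song level alone needs 1,500 requests, and the whole traversal, including the user, friends, artists, and albums, needs 1,691.

```java
/** Hypothetical helper that counts per-entity requests for a level-by-level traversal. */
final class RequestCount {
    private RequestCount() {}

    /** Number of entities at the deepest level: the product of all fan-outs. */
    static long leafRequests(int[] fanOut) {
        long atLevel = 1;
        for (int children : fanOut) {
            atLevel *= children;
        }
        return atLevel;
    }

    /**
     * Total requests when every entity is fetched individually.
     * fanOut[i] is the number of children each entity at level i has.
     */
    static long totalRequests(int[] fanOut) {
        long total = 1;   // the root entity (the user)
        long atLevel = 1; // entities at the current level
        for (int children : fanOut) {
            atLevel *= children;
            total += atLevel;
        }
        return total;
    }
}
```

With fan-outs of 10 friends, 3 artists, 5 albums, and 10 songs, `leafRequests` gives 1,500 and `totalRequests` gives 1,691.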

WHY NOT... create a custom query?

A custom query is the standard approach for this sort of problem when the combined latency of navigating entities one request at a time is too high, as it can be in this example.

We discovered a few problems with this approach in practice:

  1. Development velocity hit: You slow down the client development by adding a dependency on the server side every time there is a change. If the server developers don't have the bandwidth, you have an even bigger problem.
  2. Less iterative: Requirements change. A lot. When developing a product, it makes sense to try out different things: add an element here, remove or modify an element there. Without a flexible API, this forces you back to more waterfall-y types of thinking.
  3. Backwards compatibility: Each custom solution needs to be supported either forever, or until you can confirm it's not being used. Either way, a long time, and a lot of legacy code.

Yoga makes expensive data optional

We already looked at one type of expensive data: grabbing the nested children of an object. But sometimes individual fields are expensive too, because of a calculation or a complicated underlying query. In the long run you want to optimize those queries to make them as cheap as possible, but in the short run you can release the functionality by limiting its impact to clients that explicitly request it.

Let's say, hypothetically, that we want to add a field to a user that calculates an album recommendation on the fly, which could be more data intensive than a simple field retrieval. Eventually you want a scheduled batch process to pre-calculate it for you, but maybe you don't yet know whether you will need or want it in the long run. In the short term, you can use Yoga to hide the functionality so that a client has to explicitly request it. That keeps the impact low enough that you can release it and test it out before building a whole bunch of optimization processes around it.

In this case, a request to include an optional field in the return object looks something like this:

GET /user/1.json?selector=:(recommendedAlbum)

A standard request without the selector would not return the album.
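
To make the idea concrete, here is a minimal sketch of opt-in fields. It is not Yoga's implementation: `OptionalFieldRenderer` and its `core`, `optional`, and `render` methods are hypothetical names chosen for illustration. The point is only that the expensive value is computed lazily, and only when the selector names it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical sketch (not Yoga's implementation) of fields a client must opt in to. */
final class OptionalFieldRenderer {
    private final Map<String, Supplier<Object>> coreFields = new LinkedHashMap<>();
    private final Map<String, Supplier<Object>> optionalFields = new LinkedHashMap<>();

    /** Registers a cheap field that every response includes. */
    OptionalFieldRenderer core(String name, Supplier<Object> value) {
        coreFields.put(name, value);
        return this;
    }

    /** Registers an expensive field that is included only on request. */
    OptionalFieldRenderer optional(String name, Supplier<Object> value) {
        optionalFields.put(name, value);
        return this;
    }

    /** Core fields are always rendered; optional ones only when selected. */
    Map<String, Object> render(Set<String> selected) {
        Map<String, Object> out = new LinkedHashMap<>();
        coreFields.forEach((name, value) -> out.put(name, value.get()));
        optionalFields.forEach((name, value) -> {
            if (selected.contains(name)) {
                // The expensive supplier runs only for explicit requests.
                out.put(name, value.get());
            }
        });
        return out;
    }
}
```

In this sketch, rendering with an empty selection never invokes the `recommendedAlbum` supplier, so clients that do not ask for the field pay nothing for it.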