Tuesday, November 18, 2008

REST via URI's and Body Representations

I've been thinking about REST URI's and their structure for a while now and been talking with my friend Josh about these.  As an exercise to try to codify my own views on this I have written up some of my thoughts on this topic. 

URI templates:
First order approach to REST is the URI used each day via GET
  • Can become book marks (thus IM'd, twitterd, and emailed to ones content)
  • Simple and easy to code to in a variety of languages and API's
  • Classic resource representation
  • Limited ability to pass multiple parameters
  • Limited ability to deal with more than one facet  (at least with the example template below, any template could be established, but then everyone calling needs to know it)
  • Require a template even to pass one parameter
  • Can easily start to try and carry to much info with all sorts of verbs and values

So by example a URI template might look like:

.../resource/facet   (where the facet is some value like ../lithology/sand)

Then a template could define something like

.../resource/facet/[match]   Given a single value after facet, try to match it (equals or substring?)
.../resource/facet/[/min/max]  Given two values, assume they are a numerical range to search for

Of course, the issues already start to build up.  In the single value case we could just try to match but if we want a sub-string match then we are assuming we are not in a numerical environment.  Or we are requiring the implementation of the service to check for primitive data types which may or may not work.  In the second case our min max has to be sure to address numeric primitives (easy to check ) but also we may want to address non-numeric ranges (January/March)

There is also an interesting use of language that can be done here.  Where resource (singular) requires an id to define a single resource.  Conversely, a plural resources then would indicate an additional facet (either plural or singular) to carry the template forward.

.../resources/sand   (all resources with sand)
.../resource/34/sand  (sand attribute (facet) of resource 34)

The use of singular and plural attributes in URI templates seems logical.  As a complete aside I have often wondered if a more Latin style grammar would allow more descriptive URI's to be formed since word order is less important and meaning is carried in word itself.  Totally impractical though of course. 

URI templates with representation in the body
So if our template is not enough, exploit the fact that the web architecture doesn't just pass URI's but additional elements in a request/response including headers and a body(representation).

So now our template (../resource/facet) might get a POST call with a payload of XML or JSON (pick your flavor I guess, how about Microformats in REST ).  That payload is free to define a wide variety of parameters and actions.

  • Far more flexible in terms of defining what we want to do with a resource facet
  • A response payload can define downstream (next) actions to take.  Thus giving us a kind of client moderated work flow (ie, 201 created or 303 see other next steps to take)
  • These workflows really can't be very complex since we are "waiting around" for the completion with our session.  Really only good for events which are "quick". *
  • The XM or JSON schema/language for the payload has to be agreed on before hand (though this is really true of the URI template too)

As an aspect of this the service might establish (via the 201 or 303 codes) a new URI that is a GET'able representation of the response of this (complete with etag and everything). The use of status codes (Ref:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html ) needs to be
integral part of any REST approach and REST clients need to realize
that if they are going to use the web architecture to get a resource
they must look for and deal with the status codes of that architecture.

However, there are issues with the service provider caching requests and responses:

  • How long does the service provider keep this?
  • A service may be called thousands of times by a single person in a sinle session, it doesn't scale (does it?) to keep and etag them all
  • Just let the web arch do the caching

Perhaps, one might allow this stored representation of the response to be the URI plus the POST'ed request and then if some does a GET on that URI representing this earlier POST request allow whatever web caching there is to address that scaling  Ie:

POST request to ../resource/facet 
response could be the actual data with a 201 created URI for future referencing.  That new URI, say (../resource/request/[id]) could then in the future be called via GET.  All the service provider is storing in that case is the URI plus the request payload, since the results would be rebuilt by the service engine (and then available for near term caching via existing web architecture).  Etags (hash sums) could be used to ensure requests are to a specific resource. 

One element I have been working to deal with in REST approaches is the whole "hypermedia as engine of application state".  It's a guiding principle and yet sometime hard to understand what exactly it is or if an approach is in compliance with it at times.

One approach would seem to be to say that all elements of a response must define down stream references as URI's or other (are there other?) hypermedia compliant references.  I am not sure if if an img tag or microformat or RDFa embedded content is compliant with this "application state" approach.  Also, how exactly does this impact the use use of XML or JSON.  While XML could be embedded into XHTML or similar effort done with microformats, does the use of JSON in the response to a REST URI mean that "hypermedia as engine of application state" is not even an option?  JSON embedded in XHTML or microformats perhaps might address this but then one argument the JSON proponents make is that they are tossing out the cruft.  One could also place the URI into the JSON as a value.  (Josh, I think you are looking into this..  I will be curious to read about)

Indeed it's also interesting to look at GRDDL and note how it approaches the walking and extraction of information (RDF) from documents (like XHTML) when looking at issues of REST. 

take care


*  With regard to only good for quick work flows..  one must agree that it is easy to send in an event that may take a long time to run and simply reqister a message to a queue (ATOM feed perhaps) that has the current status and the next event on status update to undertake.  This turns the burden of monitoring and moving messages through the queue to the client in this case, but it does allow for a disconnected work flow in compliance with web architectural approaches. 

No comments: