Tuesday, November 18, 2008

REST via URI's and Body Representations

I've been thinking about REST URI's and their structure for a while now and been talking with my friend Josh about these.  As an exercise to try to codify my own views on this I have written up some of my thoughts on this topic. 

URI templates:
First order approach to REST is the URI used each day via GET
 
Good:
  • Can become book marks (thus IM'd, twitterd, and emailed to ones content)
  • Simple and easy to code to in a variety of languages and API's
  • Classic resource representation
Bad:
  • Limited ability to pass multiple parameters
  • Limited ability to deal with more than one facet  (at least with the example template below, any template could be established, but then everyone calling needs to know it)
  • Require a template even to pass one parameter
  • Can easily start to try and carry to much info with all sorts of verbs and values

So by example a URI template might look like:

.../resource/facet   (where the facet is some value like ../lithology/sand)

Then a template could define something like

.../resource/facet/[match]   Given a single value after facet, try to match it (equals or substring?)
.../resource/facet/[/min/max]  Given two values, assume they are a numerical range to search for

Of course, the issues already start to build up.  In the single value case we could just try to match but if we want a sub-string match then we are assuming we are not in a numerical environment.  Or we are requiring the implementation of the service to check for primitive data types which may or may not work.  In the second case our min max has to be sure to address numeric primitives (easy to check ) but also we may want to address non-numeric ranges (January/March)

There is also an interesting use of language that can be done here.  Where resource (singular) requires an id to define a single resource.  Conversely, a plural resources then would indicate an additional facet (either plural or singular) to carry the template forward.

So:
.../resources/sand   (all resources with sand)
.../resource/34/sand  (sand attribute (facet) of resource 34)

The use of singular and plural attributes in URI templates seems logical.  As a complete aside I have often wondered if a more Latin style grammar would allow more descriptive URI's to be formed since word order is less important and meaning is carried in word itself.  Totally impractical though of course. 

URI templates with representation in the body
So if our template is not enough, exploit the fact that the web architecture doesn't just pass URI's but additional elements in a request/response including headers and a body(representation).

So now our template (../resource/facet) might get a POST call with a payload of XML or JSON (pick your flavor I guess, how about Microformats in REST ).  That payload is free to define a wide variety of parameters and actions.

Good:
  • Far more flexible in terms of defining what we want to do with a resource facet
  • A response payload can define downstream (next) actions to take.  Thus giving us a kind of client moderated work flow (ie, 201 created or 303 see other next steps to take)
Bad
  • These workflows really can't be very complex since we are "waiting around" for the completion with our session.  Really only good for events which are "quick". *
  • The XM or JSON schema/language for the payload has to be agreed on before hand (though this is really true of the URI template too)

As an aspect of this the service might establish (via the 201 or 303 codes) a new URI that is a GET'able representation of the response of this (complete with etag and everything). The use of status codes (Ref:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html ) needs to be
integral part of any REST approach and REST clients need to realize
that if they are going to use the web architecture to get a resource
they must look for and deal with the status codes of that architecture.


However, there are issues with the service provider caching requests and responses:

  • How long does the service provider keep this?
  • A service may be called thousands of times by a single person in a sinle session, it doesn't scale (does it?) to keep and etag them all
  • Just let the web arch do the caching

Perhaps, one might allow this stored representation of the response to be the URI plus the POST'ed request and then if some does a GET on that URI representing this earlier POST request allow whatever web caching there is to address that scaling  Ie:

POST request to ../resource/facet 
response could be the actual data with a 201 created URI for future referencing.  That new URI, say (../resource/request/[id]) could then in the future be called via GET.  All the service provider is storing in that case is the URI plus the request payload, since the results would be rebuilt by the service engine (and then available for near term caching via existing web architecture).  Etags (hash sums) could be used to ensure requests are to a specific resource. 

One element I have been working to deal with in REST approaches is the whole "hypermedia as engine of application state".  It's a guiding principle and yet sometime hard to understand what exactly it is or if an approach is in compliance with it at times.

One approach would seem to be to say that all elements of a response must define down stream references as URI's or other (are there other?) hypermedia compliant references.  I am not sure if if an img tag or microformat or RDFa embedded content is compliant with this "application state" approach.  Also, how exactly does this impact the use use of XML or JSON.  While XML could be embedded into XHTML or similar effort done with microformats, does the use of JSON in the response to a REST URI mean that "hypermedia as engine of application state" is not even an option?  JSON embedded in XHTML or microformats perhaps might address this but then one argument the JSON proponents make is that they are tossing out the cruft.  One could also place the URI into the JSON as a value.  (Josh, I think you are looking into this..  I will be curious to read about)

Indeed it's also interesting to look at GRDDL and note how it approaches the walking and extraction of information (RDF) from documents (like XHTML) when looking at issues of REST. 

take care
Doug

Ref:
http://josh-in-antarctica.blogspot.com/2008/10/restful-query-urls.html
http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
http://oreilly.com/catalog/9780596529260/
http://www.infoq.com/articles/tilkov-rest-doubts
http://www.infoq.com/articles/mark-baker-hypermedia
http://microformats.org/wiki/rest/json#Proposals
http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
http://www.w3.org/2004/01/rdxh/spec

*  With regard to only good for quick work flows..  one must agree that it is easy to send in an event that may take a long time to run and simply reqister a message to a queue (ATOM feed perhaps) that has the current status and the next event on status update to undertake.  This turns the burden of monitoring and moving messages through the queue to the client in this case, but it does allow for a disconnected work flow in compliance with web architectural approaches. 

Thursday, November 06, 2008

JSR-311 and JPA with Intellij 8

IntelliJIDEA 8 recently came out and I wanted to take the time to try out a couple of interesting technologies as a means to test out the new version.  Specifically JSR-311 (Via Jersey ) and Java Persistence API (JPA) via Hibernate.  I created a simple application using these two technologies.   The data store I used in this was a PostgreSQL database with lithological data.  Note I actually did this with version 8.0 M1 of IntelliJ (build 8664) and I will go on the assumption nothing changed in this regard to the 8.0 final annouced just this last day or so. 

The first step was to register up the database as a datasource.  Under the TOOLS menu is the DATA SOURCE... menu item.  Selecting this allow us to register our database as a source through the dialog below:


Once this is done we can go ahead and create out new project.  You wouldn't have to register a data source first, but for sake of a canonical "10 minute demo" the net seems to love it makes life a bit easier as we are about to see.

Go ahead and start a new project with FILE -> NEW PROJECT.  Use the default "Create project from scratch" and name your project.  We will be building a "Java Module", the default option when creating a project.  Pick the default option for the src directory and when you get the "New Project" window selections options like in figure 2 below:


I did NOT select "Hibernate" at this stage, though I did select it under the pull down for the "JavaEE Persistence" option.  Also, under "Web Application" -> "WebServices" select the "Jersey" option.

The project will start to be fleshed out and you will be prompted to import the database schema (assuming you selected the "Import database schema" option).  Now select the source you created via the "Data Source Properties" dialog.  For me this was "Lith".   This database is just a test database for now so it consists of only a few tables with no defined relations at this time.  For purposes of this simple demo that should be fine. 

Go ahead and select a datasource and define a package name for the entities to be generated into. 



You will be prompted that the OR mapping is about to be generated, go ahead and let it start. Depending on your database and your machine this will take a minute or so. 

Once this was generated I was confronted with an Error that the org.hibernate.ejb.HibernatePersistence class/package could not be resolve for its reference in the persistence.xml file that was generated for us by this process.  Long story short I went to http://www.hibernate.org/6.html and downloaded the annotations, entitymanager and core (distribution) packages from here.  

Setting up the libraries (as is often the case) was the only real tedious part and in the end the library collection looked like the following figure:
NOTE:   I have to wonder if I would have selected "Hibernate" in the "New Project" dialog above (leaving the import and class generation option unchecked) if Intellij would have imported the Needed Hibernate libraries for me.
NOTE 2:  Don't forget to add in your database driver too  (not that I did that or anything)   



Parallel to all this JPA/Hibernate stuff Intellij has created a simple JSR-311 (Jersey) class for us with the following default structure:

package example; 

import com.sun.net.httpserver.HttpServer;
import com.sun.jersey.api.container.httpserver.HttpServerFactory;
import java.io.IOException;

import javax.ws.rs.GET;
import javax.ws.rs.ProduceMime;
import javax.ws.rs.Path;

// The Java class will be hosted at the URI path "/helloworld"
@Path("/helloworld")
public class HelloWorld {
// The Java method will process HTTP GET requests
@GET
// The Java method will produce content identified by the MIME Media type "text/plain"
@ProduceMime("text/plain")
public String getClichedMessage() {
// Return some cliched textual content
return "Hello World";
}

public static void main(String[] args) throws IOException {
HttpServer server = HttpServerFactory.create("http://localhost:9998/");
server.start();

System.out.println("Server running");
System.out.println("Visit: http://localhost:9998/helloworld");
System.out.println("Hit return to stop...");
System.in.read();
System.out.println("Stopping server");
server.stop(0);
System.out.println("Server stopped");
}
}

I wont bother to break down the structure or go into the various annotations used in JSR-311.  You can Google up quite a bit of material on all that of far higher quality than I could produce.  Starting at the Jersey site (https://jersey.dev.java.net/ ) is as good a place as any as I think it's likely the most evolved JSR-311 implementation at this time. 

For simplicity we will leave the main() method alone and modify a few other elements in this class.  First I changed the class level @Path annotation to:

@Path("/lith);

and also added in a create and close method for the JPA EntityManagerFactory.  For fun I modded the getClicedMessage to parse out the URI path sent to it via a:

@Path("location/{latlong}")  annotaion along with a @PathParam("latlong") String latlong annotation.  The later requires the javax.ws.rs.PathParam import. 

So our final code looks like this  (interesting parts in bold):

package example;

import com.sun.jersey.api.container.httpserver.HttpServerFactory;
import com.sun.net.httpserver.HttpServer;

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;
import javax.persistence.Query;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.ProduceMime;
import java.util.List;

@Path("/lith")
public class HelloWorld {

    private EntityManagerFactory emf = null;

    protected void createEMF() {
        emf = Persistence.createEntityManagerFactory("NewPeristenceUnit");
    }

    protected void closeEMF() throws Exception {
        emf.close();
    }


    public String doQuery(String latlong) {
        createEMF();
        StringBuffer sb = new StringBuffer();
        EntityManager em = emf.createEntityManager();
        Query query = em.createQuery("select c."+latlong+" from CmpEntity c");
        List<Double> list = query.getResultList();
        for (Double c : list) {
             sb.append(latlong + ": " + c.toString() + "\n");
        }
        try {
            closeEMF();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return sb.toString();
    }

    @GET
    @Path("location/{latlong}")
    @ProduceMime("text/plain")
    public String getClichedMessage(@PathParam("latlong") String latlong) {
        return doQuery(latlong);
    }


    public static void main(String[] args) throws Exception {

        HttpServer server = HttpServerFactory.create("http://localhost:9998/");
        server.start();

        System.out.println("Server running");
        System.out.println("Visit: http://localhost:9998/helloworld");
        System.out.println("Hit return to stop...");
        System.in.read();
        System.out.println("Stopping server");
        server.stop(0);
        System.out.println("Server stopped");
    }
}

A few things to note:
  • The List<Double> list = query.getResultList();  is not good(tm) as we have an unchecked assignment of java.util.List to java.util.List<java.lang.Double>    Better would be to use something like List<CmpEntity> in this case and alter our query to something like Query query = em.createQuery("select c from CmpEntity c");  However, I ran into some issues with return in my entity class being null.  Perhaps I needed to allow certain columns to be null, I am not certain of that, but I was more interested in the path than resolving the query.  Likely I will resolve this issue as the next step in this experiment. 
  • Parallel to the aboe the point my List<Double> work since I know that the return from the query is a Double
  • I altered things a bit in the above example so that I could use both /lith/location/longitude and /lith/location/latitude as calling URI's.   This works of course since I know ahead of time what my column names were that I wanted to use for this test and that they were of type Double to address the above two points.  This is whay I use the @Path and @PathParam annotations in the getClichedMessage method and strip out and pass along elements of the URI. 
  • Like you would NOT want (ie DO NOT WANT) to use the column names in your mapped data source in your REST URI.  You would establish your URI template to whatever degree you wished and then use a more logical approach to generating the structure and content of your services and their replies.  This post is talking about the pipeline and is not worried about the very important architectural issue of mapping resources to URI's.  Take a look at my friends Josh's post on that topic and the nice InfoQ How to GET a cup of Coffee posting about all that.

One last important point in all this.  I got an error: No identifier specified for entity.  When I looked at my generated entity classes all the columns recieved a @Basic and @Column(...) annotation.  At least one however needs to have a @Id annotation.  Reference the Hibernate FAQ at http://www.hibernate.org/329.html#A6 for more info.  I had a unique integer column I could do this to and so I alterd my effected entity classes with @Id @Column(...) on one of the methods.  You, however, should not see this as I suspect it was due to the fact I never botherd to set a unique key column in that table on my database (don't tell my DBA). 

That's about it.  At this point you should be able to run the project and make calls and get data out.  The built in REST client for IntelliJIDEA 8 is quite nice though perhaps not as advanced as the nice rest-client project.  In my case I parsed out the URI just for the sake of some fun.  With all the elements connected one can move on to more interesting "real world" applications of the JSR-311 and JPA API's used in this simple demo.