2_gaedo_and_collection_storage

Riduidel edited this page Jul 27, 2012 · 2 revisions

Since its beginning, object-oriented programming has suffered from an impedance mismatch with databases. Google App Engine is no exception. Although the datastore can store any kind of data without complaining about missing columns, storing an object that contains a collection of other objects in a sane fashion turns out to be a rather complicated task.

An example

Take as an example the following pair of classes:

public class User {
    @Id private long id;
    private String login;
    private Collection<Post> posts = new LinkedList<Post>();
}

public class Post {
    @Id private long id;
    private String text;
    private User author;
}

There is more than one possible mapping strategy for this collection.

Mapping of simple fields

But first, let me briefly explain how gaedo maps a Java object to an entity. This is rather simple: the field annotated with @Id (which must be a long) is used as a placeholder for the object's Key (its kind is associated with the object class in the service, and its name with the real class), and the other fields are mapped to entity properties. As an example, the Post class is associated with an entity which looks like this:

| Key | Post.text | Post.author |
| --- | --- | --- |
| Post={id} | {text} | User={author.id} |

As you may have noticed, a domain object is always replaced by its key when it is used as a field of another object or as an element of a collection. This lets us escape the dreaded issue of JPA/JDO, which requires users to put GAE keys in their model when establishing relations between objects.
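To make this mapping concrete, here is a minimal sketch of the idea. It is not gaedo's actual code: a plain Map stands in for a datastore entity, the "Kind=id" strings stand in for datastore keys (as in the table above), and the names SimpleFieldMappingSketch, key and toEntity are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only (assumption): a Map stands in for a datastore entity,
// and "Kind=id" strings stand in for datastore keys.
class SimpleFieldMappingSketch {

    static class User {
        final long id;
        final String login;

        User(long id, String login) {
            this.id = id;
            this.login = login;
        }
    }

    static class Post {
        final long id;
        final String text;
        final User author;

        Post(long id, String text, User author) {
            this.id = id;
            this.text = text;
            this.author = author;
        }
    }

    // Hypothetical helper: renders a key as "Kind=id".
    static String key(String kind, long id) {
        return kind + "=" + id;
    }

    // Maps a Post to an entity-like structure.
    static Map<String, Object> toEntity(Post post) {
        Map<String, Object> entity = new LinkedHashMap<>();
        entity.put("Key", key("Post", post.id));
        entity.put("Post.text", post.text);
        // The author is not embedded: only its key is stored.
        entity.put("Post.author", key("User", post.author.id));
        return entity;
    }

    public static void main(String[] args) {
        User toto = new User(1, "toto");
        Post post = new Post(1, "A", toto);
        System.out.println(toEntity(post));
    }
}
```

The point to notice is the last property: the Post entity holds a reference to the User's key, never a copy of the User itself.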

However, when it comes to collections, the issue is a little more complex.

Mapping collection with dynamic properties

The first solution we used was to create one property for each entry. As an example, if a User named toto has written posts with texts "A" and "B", we would have the following entities in the datastore (for the sake of this example, I removed the user from the Post, to better emphasize the search issue).

The two posts

| Key | Post.text |
| --- | --- |
| Post=1 | A |
| Post=2 | B |

And the User

| Key | User.login | User.posts.count | User.posts.0 | User.posts.1 |
| --- | --- | --- | --- | --- |
| User=1 | toto | 2 | Post=1 | Post=2 |

This was a simple storage solution, with one HUGE drawback: how can I find, for example, the author of Post 1? Take a look at GQL, and you'll find that a query requires a property name. So, do we have to look in the column named posts.0? Or in posts.1? Or in both? And how will we handle the case of a user who wrote 3 posts? Clearly, this doesn't scale. As a consequence, we preferred (after having coded this alpha-level prototype) to use the sub-entity mechanism.
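The search problem can be illustrated with a small in-memory sketch. This is an assumption-laden illustration, not gaedo's prototype: Maps stand in for entities, "Kind=id" strings stand in for keys, and DynamicPropertiesSketch, toEntity and findAuthors are hypothetical names. A real datastore query cannot loop like this; the loop only shows how many distinct property names would have to be queried.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only (assumption): Maps stand in for datastore entities.
class DynamicPropertiesSketch {

    // Flattens a user's posts into a count plus one property per index.
    static Map<String, Object> toEntity(long id, String login, List<String> postKeys) {
        Map<String, Object> entity = new LinkedHashMap<>();
        entity.put("Key", "User=" + id);
        entity.put("User.login", login);
        entity.put("User.posts.count", postKeys.size());
        for (int i = 0; i < postKeys.size(); i++) {
            entity.put("User.posts." + i, postKeys.get(i));
        }
        return entity;
    }

    // Finding an author means probing posts.0, posts.1, ... in turn:
    // there is no single property name to query on.
    static List<String> findAuthors(List<Map<String, Object>> users, String postKey) {
        List<String> authors = new ArrayList<>();
        for (Map<String, Object> user : users) {
            int count = (Integer) user.get("User.posts.count");
            for (int i = 0; i < count; i++) {
                if (postKey.equals(user.get("User.posts." + i))) {
                    authors.add((String) user.get("Key"));
                    break;
                }
            }
        }
        return authors;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> users = new ArrayList<>();
        users.add(toEntity(1, "toto", Arrays.asList("Post=1", "Post=2")));
        System.out.println(findAuthors(users, "Post=1"));
    }
}
```

The number of property names to probe grows with the largest collection ever stored, which is exactly why this layout does not fit a query language that expects a fixed property name.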

Mapping collections with subentities

Indeed, the Google datastore allows keys to be hierarchical (see KeyFactory.createKey(Key parent, java.lang.String kind, long id)). As a consequence, we create a sub-entity group for each collection of the object, and our example becomes (with the Post pair unchanged):

| Key | User.login |
| --- | --- |
| User=1 | toto |

| Key | User.post.value |
| --- | --- |
| User=1/posts.0 | Post=1 |
| User=1/posts.1 | Post=2 |

Using this strategy, finding the author of a post simply consists of querying on the User.post.value property name. Obviously, it increases the complexity of the data structure, since collections are now stored outside of objects. However, it seems clearer to me, and has the added benefit of allowing easy storage of maps, which the previous model didn't allow.
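The same example can be sketched with sub-entities. Again, this is only an in-memory illustration under stated assumptions, not gaedo's implementation: Maps stand in for entities, "Parent/child" strings stand in for hierarchical keys, and SubEntitySketch, toEntities and findAuthors are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only (assumption): Maps stand in for datastore entities,
// and "User=1/posts.0" strings stand in for hierarchical keys.
class SubEntitySketch {

    // Builds one entity for the user, plus one child entity per collection element.
    static List<Map<String, Object>> toEntities(long id, String login, List<String> postKeys) {
        List<Map<String, Object>> entities = new ArrayList<>();
        String userKey = "User=" + id;

        Map<String, Object> user = new LinkedHashMap<>();
        user.put("Key", userKey);
        user.put("User.login", login);
        entities.add(user);

        for (int i = 0; i < postKeys.size(); i++) {
            Map<String, Object> child = new LinkedHashMap<>();
            // The child key is built under the parent key.
            child.put("Key", userKey + "/posts." + i);
            // Every element uses the same property name, whatever its index.
            child.put("User.post.value", postKeys.get(i));
            entities.add(child);
        }
        return entities;
    }

    // A single, fixed property name is enough to find the matching children;
    // the author is then the parent part of each child key.
    static List<String> findAuthors(List<Map<String, Object>> entities, String postKey) {
        List<String> authors = new ArrayList<>();
        for (Map<String, Object> entity : entities) {
            if (postKey.equals(entity.get("User.post.value"))) {
                String key = (String) entity.get("Key");
                authors.add(key.substring(0, key.indexOf('/')));
            }
        }
        return authors;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> entities = toEntities(1, "toto", Arrays.asList("Post=1", "Post=2"));
        System.out.println(findAuthors(entities, "Post=1"));
    }
}
```

Whatever the size of the collection, the lookup filters on one property name only. A user with 3 posts just gets a third child entity, with no new property name to query.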