Monday, September 29, 2008

A new type of django relationship: Generic Intermediaries

Bloody hell, it's a second technical post in the space of a week. I was really bored last night (when I wrote most of it, as the publish date suggests); had seen both of the Family Guy episodes on FX several times before, and similarly I've seen Die Hard enough times for it to not really require another viewing. Now, if it had been in HD... anyway, the upshot was that out came OmniGraffle, before I knew it I'd created a diagram and then, well, a picture needs a thousand words of explanation. So, after the lozenge, here they are.

NB this stuff is also included in the django-slots wiki; I thought it would be sensible to post it somewhere that might have an audience, as well as this blog.


Generic intermediaries: relationships with characteristics

Introduction

This document describes the GenericIntermediary django model and IntermediaryKey, a key-like object. Together these two classes provide a mechanism for giving characteristics to relationships between models.

Existing relationships in django

fixed relationships

Django already provides relationships between models. These allow you to link single or multiple instances of models to one another. Their existence is reflected in the database schema behind those models, be it generated when using syncdb or defined explicitly with dmigrations. I'm calling these relationships fixed because the model on either side of the relationship is explicitly specified in the code.

generic relationships

The content types application (django.contrib.contenttypes) ships with django and is in INSTALLED_APPS by default. As well as providing a unique identifier to all model instances in your project through an app/model/id triplet, you also get the ability to specify a generic foreign key and/or generic relation. This lets you genericise one side of a foreign key relationship: that is, specify that your model can be attached to any other model. This relationship is specified by using two fields: a ForeignKey to ContentType, and a regular field used to store the ID of an instance of that type. As with the fixed relationships, therefore, this requires columns in your schema, to reflect the fact that the model is related to something else.
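As a rough, framework-free illustration of that idea (this is not Django's implementation), a generic reference can be thought of as a (type, id) pair that is resolved against a lookup table. Every name below (REGISTRY, save, Article, Photo, Comment) is hypothetical and exists only for the sketch.

```python
from dataclasses import dataclass

# Hypothetical in-memory "tables": type name -> {id: instance}.
# In Django, the ContentType table and the ORM play this role.
REGISTRY = {}


@dataclass
class Article:
    id: int
    title: str


@dataclass
class Photo:
    id: int
    caption: str


def save(instance):
    """Store an instance under its type name and id."""
    REGISTRY.setdefault(type(instance).__name__, {})[instance.id] = instance
    return instance


@dataclass
class Comment:
    """Can be attached to any stored object via a (type, id) pair."""
    text: str
    content_type: str  # stands in for the ForeignKey to ContentType
    object_id: int     # the regular field holding the instance ID

    @property
    def content_object(self):
        # Resolve the pair back to the actual instance.
        return REGISTRY[self.content_type][self.object_id]


article = save(Article(id=1, title="Generic intermediaries"))
photo = save(Photo(id=1, caption="OmniGraffle diagram"))

on_article = Comment("Nice post", "Article", article.id)
on_photo = Comment("Nice picture", "Photo", photo.id)
```

Note that both fields live on Comment itself, which is the point made above: the generic side still needs its own columns.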

Generic intermediaries

Generic intermediaries are a way of specifying that a relationship exists between two model types separately from the instances of those models. The relationship is then given characteristics through a new model, in which the fields containing the instance IDs are also stored. This model can then be used to create a mixin, a Manager-style object or Key-style object, to give new attributes to existing models without requiring schema changes. This is how django-slots is implemented.

Diagram

This is a diagram of how django-slots is implemented, including the slots_demo app which provides the Page and Style models.



Explanation

Page and Style are django models, implemented as normal, with whatever attributes they require.

Between them is GenericIntermediary. In concrete terms this is a model with just two attributes, each of them a ForeignKey on ContentType, and a unique_together constraint ensuring that only one relationship between two types -- in one direction -- can exist. The direction is important: as in the diagram, the two keys represent the models on the _left_ and _right_ hand sides. The left-hand model is the one which the right-hand types are _against_; in django-slots, therefore, Page is on the left.
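The "one relationship per ordered pair of types" rule can be sketched without Django. IntermediaryRegistry and DuplicateRelationship are hypothetical names standing in for the GenericIntermediary table and its uniqueness constraint; this is an illustration of the idea, not the django-slots code.

```python
class DuplicateRelationship(Exception):
    """Raised when a (left, right) type pair is declared twice."""


class IntermediaryRegistry:
    """Hypothetical stand-in for the GenericIntermediary table."""

    def __init__(self):
        self._pairs = set()

    def declare(self, left, right):
        # Mimics a uniqueness constraint on the ordered pair (left, right).
        pair = (left, right)
        if pair in self._pairs:
            raise DuplicateRelationship(pair)
        self._pairs.add(pair)
        return pair

    def exists(self, left, right):
        return (left, right) in self._pairs


registry = IntermediaryRegistry()
registry.declare("Page", "Style")  # Page on the left, Style on the right
```

Because the pair is ordered, a check like this treats (Style, Page) as a different relationship from (Page, Style), which is why the left/right direction matters.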

Slot is a django model which has a ForeignKey on GenericIntermediary. This is, in effect, a declaration that Slot implements characteristics of a relationship. Missing from the diagram (bolded to remind the author to remedy this!) are the attributes which contain the IDs of the instances which are related, that is, the ID of the Page object and that of the Style object.

Left at this, scheduling would be possible. You would create a slot like this:


# assume we have Page and Style objects called page
# and style respectively; we also have two datetime
# objects, start_time and end_time
from django.contrib.contenttypes.models import ContentType

cp = ContentType.objects.get_for_model(Page)
cs = ContentType.objects.get_for_model(Style)
gi = GenericIntermediary.objects.get(left=cp, right=cs)
slot = Slot(relationship=gi,
            against_object_id=page.id,
            slotted_object_id=style.id,
            start_time=start_time, end_time=end_time)
# persist the slot so it can be retrieved later
slot.save()


and retrieve it so:


# same assumptions as above; also same cp, cs,
# and gi assignments
import datetime

now = datetime.datetime.now()
# look for a slot that now falls inside, against
# our page: it must already have started
# (start_time <= now) and not yet ended
# (end_time >= now)
try:
    current_style_slot = Slot.objects.get(
        relationship=gi, start_time__lte=now,
        end_time__gte=now,
        against_object_id=page.id)
except Slot.DoesNotExist:
    current_style_slot = None
    current_style = None
else:
    current_style = cs.get_object_for_this_type(
        id=current_style_slot.slotted_object_id)


This is horribly verbose and inconvenient. It's also not required.
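For reference, the condition the lookup is expressing is that now falls inside the slot's window, i.e. start_time <= now <= end_time (in ORM terms, start_time__lte=now and end_time__gte=now). A minimal plain-Python sketch of that check follows; SlotRecord and current_slot are hypothetical names, and no database is involved.

```python
import datetime
from dataclasses import dataclass


@dataclass
class SlotRecord:
    """Hypothetical plain-Python stand-in for a Slot row."""
    against_object_id: int
    slotted_object_id: int
    start_time: datetime.datetime
    end_time: datetime.datetime


def current_slot(slots, against_object_id, now):
    """Return the first slot for the object whose window contains `now`."""
    for slot in slots:
        if (slot.against_object_id == against_object_id
                and slot.start_time <= now <= slot.end_time):
            return slot
    return None


slots = [
    SlotRecord(1, 10, datetime.datetime(2008, 9, 1),
               datetime.datetime(2008, 9, 30)),
    SlotRecord(1, 11, datetime.datetime(2008, 10, 1),
               datetime.datetime(2008, 10, 31)),
]
found = current_slot(slots, 1, datetime.datetime(2008, 9, 29, 12, 0))
```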

Intermediary keys

Also missing from the diagram above is IntermediaryKey. As the name suggests this is a key-like object which relates to the GenericIntermediary. Informed heavily by the GenericForeignKey API, IntermediaryKey works by specifying which two fields together point to the instances on either side of the relationship. The first argument denotes both the relationship field (the foreign key on GenericIntermediary) and the side of the relationship, using normal django key__attr syntax; attr will always be one of left or right.

By having an IntermediaryKey the model gets an attribute which, like the fixed relationships, returns the actual instance of the related model.

This is how Slot uses IntermediaryKey:


against = IntermediaryKey('relationship__left',
                          'against_object_id')
slotted = IntermediaryKey('relationship__right',
                          'slotted_object_id')


All this really gives us is the ability to use .against and .slotted as shortcuts to the instances of Page and Style in a relationship. The only improvement we can make to the previous examples is to shorten the current_style assignment:


current_style = current_style_slot.slotted


Still horrible, though.
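To make the key-like behaviour concrete, here is one way such an attribute could be built in plain Python, using a descriptor. This is a sketch under assumptions, not the django-slots implementation: IntermediaryKeySketch, Relationship and the dict-based objects stores are all hypothetical, and a real version would resolve ContentType rows through the ORM instead.

```python
class IntermediaryKeySketch:
    """Descriptor resolving 'relationship__left|right' plus an ID field
    to an instance. Hypothetical; not the django-slots implementation."""

    def __init__(self, path, id_field):
        # 'relationship__left' -> ('relationship', 'left')
        self.rel_field, self.side = path.split("__")
        if self.side not in ("left", "right"):
            raise ValueError("side must be 'left' or 'right'")
        self.id_field = id_field

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not an instance
        relationship = getattr(instance, self.rel_field)
        model = getattr(relationship, self.side)      # the related class
        object_id = getattr(instance, self.id_field)  # the stored ID
        return model.objects[object_id]


class Page:
    objects = {}  # hypothetical id -> instance store

    def __init__(self, id):
        self.id = id
        Page.objects[id] = self


class Style:
    objects = {}  # hypothetical id -> instance store

    def __init__(self, id):
        self.id = id
        Style.objects[id] = self


class Relationship:
    """Hypothetical stand-in for a GenericIntermediary row."""

    def __init__(self, left, right):
        self.left, self.right = left, right


class Slot:
    against = IntermediaryKeySketch("relationship__left", "against_object_id")
    slotted = IntermediaryKeySketch("relationship__right", "slotted_object_id")

    def __init__(self, relationship, against_object_id, slotted_object_id):
        self.relationship = relationship
        self.against_object_id = against_object_id
        self.slotted_object_id = slotted_object_id


page, style = Page(1), Style(7)
slot = Slot(Relationship(Page, Style), page.id, style.id)
```

With this in place, slot.against and slot.slotted return the page and style instances rather than their IDs.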

Usage by django-slots

All the verbosity can be reduced (to taste) by the implementation of a class to define characteristics of the relationships, and the use of techniques to attach these classes to existing models.

django-slots' Slot model/class is the first such relationship (because GenericIntermediary and IntermediaryKey were invented for this project!); ScheduleMixin is the technique which attaches them to existing models.

The introductory blog post explains at a high level what this means, in that it shows the API of django-slots. To fully understand the way to get from the above code to provision of attributes and methods, read up on mixin classes and see ScheduleMixin in models.py.
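For readers unfamiliar with mixins, the general shape of the technique is sketched below. This is not ScheduleMixin itself: ScheduleMixinSketch, its schedule and current methods, and the in-memory slot list are all hypothetical, chosen only to show how a mixin can hand scheduling behaviour to a class without that class declaring any relationship fields.

```python
import datetime


class ScheduleMixinSketch:
    """Hypothetical mixin; the real ScheduleMixin is in django-slots' models.py."""

    # Hypothetical store shared by every class using the mixin:
    # (against_id, slotted, start_time, end_time) tuples.
    _slots = []

    def schedule(self, slotted, start_time, end_time):
        """Record that `slotted` applies to this object for the given window."""
        self._slots.append((self.id, slotted, start_time, end_time))

    def current(self, now):
        """Return whatever is slotted against this object at `now`, if anything."""
        for against_id, slotted, start, end in self._slots:
            if against_id == self.id and start <= now <= end:
                return slotted
        return None


class Page(ScheduleMixinSketch):
    # Page itself declares nothing about styles or slots.
    def __init__(self, id):
        self.id = id


page = Page(1)
page.schedule("autumn-style",
              datetime.datetime(2008, 9, 1),
              datetime.datetime(2008, 9, 30))
```

Calling page.current(some_datetime) then replaces the verbose lookup with a single method call on the model instance.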

Conclusion

GenericIntermediary and IntermediaryKey are not replacements for fixed relationships or generic relationships. Instead they are a way of representing the fact that a relationship exists between two arbitrary classes separately from the instances of those classes in the relationship. This is useful where:

  • the relationship between two models has characteristics itself;
  • one model's relationship with another is not, or need not be, an attribute of either;
  • a model wants to declare which other models are related to it, rather than the other way round; or
  • there is a need for another model to key on your own, when you cannot change its schema (e.g. in 3rd party apps you don't want to fork).

The mixin technique currently employed by django-slots demonstrates the first three of these use cases:

  • the relationship has characteristics of its own: it holds between two times (start_time and end_time);
  • Style and Page are separate models with no explicit fixed relationships; and
  • Page declares that it would like Style to be attached to it; Style does not declare itself as tied to Page -- or anything at all.

other random thoughts

I don't believe time is the only characteristic that could use this technique, which is why I've written such verbose documentation. I'm struggling to come up with proper use cases for, say, geographic foreign keys (where instead of start_time and end_time you might declare a bounding box, or latlong + radius?), but I have a gut feeling it could be useful.
