Saturday, 1 February 2014

Pagination with App Engine cursors in django-nonrel

First published on adamalton.co.uk on 21st August 2011.

A much more extensive version of this post is available on the Potato London website.

The stack:

  • django (non-rel)
  • Google AppEngine
  • djangoappengine

The task:

  • Using Datastore cursors to paginate through a metric bucket load of objects.
Note: if you're using the webapp framework rather than Django with djangoappengine then this blog post is not for you.

So you've got a model with thousands of objects, and you want to display them on a page a few at a time with some lovely pagination. Django's standard paginator (django.core.paginator) is a bit dumb, in that it fetches all of the objects and then paginates through them. Other than defeating half the point of paginating, this means that it doesn't really work on App Engine, because by default MyModel.objects.all() will only return the first 1000 objects.

So we need to write some custom pagination code.  On AppEngine, MyModel.objects.all()[1001:2000] will work, and will correctly return you objects 1001 to 2000, so we could just use that.  But the beauty of AppEngine is the way that it scales, so we should think big.  When you get to MyModel.objects.all()[100000:100500] the Datastore is going to have to trawl its way through the first 100000 Entities, and then give you the next 500, and that is going to be kind of slow.  This is where Datastore cursors come in.

What are cursors?

A Datastore cursor is just a marker to a starting point in a query. It's like giving an OFFSET to a SQL query, but much more efficient, because it allows the Datastore to jump straight to the starting point instead of having to trawl its way through the first <OFFSET> objects before it starts.
Remember: cursors are an AppEngine Datastore thing.  Models are a Django thing.  But djangoappengine has glued the two together for us with 2 functions:
  • get_cursor - give it a Django queryset and it returns the Datastore cursor which marks the position of the end of that query.
  • set_cursor - give it a Django queryset and a cursor and it returns a new queryset which will return its results starting at the object marked by the cursor.
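
To make the difference between an offset and a cursor concrete, here is a toy, in-memory sketch. It is not the Datastore API: ToyIndex, fetch_with_offset and fetch_with_cursor are made-up names, and the "cursor" here is simply the last key returned. It only illustrates why jumping to a marker is cheaper than walking past the first <OFFSET> entries.

```python
import bisect


class ToyIndex(object):
    """Toy stand-in for a sorted Datastore index (hypothetical, for illustration only)."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.rows_scanned = 0  # how many entries the last fetch had to touch

    def fetch_with_offset(self, offset, limit):
        """Walk past `offset` entries one at a time, then collect `limit` keys."""
        self.rows_scanned = 0
        results = []
        for position, key in enumerate(self.keys):
            self.rows_scanned += 1
            if position < offset:
                continue  # skipped, but we still had to visit it
            results.append(key)
            if len(results) == limit:
                break
        return results

    def fetch_with_cursor(self, cursor, limit):
        """Jump to just after `cursor` (the last key already seen), then collect `limit` keys."""
        start = 0 if cursor is None else bisect.bisect_right(self.keys, cursor)
        results = self.keys[start:start + limit]
        self.rows_scanned = len(results)
        next_cursor = results[-1] if results else cursor
        return results, next_cursor


index = ToyIndex(range(1000))

by_offset = index.fetch_with_offset(500, 10)
scanned_by_offset = index.rows_scanned

by_cursor, next_cursor = index.fetch_with_cursor(499, 10)
scanned_by_cursor = index.rows_scanned
```

Both calls return the same ten keys, but the offset version has to visit every entry before them, while the cursor version only touches the ones it returns.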

Let's write some code


#views.py
from django.shortcuts import render_to_response
from djangoappengine.db.utils import get_cursor, set_cursor

from myapp.models import MyModel #'myapp' is a placeholder, import your own model here

def my_page(request):
    """ View some objects. """
    results_per_page = 100
    queryset = MyModel.objects.all()
    #Note, it hasn't called the datastore yet as querysets are lazy
    cursor = request.GET.get('cursor')
    if cursor:
        queryset = set_cursor(queryset, cursor)
    results = queryset[0:results_per_page] #starts at the position marked by the cursor
    cursor_for_next_page = get_cursor(results)
    template_vars = {
        "results": results,
        "cursor_for_next_page": cursor_for_next_page,
    }
    return render_to_response("template.html", template_vars)

#template.html
<h1>Page 1</h1>
<a href="/myview/?cursor={{cursor_for_next_page}}">Next page</a>
{% for object in results %}
    {{object}} {# whatever you're displaying #}
{% endfor %}

This is fine when you just want a 'next' link, but what if you want to be able to skip to page 5 or page 10? Then it gets a bit more tricky.
It's time for me to eat some dinner, but if this post gets some comments then I'll expand on this stuff a bit more and reveal ways of making pagination links to other pages (including the previous page).
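
In the meantime, here is one possible way to get a 'previous' link. It is only a sketch, and not necessarily the approach taken in the longer Potato post: each link carries a trail of the cursors for the pages already visited. The pagination_links helper and the 'trail' query parameter are hypothetical names, and it doesn't solve skipping ahead to page 5.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2


def pagination_links(base_url, trail, current_cursor, next_cursor):
    """Build (previous_url, next_url) for the current page.

    trail:          cursors of the earlier pages, oldest first (page 1 has no cursor)
    current_cursor: cursor used to load this page, or None on page 1
    next_cursor:    cursor marking the end of this page, or None if there is no next page
    """
    def url(cursor, new_trail):
        params = []
        if cursor:
            params.append(("cursor", cursor))
        if new_trail:
            params.append(("trail", ",".join(new_trail)))
        return base_url + ("?" + urlencode(params) if params else "")

    previous_url = None
    if current_cursor:
        if trail:
            # Go back to the last cursor on the trail, and shorten the trail by one
            previous_url = url(trail[-1], trail[:-1])
        else:
            # We're on page 2, so the previous page is the cursor-less first page
            previous_url = url(None, [])

    next_url = None
    if next_cursor:
        next_trail = trail + ([current_cursor] if current_cursor else [])
        next_url = url(next_cursor, next_trail)

    return previous_url, next_url


page_1 = pagination_links("/myview/", [], None, "c1")
page_2 = pagination_links("/myview/", [], "c1", "c2")
page_3 = pagination_links("/myview/", ["c1"], "c2", "c3")
```

In the view you would read the trail back with something like request.GET.get('trail', '').split(','), dropping empty strings. The obvious downside is that the URL gets longer with every page you visit.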



Custom Django Template Tags in Google App Engine

First published on adamalton.co.uk 24th May 2010.

Google App Engine uses the Django templating system, but its own AppEngine-ified version of it. This means that if you want to write your own custom template tags you have to do a bit of jiggery pokery in order to bring the required bits of Django and App Engine into harmony with one another. I Googled for how to do this, and found very little, so now that I've done it, here's how...

There are 2 different varieties of the Django templating framework floating around in App Engine, and in order to write a custom template tag you need to use bits of both of them. First up, the AppEngine-ified version, which is the one that we'll need to register the template tag with:

from google.appengine.ext.webapp import template

And secondly, the Django one, which is what we'll use to build our template tag:

from django import template as django_template

Make sure you import them as different names, otherwise you'll be screwed. Next up, write your custom template tag, making sure to use things from django_template (or whatever you imported it as) where necessary:

def my_tag(parser, token):
    bits = list(token.split_contents())
    if len(bits) != 2:
        raise django_template.TemplateSyntaxError("%s tag requires exactly one argument" % bits[0])
    return MyNode(bits[1])


class MyNode(django_template.Node):
    def __init__(self, my_var):
        self.my_var = my_var

    def render(self, context):
        try:
            my_var = django_template.resolve_variable(self.my_var, context)
        except django_template.VariableDoesNotExist:
            my_var = None
        return "my var is: %s" % my_var

Everything as normal there, but using resolve_variable, Node, TemplateSyntaxError and VariableDoesNotExist from the Django template module rather than the App Engine template module. Alternatively you could do:

from django.template import (
    resolve_variable,
    Node,
    TemplateSyntaxError,
    VariableDoesNotExist,
)

And next: you need to register your template tag with the App Engine template framework:

register = template.create_template_register()
my_tag = register.tag(my_tag)

Remember, we imported template from google.appengine.ext.webapp.

And finally... we need to register the module which has got our custom template tag in it. In main.py (or whatever file is handling your request):

from google.appengine.ext.webapp import template
template.register_template_library('file_which_contains_your_custom_tag')
Smashing.


Displaying Django GenericForeignKey As Single Form Field

First published on adamalton.co.uk on 31st March 2010.

The Django content types framework lets you have Generic Foreign Key fields, which allow you to create a foreign key to any object (record) in your database by combining the object id with an identifier for the table in which the object lives. This is great, until you want to make it editable in a form. The GenericForeignKey reference is stored as 2 separate fields (object id and type id), and so when you view this in a standard model form you get a drop down of content types, and a text field in which to type the id of the object that you want. This is fairly meaningless to a user. Even if you know what the 2 fields mean, you still need a way of looking up the id of the object that you want.

So we want to be able to combine these 2 fields into a single, meaningful field that lets the user simply select which object they want. The context in which I wrote this code is rather complex, and not worth going into here. So what is below is a simplified version, using an imaginary AttachableNote model as an example of something which may have a generic foreign key to link itself to any other object in the site.

import re
from django import forms
from django.db import models
from django.contrib.contenttypes.models import ContentType
from django.contrib.contenttypes import generic

class AttachableNote(models.Model):
    """ A model which stores a text note.
        It can be attached to any object via its GenericForeignKey field.
    """
    text = models.TextField()
    object_id = models.PositiveIntegerField()
    object_type = models.ForeignKey(ContentType)
    generic_obj = generic.GenericForeignKey('object_type', 'object_id')

class AttachableNoteForm(forms.ModelForm):
    """ Form for creating an AttachableNote. """
    
    #GenericForeignKey form field, will hold combined object_type and object_id
    generic_obj = forms.ChoiceField()
    
    def __init__(self, *args, **kwargs):
        super(AttachableNoteForm, self).__init__(*args, **kwargs)
        #combine object_type and object_id into a single 'generic_obj' field
        #get all the objects that we want the user to be able to choose from
        available_objects = list(SomeModel.objects.all()) #put your stuff here
        available_objects += list(SomeOtherModel.objects.filter(field=value))
        #now create our list of choices for the <select> field
        object_choices = []
        for obj in available_objects:
            type_id = ContentType.objects.get_for_model(obj.__class__).id
            obj_id = obj.id
            form_value = "type:%s-id:%s" % (type_id, obj_id) #e.g."type:12-id:3"
            display_text = str(obj)
            object_choices.append([form_value, display_text])
        self.fields['generic_obj'].choices = object_choices
    
    class Meta:
        model = AttachableNote
        fields = [
            "text",
            "generic_obj"
        ]
    
    def save(self, *args, **kwargs):
        #get object_type and object_id values from combined generic_obj field
        object_string = self.cleaned_data['generic_obj']
        matches = re.match(r"type:(\d+)-id:(\d+)", object_string).groups()
        object_type_id = matches[0] #get 45 from "type:45-id:38"
        object_id = matches[1] #get 38 from "type:45-id:38"
        object_type = ContentType.objects.get(id=object_type_id)
        self.cleaned_data['object_type'] = object_type_id
        self.cleaned_data['object_id'] = object_id
        self.instance.object_id = object_id
        self.instance.object_type = object_type
        return super(AttachableNoteForm, self).save(*args, **kwargs)
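
The form builds and parses the combined "type:12-id:3" value in two separate places. As a rough sketch, that round trip could be pulled out into a pair of helpers. encode_choice and decode_choice are hypothetical names, not part of Django, and unlike the save() method above this version raises a ValueError for a value that doesn't match, rather than failing on .groups().

```python
import re

# Matches the whole string, e.g. "type:45-id:38"
CHOICE_PATTERN = re.compile(r"^type:(\d+)-id:(\d+)\Z")


def encode_choice(type_id, obj_id):
    """Combine a content type id and an object id into a single form value."""
    return "type:%d-id:%d" % (type_id, obj_id)


def decode_choice(value):
    """Split a form value back into (type_id, obj_id) as integers."""
    match = CHOICE_PATTERN.match(value)
    if match is None:
        raise ValueError("Not a valid generic object choice: %r" % (value,))
    type_id, obj_id = match.groups()
    return int(type_id), int(obj_id)


form_value = encode_choice(45, 38)
type_id, obj_id = decode_choice(form_value)
```

In the form, encode_choice would build each choice value in __init__ and decode_choice would replace the regex lines in save().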

Replacing Question Mark in URL With Javascript

First published on adamalton.co.uk on 29th March 2010.

This seemingly simple task caught me out slightly the other day. The usual trip to Google was surprisingly unhelpful. And then I had an important realisation...

document.location.replace(/regex/, 'replacement')

IS NOT THE SAME AS

String(document.location).replace(/regex/, 'replacement')

document.location is an object, which has a method called 'replace', which loads the given URL string in place of the current page. That, of course, is totally different to the Javascript String object's 'replace' method.

As for replacing the question mark (I was trying to find the URL of the current page up to the query string), it just needs escaping with a backslash, which is what you'd expect with a regular expression. Once you stop being a plank and you turn the URL into a String before calling .replace() on it, it all works fine:

url_without_query_string = String(document.location).replace(/(#|\?).*/, '')





The Wonders of Google Suggest

First published on adamalton.co.uk on 29th March 2010.

I believe that these works of comedy genius speak for themselves.



And the US version (google.com):





The Science of Zope & Plone Form Libraries

First published on adamalton.co.uk 6th March 2010.

Now then, if you've ever used one of the magical form libraries in Plone you'll no doubt have spent a fair while trying to work out how to use the damn thing, spent a good while looking for the elusive documentation, found some out-of-date article written by a German guy, used his example whilst not having the faintest idea why or how it works, and then spent a fair amount of time fighting against the form library because you needed your form to do something that the form library just didn't want to do. During that time it may have occurred to you that it would be quicker to just write the damn thing by hand. Let's look at our graph below to find out.