MongoEngine 0.3 Released - Harry Marr

I just released version 0.3 of MongoEngine. Here’s a quick breakdown of the main changes.

MapReduce Support

Thanks to the great work by Matt Dennewitz, we now have support for MapReduce. Here’s an example to show how it works, in which we generate frequencies of tags over a collection of blog posts:

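# assumes a connection has already been set up with mongoengine.connect()
from mongoengine import Document, StringField, ListField

class BlogPost(Document):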
    title = StringField()
    tags = ListField(StringField())

BlogPost(title="Post #1", tags=['music', 'film', 'print']).save()
BlogPost(title="Post #2", tags=['music', 'film']).save()
BlogPost(title="Post #3", tags=['film', 'photography']).save()

map_f = """
    function() {
        this.tags.forEach(function(tag) {
            emit(tag, 1);
        });
    }
"""

reduce_f = """
    function(key, values) {
        var total = 0;
        for(var i=0; i<values.length; i++) {
            total += values[i];
        }
        return total;
    }
"""

# run a map/reduce operation spanning all posts
for result in BlogPost.objects.map_reduce(map_f, reduce_f):
    print('%s: %s' % (result.key, result.value))

# output:
# film: 3.0
# music: 2.0
# photography: 1.0
# print: 1.0

If the keys in the results correspond to _ids in the collection, you can access the relevant object by using result.object, which is lazily loaded.
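
To make the tag-frequency example above a bit more concrete, here is the same computation sketched in plain Python, with no database involved. The names posts, map_post, reduce_counts and tag_frequencies are made up for this illustration and are not part of MongoEngine; the point is only to show what the map and reduce functions do to the three posts. Note that this version produces integer counts rather than the floats shown in the output above.

```python
from collections import defaultdict

# Plain-dict stand-ins for the three BlogPost documents saved above.
posts = [
    {"title": "Post #1", "tags": ["music", "film", "print"]},
    {"title": "Post #2", "tags": ["music", "film"]},
    {"title": "Post #3", "tags": ["film", "photography"]},
]


def map_post(post):
    """Mirror of map_f: emit a (tag, 1) pair for every tag on the post."""
    for tag in post["tags"]:
        yield tag, 1


def reduce_counts(key, values):
    """Mirror of reduce_f: add up the values emitted for one key."""
    return sum(values)


def tag_frequencies(posts):
    """Group emitted values by key, then reduce each group to one number."""
    emitted = defaultdict(list)
    for post in posts:
        for key, value in map_post(post):
            emitted[key].append(value)
    return {key: reduce_counts(key, values) for key, values in emitted.items()}


frequencies = tag_frequencies(posts)
for tag in sorted(frequencies):
    print("%s: %s" % (tag, frequencies[tag]))
```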

New Fields

MongoEngine 0.3 sees the introduction of five new field types:

  • URLField - inherits from StringField, but validates URLs and optionally verifies their existence.
  • DictField - as the name suggests, it allows you to store Python dictionaries. When the structure of the dictionary is known, EmbeddedDocuments are preferred, but DictFields are useful for storing data where the structure isn’t known in advance.
  • GenericReferenceField - similar to the standard ReferenceField, but allows you to reference any type of Document.
  • DecimalField - a field capable of storing Python Decimal objects.
  • BinaryField - stores binary data.

New QuerySet Methods

  • only() - pass in field names as positional arguments, and only these fields will be retrieved from the database. Note that accessing a field that hasn’t been retrieved returns None, because deferred fields have not yet been implemented.
  • in_bulk() - given a list of document ids, this will load all the corresponding documents and return a dictionary mapping the ids to the documents.
  • get(), get_or_create() - like first(), these methods retrieve one matching document, but if more than one document matches the query, a MultipleObjectsReturned exception is raised. If get_or_create() is used and no matching document is found, a document will be created from the query.
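
The behaviour of in_bulk() and get_or_create() described above can be sketched over a plain list of dictionaries. This is only a rough model of the semantics, not how MongoEngine implements them: the helper functions, the local MultipleObjectsReturned class and the way a new _id is picked are all assumptions made for the illustration.

```python
class MultipleObjectsReturned(Exception):
    """Local stand-in for the exception of the same name."""


def matches(doc, query):
    """True if every field in the query has the same value in the document."""
    return all(doc.get(field) == value for field, value in query.items())


def in_bulk(docs, ids):
    """Return a dictionary mapping each requested id to its document."""
    wanted = set(ids)
    return {doc["_id"]: doc for doc in docs if doc["_id"] in wanted}


def get_or_create(docs, **query):
    """Return the single matching document, creating it if none exists."""
    found = [doc for doc in docs if matches(doc, query)]
    if len(found) > 1:
        raise MultipleObjectsReturned("%d documents matched" % len(found))
    if found:
        return found[0]
    # No match: build a new document from the query itself and store it.
    # (The id scheme here is invented purely for the sketch.)
    created = dict(query, _id=len(docs) + 1)
    docs.append(created)
    return created


docs = [
    {"_id": 1, "title": "Post #1"},
    {"_id": 2, "title": "Post #2"},
    {"_id": 3, "title": "Post #2"},
]

by_id = in_bulk(docs, [1, 3])                    # two documents, keyed by id
new_doc = get_or_create(docs, title="Post #4")   # nothing matched, so created
```

Calling get_or_create(docs, title="Post #2") on this data raises MultipleObjectsReturned, since two documents share that title.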

String-Matching Query Operators

Six new query operators have been added: contains, startswith, endswith, and their case-insensitive variants, icontains, istartswith and iendswith. These are just shortcuts for regular expression queries.
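
To show what "shortcuts for regular expression queries" means in practice, here is a small sketch of how each operator could be turned into a regular expression. The function operator_to_regex is hypothetical and written just for this post; the actual translation inside MongoEngine may differ in its details.

```python
import re

CASE_INSENSITIVE = ("icontains", "istartswith", "iendswith")


def operator_to_regex(op, value):
    """Translate a string-matching operator and a value into a compiled regex."""
    flags = 0
    if op in CASE_INSENSITIVE:
        # Same pattern as the case-sensitive operator, but ignoring case.
        op = op[1:]
        flags = re.IGNORECASE
    # Escape the value so characters like "." are matched literally.
    literal = re.escape(value)
    patterns = {
        "contains": literal,
        "startswith": "^" + literal,
        "endswith": literal + "$",
    }
    return re.compile(patterns[op], flags)


title = "MongoEngine 0.3 Released"
print(bool(operator_to_regex("startswith", "mongo").search(title)))
print(bool(operator_to_regex("istartswith", "mongo").search(title)))
```

The first check fails because startswith is case-sensitive, while the second succeeds thanks to the case-insensitive variant.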

Other Fixes and Improvements

  • QuerySets now have a rewind() method, which is called automatically when the iterator is exhausted, allowing QuerySets to be reused.
  • ReferenceFields may now reference the document they are defined on (recursive references) and documents that have not yet been defined.
  • Field name substitution for JavaScript code: you can use the Python names for fields in JS, and they are later replaced with the real field names.
  • The name parameter on fields has been replaced by the more descriptive db_field.

…and much more. For full details, see the changelog.
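
As a rough illustration of the field name substitution mentioned in the list above, here is a sketch that rewrites Python field names in a JavaScript snippet to their stored names. The ~ marker, the FIELD_MAP dictionary and the substitute_fields function are all assumptions chosen for this example, so treat it as a picture of the idea rather than the library's actual code.

```python
import re

# Hypothetical mapping from Python attribute names to stored names,
# e.g. what declaring a field with db_field="tg" would produce.
FIELD_MAP = {"title": "t", "tags": "tg"}


def substitute_fields(js_code, field_map):
    """Replace each ~python_name marker with its database field name."""
    def replace(match):
        name = match.group(1)
        # Names with no mapping keep their Python name, minus the marker.
        return field_map.get(name, name)

    return re.sub(r"~(\w+)", replace, js_code)


js = "function() { return this.~tags.length; }"
print(substitute_fields(js, FIELD_MAP))
```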