Yesterday, we released another major neo4django milestone. You can get it from PyPI or GitHub.

Because the library is not yet feature complete (in particular, the lack of relationship models is a problem for many Neo4j users), the milestone only bumps the minor revision number. It is important for a few reasons, however.

Performance Focused

select_related()

We’ve implemented select_related(), a Django QuerySet feature that lets a developer ask the ORM to follow database relations, join the results, and return all the data in one query.

In a graph database, this is even more important because, in all likelihood, you’re using a graph approach precisely because your domain model is highly connected. select_related() enables traversals in a way already familiar to Django developers. For example, to get all friends of a friend in a simple domain model:

max = Person.objects.all().select_related('friends__friends').get(name='Max')
# Or, pulling in all relationships of depth 2
max = Person.objects.all().select_related(depth=2).get(name='Max')

Access the returned Person object as you normally would.

for f in max.friends.all():
    for foaf in f.friends.all():
        print(foaf.name)

...and there won’t be any additional database calls.
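To make the difference concrete, here is a toy model of the round trips involved. It is not neo4django code: FakeGraph, friends_of, neighborhood, and the sample data are all made up for illustration, and the "database" is just a dictionary that counts how often it gets asked for something.

```python
# Toy in-memory "database" that counts round trips. Hypothetical names
# throughout; this only illustrates the idea, not how neo4django works inside.
FRIENDS = {
    'Max': ['Ann', 'Bob'],
    'Ann': ['Max', 'Cy'],
    'Bob': ['Dee'],
}


class FakeGraph(object):
    def __init__(self, friends):
        self.friends = friends  # name -> list of friend names
        self.calls = 0          # number of simulated round trips

    def friends_of(self, name):
        """One round trip per node, like lazy relationship access."""
        self.calls += 1
        return list(self.friends.get(name, []))

    def neighborhood(self, name, depth):
        """One round trip returning every adjacency list within `depth` hops."""
        self.calls += 1
        result, frontier = {}, [name]
        for _ in range(depth):
            next_frontier = []
            for n in frontier:
                if n not in result:
                    result[n] = list(self.friends.get(n, []))
                    next_frontier.extend(result[n])
            frontier = next_frontier
        return result


def foaf_lazy(graph, name):
    # One call for the person, then one more call per friend.
    return [foaf for f in graph.friends_of(name) for foaf in graph.friends_of(f)]


def foaf_eager(graph, name):
    # A single call up front; the loops then run purely in memory.
    hood = graph.neighborhood(name, depth=2)
    return [foaf for f in hood[name] for foaf in hood[f]]
```

With this sample data the lazy version makes three round trips and the eager one makes a single trip, for the same list of names. Over HTTP, that gap grows with every friend.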

Gremlin, Cypher, and Concurrency

We’re beginning to replace standard REST calls with Cypher, Gremlin, and the batch REST API, a process I’ll post more about soon. The work is painstaking, but now that it’s begun, expect serious performance improvements and behavior that approaches transactionality.

Gremlin has let us start treating the in-graph type hierarchy transactionally. In practice, this means that threading can now be used with the library, providing a huge performance benefit. While Python threads are limited by the GIL, they slice right through the I/O-bound use of the REST interface. That said, most of the library is not sanely transactional, so test thoroughly. And, of course, concurrency needs for individual applications vary significantly.
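As a rough sketch of why threads help here despite the GIL, the snippet below fans blocking calls out over a few worker threads. The time.sleep() call is only a stand-in for a real HTTP request, and fake_rest_call and fetch_all are hypothetical names, not part of neo4django. Sleeping, like waiting on a socket, releases the GIL, so the waits overlap.

```python
import threading
import time


def fake_rest_call(node_id, delay=0.05):
    """Stand-in for one blocking HTTP request (an assumption, not a real call)."""
    time.sleep(delay)
    return {"id": node_id}


def fetch_all(node_ids, num_threads=8):
    """Fetch every id using a small, fixed set of worker threads."""
    results = {}
    lock = threading.Lock()
    pending = list(node_ids)

    def worker():
        while True:
            with lock:
                if not pending:
                    return
                node_id = pending.pop()
            # The blocking part runs outside the lock so the waits overlap.
            data = fake_rest_call(node_id)
            with lock:
                results[node_id] = data

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Sixteen simulated calls with eight workers should take roughly two delays rather than sixteen. In a real application, whatever each worker does against the graph still needs the thorough testing mentioned above.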

Performance Caveats

Populating a graph using the library is still slow. In fact, populating a Neo4j database without the batch inserter is always slow, and it’s much worse when the whole thing happens over many HTTP calls. Threading can now help quite a bit with this problem. Know that we’re on it, since it will seriously impede our own product down the line.

neo4django doesn’t yet play well with auto-indexing, but that’s another feature we plan to implement soon.

Using the Library

This is the first release where I can recommend you try neo4django. At Scholrly, we’re preparing to release our own alpha partially based on it. While there are still many performance gains to be had, the library is only getting better, and it will grow with our company.

If you or your company is thinking about using Neo4j and Django, let me know if you have any questions. I’m available to consult on the side or implement particular features you need.

Let me know in the comments or on GitHub Issues what features you care about, and please let us know if you’d like to contribute! We have our own feature agenda, but I want to make this library a real community boon, and that means I need your feedback.