According to the logs, I did my first check-in of Neo4j-related code on Mon May 19 16:49:11 2014 +0200.
In hindsight I wish I had first watched Nicole White's presentation “How to build a Python web application with Flask and Neo4j – PyCon SE 2015”. It is a real shame that I would need a time machine to fix that. 🙂
Her presentation gives a very clean and concise overview of how to use the Neo4j graph database from Python. Before checking in my initial code back in May 2014 I spent quite some time researching my options. It had become clear earlier in 2014 that this particular project needed a better solution. The project requires many different kinds of information to be connected to each other. Maintaining all these bits in a SQL database became a pain, and it probably doesn’t surprise you that I found graph databases to be the answer.
After reading about, evaluating and testing several options I found that Neo4j was probably the easiest way to get going. The existing project was built in Python using Django, and I was looking for a similar environment but hooked up to the graph instead of a SQL system. Originally Django was chosen for its killer feature, “comes with free admin system”, and being very familiar with Python sure helped a lot. During the original project we got rid of the Django templating system and replaced it with Jinja because we needed more power inside the templates.
Suddenly you are left with an existing Django project that does not use the standard templating and has no need for the Django ORM, because you replaced the database. Why keep using Django then? The decision was made quickly and, like Nicole, we selected Flask to be our new friend. As it uses Jinja by default and we could keep most of our logic, it was an easy transition Python-code-wise.
For database connectivity there were a few options at the time. Bulbflow was around, and I played with it in order to evaluate some other database options, but for reasons I might elaborate on in another blog post it did not give me the warm fuzzy feeling I was looking for. There was a better option: Py2Neo provided a wrapper around the REST API, and the examples actually worked without any further dependencies.
That settled most of the stack, but, and here comes the “model” from the title, it also mentioned another project: Neomodel, an ORM-like abstraction that uses Py2Neo and lets you define a model much like you would in Django. Without a hitch I started using it and modeled the new parts of the project with it.
Now, when you carefully watch Nicole's presentation linked above, you will notice that during the questions she answers “No” to the ORM question, and a few moments later refines that by stating she can’t vouch for any projects there might be. Well, actually I can, as we are currently still using it in the project.
Like any ORM it is a mix of blessing and curse. In this particular project it started as a blessing because I could model objects in a familiar way. Take for example this object:
    from uuid import uuid4

    from neomodel import Relationship, RelationshipTo, StringProperty, StructuredNode

    class ContentItem(StructuredNode):
        uuid = StringProperty(default=uuid4, unique_index=True)
        title = StringProperty(required=True)
        slug = StringProperty(required=True, unique_index=True)
        intro = StringProperty()
        embed_code = StringProperty(default='')
        thumb = Relationship('core.asset.Asset', 'ICONIC_IMAGE')
        links = Relationship('core.link.Link', 'RELATED_LINKS')
        materials = Relationship('core.asset.Asset', 'RELATED_MATERIALS')
        downloads = Relationship('core.link.Link', 'RELATED_DOWNLOADS')
        uploads = RelationshipTo('core.upload.Upload', 'CI_UPLOAD')
        learning_activities = RelationshipTo('la.LearningActivity', 'RELATED_LA')
        collections = Relationship('core.collection.Collection', 'RELATED_COLLECTIONS')
Creating an object using this class is as simple as creating an instance of it and saving it, like so:
    c = ContentItem(title='The Title', slug='the-slug')
    c.save()
It saves you from all the trouble of learning Cypher; it just works. It even allows you to do nice things such as custom default values, like the uuid which gets assigned automatically. This looks fantastic: by putting all your stuff in classes like these you build and document your information model on the go. In hindsight I think what is especially comforting is that you create a place holding some sort of “formal definition” of all the stuff in the graph. With SQL you have to define one, and parts of your brain are used to it. Losing that is a mental roadblock that is not easily overcome.
When starting with Neo it is nice to have it, and not needing to grasp Cypher all at once is also very comforting. The code in this project grew quickly and many classes were made representing the various objects needed. HTML forms for the administrative interface were coded and processed with ease.
Updating a node using Neomodel is easy:
    c = ContentItem.nodes.get(uuid='some-uuid')
    c.title = 'The new title'
    c.save()
Very readable and quick to implement. Somewhere along the road some updates were needed because the syntax changed somewhat, but that was a small price to pay. Along the way I started to use the Neo4j console more and more, firing away Cypher queries. And as time passed I realised that the model was getting in the way.
As the complexity of the functionality increased, I spent more and more time trying to get results out of Neomodel in an efficient way. Results that took a no-brainer query in Cypher required an extra layer of complexity in the model.
Say I need the URL for the thumbnail of the ContentItem above. As it is a relation and it is optional, in order to safely call the get_thumb() method of the asset I need to do something like:
    if c.thumb.single():
        my_thumb = c.thumb.single().get_thumb()
    else:
        my_thumb = None
where get_thumb is a custom method for the Asset defined as:
    def get_thumb(self):
        if self.thumb:
            return "/objects/%s/%s" % (self.uuid, self.thumb)
        return None
Not very clean, but it really gets messy when trying to get at a property of a relation of the Asset via the ContentItem object. Where in Cypher you simply state which properties of which nodes should be returned, with a tool like Neomodel you end up making a myriad of calls into objects. It might be a matter of taste, or perhaps I am just using it in the wrong domain, but it just does not feel right.
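For comparison, here is a minimal sketch of how the thumbnail lookup could look when going straight to Cypher. It is an illustration under assumptions, not the code from the project: the node labels are assumed to match the class names ContentItem and Asset, the `$uuid` parameter syntax assumes a recent Neo4j version, and `run_query` is a hypothetical callable that takes a Cypher string plus parameters and returns a list of row dictionaries. How that callable is wired to an actual driver is left open.

```python
from typing import Any, Callable, Dict, List, Optional

# Illustrative query. The relationship type and property names come from the
# ContentItem model above; the labels and the $uuid syntax are assumptions.
THUMB_QUERY = """
MATCH (c:ContentItem {uuid: $uuid})
OPTIONAL MATCH (c)-[:ICONIC_IMAGE]-(a:Asset)
RETURN a.uuid AS asset_uuid, a.thumb AS thumb
LIMIT 1
"""

# Hypothetical runner signature: (cypher, parameters) -> list of row dicts.
QueryRunner = Callable[[str, Dict[str, Any]], List[Dict[str, Any]]]


def thumb_url(run_query: QueryRunner, content_uuid: str) -> Optional[str]:
    """Return the thumbnail URL for a ContentItem, or None if there is none."""
    rows = run_query(THUMB_QUERY, {"uuid": content_uuid})
    if not rows:
        return None  # no ContentItem with this uuid
    row = rows[0]
    if not row["asset_uuid"] or not row["thumb"]:
        return None  # no linked Asset, or the Asset has no thumb
    # Same URL layout as get_thumb() above.
    return "/objects/%s/%s" % (row["asset_uuid"], row["thumb"])
```

The optional relation is handled by OPTIONAL MATCH in the query itself, so the Python side only has to check one row instead of walking through relationship objects.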
It is interesting to notice that the more comfortable I got with Cypher, the less attractive the Neomodel abstraction became. Combine this with the fact that the current version of Neomodel requires an older version of Py2Neo, and I decided to bite the bullet and move away from it.
New projects are done without Neomodel and it just works. Building and constructing your stuff around Cypher queries, without the need to force everything into an object mold, is just so much more fun!