Working on several projects at once can be a bit hectic at times. But it also creates the opportunity to stand back for a moment and notice shared characteristics. At the moment I am working on two, nah, three projects which all use Neo4j as a storage backend.
Creating web-based information systems with Neo4j is heaven compared to other storage solutions. The ability to work schema-free and create relations between seemingly random objects is literally what the web is about. Setting yourself free from the constraints of SQL-based solutions is like taking your first breath of fresh air after being trapped in a coal mine for 10 days.
One of the things I was doing in all projects was introducing a uuid field to almost everything I was storing.
Having a unique identifier is key in many applications, as it is the quickest and most reliable way to reference a specific node. Obviously, deep inside Neo4j every node has its own internal ID, but that is not guaranteed to be persistent. It is somewhere in the Neo4j docs, but Brian Underwood summarizes it very nicely on Stack Overflow:
"STANDARD DISCLAIMER: Don't use internal Neo4j IDs for long-term entity identification. Future versions of Neo4j might shift these IDs around for performance purposes. Create your own unique ID property (ideally with a CONSTRAINT) for tracking entities"
(Brian Underwood, May 20 at 15:58, http://stackoverflow.com/questions/22369520/neo4j-get-node-by-id)
Admittedly, in the first pieces of code I made, an administrative interface for a part of the Historiana project, I actually used the id() of nodes to create edit forms and such. It worked fine, but I switched to using my own uuid field quickly after. Apart from the “not guaranteed to persist” issue, I wanted something less predictable than an incremental integer. For security-by-obscurity reasons, obviously.
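To make the "own uuid field plus CONSTRAINT" idea concrete, here is a minimal sketch in Python. It is not the code from any of the projects: the helper names `with_uuid` and `unique_constraint_statement` and the `Person` label are made up for illustration, and the constraint statement uses the `CREATE CONSTRAINT ON ... ASSERT ... IS UNIQUE` form of the Neo4j 2.x/3.x era, so check the Cypher manual for the version you run.

```python
import re
import uuid

# Only accept plain identifiers as labels, because the label is
# interpolated into the Cypher text (labels cannot be parameters).
_LABEL_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def with_uuid(properties):
    """Return a copy of the properties with a random uuid added if missing."""
    props = dict(properties)
    # uuid4 is random, so it is far less predictable than an incrementing id.
    props.setdefault("uuid", str(uuid.uuid4()))
    return props


def unique_constraint_statement(label):
    """Build the Cypher (2.x/3.x syntax) that makes uuid unique for a label."""
    if not _LABEL_RE.match(label):
        raise ValueError("unexpected label: %r" % (label,))
    return "CREATE CONSTRAINT ON (n:%s) ASSERT n.uuid IS UNIQUE" % label


if __name__ == "__main__":
    # "Person" is a hypothetical label, used only as an example.
    print(with_uuid({"name": "Anne"}))
    print(unique_constraint_statement("Person"))
```

The statement has to be run once per label, since uniqueness constraints in Neo4j are defined on a label and property pair.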
While working on a new Neo4j-based project, the need for yet another admin interface came up quickly after migrating the data from MS-SQL to Neo4j. Another admin interface, another custom job, I thought? At Historiana we created most screens manually. There are loads of attributes on the nodes, and we learned from that project that although creating relations in the database is easy, building an easy-to-use admin interface to actually make those connections can be hard work. Honest work, but hard, and with many types of nodes a lot of it.
Although the new project's model is small now, it might grow quickly. The Historiana admin also has some challenges: because everything is custom-coded, not everything is as coherent as it could be. Editing a property is sometimes only possible in one view of the data, while there are three places where it is used.
Previously we created some projects using Django, and its generic admin is still one of its best features (as long as you stay away from customising it 😉). It could actually be very useful for these and future projects to have a “generic admin” which allows one to quickly add, modify and delete data in an easy way.
A first proof-of-concept of such an admin is now done. One of the difficult issues is (well: was) the unique ID to get to the node. Using the uuid field is fine, but it has to be there. Neo4j is schemaless, so there is no way of “seeing” which field is the unique index.
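Once every node is known to carry a uuid, the lookup a generic admin needs becomes very small. The following is a sketch under assumptions, not the proof-of-concept itself: `lookup_by_uuid` is a hypothetical helper that only builds the query and its parameters, and it uses the `{uuid}` parameter syntax of Neo4j 2.x (newer versions write `$uuid` instead).

```python
import re

# Labels are interpolated into the query text, so restrict them to
# plain identifiers; the uuid value itself travels as a parameter.
_LABEL_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def lookup_by_uuid(node_uuid, label=None):
    """Return (cypher, params) to fetch a single node by its uuid property."""
    if label is not None and not _LABEL_RE.match(label):
        raise ValueError("unexpected label: %r" % (label,))
    pattern = "(n:%s)" % label if label else "(n)"
    # {uuid} is the Neo4j 2.x parameter syntax; an assumption about the
    # server version, adjust to $uuid on newer releases.
    query = "MATCH %s WHERE n.uuid = {uuid} RETURN n LIMIT 1" % pattern
    return query, {"uuid": node_uuid}


if __name__ == "__main__":
    # Hypothetical label and uuid value, for illustration only.
    print(lookup_by_uuid("123e4567-e89b-12d3-a456-426655440000", "Person"))
```

Passing a label is worth it when you have one: indexes and constraints in Neo4j are per label, so a label-less match has to scan all nodes.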
Worse: with the newly imported data there were quite a few tables which did not even have such an index. Should the admin tool create and fill uuid fields by itself? One thing I want to avoid is that it “changes” user data.
By chance I stumbled upon the GraphAware UUID module. It does alter user data, but it is enabled through configuration in Neo4j itself. Installing and enabling it is easy:
    cd /usr/share/neo4j/plugins
    wget http://products.graphaware.com/download/framework-server-community/graphaware-server-community-all-2.3.1.36.jar
    wget http://products.graphaware.com/download/uuid/graphaware-uuid-2.3.1.36.7.jar
and add these lines to /etc/neo4j/neo4j.properties:
    # https://github.com/graphaware/neo4j-uuid
    com.graphaware.runtime.enabled=true
    com.graphaware.module.UIDM.1=com.graphaware.module.uuid.UuidBootstrapper
Stop and restart Neo4j. This can take a while, as all nodes that do not have a uuid yet will get a new one.
Two very easy steps, but anyone doing them will be aware of the change and not surprised that there is a uuid property on every node. Much better than some weird “admin” application which goes wild on the dataset.
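If you want to see how many nodes lacked a uuid before the restart (or confirm that none do afterwards), a small check could look like the sketch below. It is an illustration, not part of the GraphAware module: the query relies on a missing property evaluating to null in Cypher, and `count_missing_by_label` is a hypothetical helper that assumes your driver hands back rows as dictionaries with a `labels` key.

```python
from collections import Counter

# Nodes without a uuid property; a missing property is null in Cypher.
MISSING_UUID_QUERY = "MATCH (n) WHERE n.uuid IS NULL RETURN labels(n) AS labels"


def count_missing_by_label(rows):
    """Count uuid-less nodes per label from rows shaped like {"labels": [...]}."""
    counts = Counter()
    for row in rows:
        # Nodes without any label are grouped under a placeholder name.
        labels = row["labels"] or ["(no label)"]
        for label in labels:
            counts[label] += 1
    return dict(counts)


if __name__ == "__main__":
    # Made-up rows standing in for a real query result.
    sample = [{"labels": ["Person"]}, {"labels": ["Person", "Author"]}, {"labels": []}]
    print(count_missing_by_label(sample))
```

A node with several labels is counted once under each of them, so the totals per label can add up to more than the number of nodes.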
Thanks GraphAware for a neat solution to a hairy problem 🙂 I can’t wait to experiment with the TimeTree too, but first there is more admin code to be created 🙂