While working on the crashed Neo4J database I needed to bootstrap the store-utils of Jexp fame to see what it would do with the crashed dataset.

The repository readme at https://github.com/jexp/store-utils gives all the details needed, but as a non-Java developer I found some basic info lacking.

Hence this quick outline to get it running on a clean, fresh Ubuntu 16.04.x LTS.

Note: when using a 2.x branch of Neo4J, make sure you run `apt install default-jdk` first. Otherwise the Neo4J installation process will pick up some OpenJDK version which causes issues later because it is compiled with some flags missing.

To install Neo:

wget -O - https://debian.neo4j.org/neotechnology.gpg.key | sudo apt-key add -
echo 'deb https://debian.neo4j.org/repo stable/' | sudo tee /etc/apt/sources.list.d/neo4j.list
sudo apt-get update
sudo apt install neo4j=2.3.7

This installs a specific version of the database; if you leave off the version info, it will install the latest/greatest one. As I was working on a crashed dataset of an older version, I needed this particular one. Check http://debian.neo4j.org/ for more info.
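To see which versions the apt repository offers before pinning one, `apt-cache madison` lists them. Below is a minimal sketch; the helper name `extract_versions` is my own invention, and it assumes the usual `package | version | source` layout of the madison output:

```shell
#!/bin/sh
# extract_versions: hypothetical helper (not part of apt or Neo4J).
# Reads `apt-cache madison` output on stdin and prints the unique
# version strings (the second |-separated column), sorted.
extract_versions() {
    awk -F'|' 'NF >= 2 { gsub(/[ \t]/, "", $2); print $2 }' | sort -u
}

# Usage (needs the Neo4J apt repository configured as above):
#   apt-cache madison neo4j | extract_versions
```

Any version printed this way can be used after the `=` in the install command.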

With Neo and Java installed we can proceed with the store-utils.

First of all: the store-utils are a Java tool which needs to be linked to the Neo4J library. So it is not a separate tool which gets installed once and can handle any database. Instead, an application is compiled with your specific dataset attached to it.

Most of the magic needed for this is automated; you just need Neo4J installed and Maven.

Not being a Java developer, I had never actually used Maven; I did understand that mvn is the command-line tool to start Maven though 🙂 But apt install mvn did not do anything. Stop googling: you will find all sorts of info on installing Maven from source, but you don’t need that. Just do an apt install maven. Life is sometimes that simple 🙂
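Before moving on, it can help to confirm that everything this walkthrough relies on is actually on the PATH. A small sketch; `missing_tools` is a hypothetical helper name of mine, not something shipped with Maven or store-utils:

```shell
#!/bin/sh
# missing_tools: hypothetical helper. Prints every command from its
# argument list that is NOT found on the PATH and returns non-zero
# if at least one is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Usage: the tools used in this post.
#   missing_tools java mvn git || echo "install the tools listed above first"
```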

At the repository you will find branches for every major release of Neo4J; you need to get the branch specific to your version of Neo4J! In my case the data was from a 2.3.3 installation so I got:

git clone -b 23 https://github.com/jexp/store-utils
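Picking the branch can be scripted too. The sketch below assumes the branch name is simply the major and minor version digits glued together (2.3.3 becomes 23); that is an assumption based on the single example above, so list the real branches first to confirm. `branch_for_version` is a hypothetical helper name:

```shell
#!/bin/sh
# branch_for_version: hypothetical helper.
# ASSUMPTION: the branch name is major + minor of the Neo4J version
# (2.3.3 -> 23), as in the example above. Verify against the real
# branch list before relying on it.
branch_for_version() {
    echo "$1" | awk -F. '{ print $1 $2 }'
}

# Usage:
#   git ls-remote --heads https://github.com/jexp/store-utils   # list the real branches
#   git clone -b "$(branch_for_version 2.3.3)" https://github.com/jexp/store-utils
```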

With Maven already installed we are good to go; on the first run it will fetch a zillion dependencies and then start running. The copy-store.sh script does all the magic for you:

./copy-store.sh data/graph.db foo

Note that you need to point it to the graph.db folder of your data; the target directory will be created if it doesn’t exist.

You can always try leaving them in place, but it is probably a good idea to remove the transaction logs (`neostore.transaction.db.*`) first.
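With a crashed dataset you probably do not want to delete anything from the only copy you have. One way to handle that, as a sketch: copy the store first and remove the transaction logs from the copy only. The helper name `copy_without_txlogs` and the `graph.db.work` directory are made up for illustration:

```shell
#!/bin/sh
# copy_without_txlogs: hypothetical helper. Copies the store directory
# SRC to a new directory DST and removes the transaction logs from the
# copy only, so the original data stays untouched.
copy_without_txlogs() {
    src=$1
    dst=$2
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    [ ! -e "$dst" ] || { echo "refusing to overwrite: $dst" >&2; return 1; }
    cp -a "$src" "$dst" || return 1
    # File pattern as mentioned in the text above.
    rm -f "$dst"/neostore.transaction.db.*
}

# Usage:
#   copy_without_txlogs data/graph.db data/graph.db.work
#   ./copy-store.sh data/graph.db.work foo
```

This costs extra disk space, but if the copy-store run misbehaves you still have the untouched original to try again.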

If all goes well output looks like:

Copying from checkit/data/graph.db to xx ingoring rel-types [] ignoring properties [] ignoring labels []
.................................................. 500000 / 10510372 (4%)
 unused 497555.................................................. 1000000 / 10510372 (9%)
 unused 997555.................................................. 1500000 / 10510372 (14%)
 unused 1497555.................................................. 2000000 / 10510372 (19%)
 unused 1997555.................................................. 2500000 / 10510372 (23%)
 unused 2497555.................................................. 3000000 / 10510372 (28%)
 unused 2996675.................................................. 3500000 / 10510372 (33%)
 unused 3496675.................................................. 4000000 / 10510372 (38%)
 unused 3996675.................................................. 4500000 / 10510372 (42%)
 unused 4496675.................................................. 5000000 / 10510372 (47%)
 unused 4996675.................................................. 5500000 / 10510372 (52%)
 unused 5496675.................................................. 6000000 / 10510372 (57%)
 unused 5996675.................................................. 6500000 / 10510372 (61%)
 unused 6496675.................................................. 7000000 / 10510372 (66%)
 unused 6996675.................................................. 7500000 / 10510372 (71%)
 unused 7496675.................................................. 8000000 / 10510372 (76%)
 unused 7996675.................................................. 8500000 / 10510372 (80%)
 unused 8496675.................................................. 9000000 / 10510372 (85%)
 unused 8996675.................................................. 9500000 / 10510372 (90%)
 unused 9496675.................................................. 10000000 / 10510372 (95%)
 unused 9996675.................................................. 10500000 / 10510372 (99%)
 unused 10496387.
 copying of 10510373 node records took 1 seconds (10510373 rec/s). Unused Records 10497471 (99%)
.........
 copying of 94091 relationship records took 1 seconds (94091 rec/s). Unused Records 64070 (68%)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 13.562 s
[INFO] Finished at: 2017-08-26T12:01:05+02:00
[INFO] Final Memory: 11M/4096M
[INFO] ------------------------------------------------------------------------

The readme in the repository documents all the other stuff you might need; a backgrounder is at http://www.jexp.de/blog/html/store_copy.html

Michael, thanks for creating this tool; it helped a lot in certifying that the recovered data was actually ok 🙂