Efficiently clean up Neo4j database

This is a follow-up to my previous question, "Clear Neo4j Embedded Database".

Now I understand that I do not need to close the database, I just need to erase all data inside this database.

I am using the following method:

public static void cleanDb(Neo4jTemplate template) {
    template.query("MATCH (n) OPTIONAL MATCH (n)-[r]-() DELETE n,r", null);
}

      

but it doesn't work as expected on large datasets.

Also, with the new version of Spring Data Neo4j, I can no longer use Neo4jHelper.cleanDb(db);

Is there a way to properly and efficiently clean up the state of the database without shutting down / dropping the database?

UPDATED

I have implemented the following utility class with a cleanDb method:

public class Neo4jUtils {

    final static Logger logger = LoggerFactory.getLogger(Neo4jUtils.class);

    private static final int BATCH_SIZE = 10;

    public static void cleanDb(Neo4jTemplate template) {
        logger.info("Cleaning database");

        long count = 0;
        do {
            GraphDatabaseService graphDatabaseService = template.getGraphDatabaseService();
            Transaction tx = graphDatabaseService.beginTx();
            try {
                Result<Map<String, Object>> result = template.query("MATCH (n) WITH n LIMIT " + BATCH_SIZE + " OPTIONAL MATCH (n)-[r]-() DELETE n, r RETURN count(n) as count", null);
                count = (long) result.single().get("count");
                tx.success();
                logger.info("count: " + count);
            } catch (Throwable th) {
                logger.error("Error while deleting database", th);
                throw th;
            } finally {
                tx.close();
            }
        } while (count > 0);

    }

}

      

Right now it hangs on the line:

tx.close();

      

How do I fix this? What am I doing wrong?

Also, after some experimentation, I noticed that I can clean up the database as many times as I want, but only while the application is running. Immediately after restarting the application (I kill the application process from the console), the cleanDb method stops working with that existing database and hangs.

There is nothing suspicious in message.log; everything looks fine:

2015-07-25 23:06:59.285+0000 INFO  [o.n.k.EmbeddedGraphDatabase]: Database is now ready

      

I don't know what could be wrong. Please help solve this problem.

I use:

neo4j version 2.2.3
lucene version 3.6.2
spring-data-neo4j version 3.4.0.M1

      

IMPORTANT UPDATE

I noticed that everything works correctly if I call graphDatabaseService.shutdown(); before my application finishes. Otherwise the database is corrupted (a Neo4j server also hangs on this corrupted db).

Is it possible to make the embedded Neo4j database more fault tolerant? As it stands, I would lose all my data after the first failure in production (a power outage, for example).

1 answer


I don't know how this works with Spring Data, but in general you should delete nodes and relationships in batches.

Cypher query:

MATCH (n)
WITH n LIMIT 10000
OPTIONAL MATCH (n)-[r]-()
DELETE n, r
RETURN count(n)

      



In your application, you repeat the query until it reports that nothing was deleted:

return_value = 1
while return_value > 0:
    return_value = run_delete_query()

      

Depending on your memory, you can of course increase LIMIT.
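For illustration, the loop above could look roughly like this in Java. This is only a sketch: BatchCleaner and deleteAll are hypothetical names, and the LongSupplier stands in for whatever runs the Cypher query and returns its count (in the question's code, the template.query(...) call followed by result.single().get("count")). The main method simulates the database with a counter, so the loop can be tried without Neo4j.

```java
import java.util.function.LongSupplier;

// Hypothetical helper: repeats a batch delete until a batch reports 0.
final class BatchCleaner {

    // deleteBatch is assumed to run one batch of the delete query,
    // ideally in its own transaction, and to return the reported count.
    static long deleteAll(LongSupplier deleteBatch) {
        long total = 0;
        long deleted;
        do {
            deleted = deleteBatch.getAsLong();
            total += deleted;
        } while (deleted > 0);
        return total;
    }

    public static void main(String[] args) {
        // Simulated store: 25,000 rows, deleted 10,000 at a time.
        long[] remaining = {25_000};
        long batchSize = 10_000;
        long total = deleteAll(() -> {
            long n = Math.min(batchSize, remaining[0]);
            remaining[0] -= n;
            return n;
        });
        System.out.println("deleted " + total + " rows");
    }
}
```

Note that with the query above, count(n) counts rows produced after the OPTIONAL MATCH (one per node/relationship pair), not distinct nodes. That is fine for a loop condition, since it is only 0 once no nodes are left.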
