Neo4j: creating relationships from a CSV file
I am trying to create relationships between two types of nodes using a CSV file. I have already created all the movies and keywords. I also created indexes on :Movie(title) and :Keyword(word).
My CSV file looks like this:
"title" | year | "word" // header
"In the Wild" | 2007 | "1990s" // string with name, year and keyword
"In the Wild" | 2007 | "abandoned bus"
My query:
LOAD CSV WITH HEADERS FROM "file:/home/gondil/temp.csv" AS csv
FIELDTERMINATOR '|'
MATCH (m:Movie {title:csv.title,year: toInt(csv.year)}), (k:Keyword {word:csv.word})
MERGE (m)-[:Has {weight:1}]->(k);
The query runs for about an hour and then fails with "Unknown error", which is not a helpful description of the problem.
I thought it was due to the size of the data: 160K keywords, over 1M movies and over 4M lines in the CSV. So I cut the CSV down to one line, and the query still runs for about 15 minutes without finishing.
Where is the problem? How do I write a query that creates relationships between two nodes that already exist?
I could also delete all the nodes and build the database differently, but I would prefer not to delete the nodes that are already created.
Note: I should not have any hardware problems, because I am using a powerful machine provided by our teachers.
You need schema indexes to speed up the lookup of the starting nodes. Before starting the import, run:
CREATE INDEX ON :Movie(title);
CREATE INDEX ON :Keyword(word);
Make sure the indexes are fully populated and online (check with the :schema command).
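If you are not working in the browser, where :schema is available, the index state can also be checked with a query. This is a sketch that depends on the server version, which the question does not state: the procedure form assumes Neo4j 3.x, and the SHOW form assumes Neo4j 4.x or later.

```cypher
// Assumption: Neo4j 3.x. Lists all indexes with their state (should be ONLINE).
CALL db.indexes();

// Assumption: Neo4j 4.x or later. Same information via the SHOW command.
SHOW INDEXES;

// Assumption: Neo4j 3.x or later. Blocks until all indexes are online
// or the timeout (in seconds) expires.
CALL db.awaitIndexes(300);
```

Run only the statement that matches your version; an index that is still populating is not used for lookups.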
Split the Cypher command into two queries so that the indexes are used. At present an index covers only one label and one property, so look up each node by its indexed property alone:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/home/gondil/temp.csv" AS csv
FIELDTERMINATOR '|'
MERGE (m:Movie {title:csv.title })
ON CREATE SET m.year = toInt(csv.year)
MERGE (k:Keyword {word:csv.word});
Then make a second pass over the file to create the relationships:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/home/gondil/temp.csv" AS csv
FIELDTERMINATOR '|'
MATCH (m:Movie {title:csv.title })
MATCH (k:Keyword {word:csv.word})
MERGE (m)-[:Has {weight:1}]->(k);
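To confirm that the lookups really use the indexes before running the full import, you can ask for the query plan of a single hand-written row. This is a minimal sketch using the sample values from the question; EXPLAIN only plans the query and does not execute it or write anything.

```cypher
// Plan only, nothing is written. The literal values are taken from the
// sample CSV rows in the question.
EXPLAIN
MATCH (m:Movie {title: "In the Wild"})
MATCH (k:Keyword {word: "1990s"})
MERGE (m)-[:Has {weight:1}]->(k);
```

In the resulting plan, both lookups should appear as NodeIndexSeek operators. If you see NodeByLabelScan instead, the index is missing or not yet online, and each CSV row will scan every node with that label.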