How to implement a nested tree in Neo4j?

I'm just getting started with Neo4j and Graph Databases and wondering if nested hierarchical trees are a good option for Neo4j. A common example is a nested set of comments. For example:

 - Article
  - Comment 1 on Article
     - Comment 2 on Comment 1
     - Comment 3 on Comment 1
         - Comment 4 on Comment 3
  - Comment 5 on Article


As I understand it, every article and comment will be nodes. And every comment will have a parent-child relationship. It would be easy to get all direct comments on the article (1 and 5). But how do you get the whole set?

Sorry for using layman terms. I thought this was better than trying to use appropriate words, confusing everyone.


source to share

2 answers

Ok, because trees are actually graphs , using a graphical database to store the tree seems absolutely appropriate to me. What works best will depend on your data access patterns, but basically trees are just a specialization of graphs.

Yes, in your case, "elements in the tree" will be nodes, and "nesting" will be relationships. So how could you come up with your example:

CREATE (root:Article {label: "Article"}),
       (c1:Comment {label: "Comment 1"}),
       (c1a:Comment {label: "Comment 2 on comment 1"}),
       (c1b:Comment {label: "Comment 3 on comment 1"}),
       (c1b1:Comment {label: "Comment 4 on comment 3"}),
       (c2:Comment {label: "Comment 5 on article"}),


This just creates a bunch of nodes and relationships, mimicking the structure you provided. Here I have always used :reply

to connect things.

Now "getting everything" just means intersecting all the :reply

relationships we created :

MATCH p=(a:Article {label: "Article"})<-[:reply*1..]-(otherThing) 
WITH nodes(p) as nodes
RETURN nodes[length(nodes)-2] as InResponseTo, 
       nodes[length(nodes)-1] as CommentOrReply;


What this query does is traverse any number of links :reply

(starting at the root "Article"). Then it only looks at the nodes in that path and returns the last element ( CommentOrReply

) and what was in response to (second last element).

The result looks like this:

| InResponseTo                             | CommentOrReply                           |
| Node[18]{label:"Article"}                | Node[19]{label:"Comment 1"}              |
| Node[19]{label:"Comment 1"}              | Node[20]{label:"Comment 2 on comment 1"} |
| Node[19]{label:"Comment 1"}              | Node[21]{label:"Comment 3 on comment 1"} |
| Node[21]{label:"Comment 3 on comment 1"} | Node[22]{label:"Comment 4 on comment 3"} |
| Node[18]{label:"Article"}                | Node[23]{label:"Comment 5 on article"}   |


So, how do you traverse the whole tree.

EDIT - for what it's worth, "variable length length matching" which in the query above was just this bit: <-[:reply*1..]-

for me is one of the top 3 selling points for a graph database. Exactly what DB graphs do very easily is that most of your other alternatives, such as relational ones, turn into excruciatingly painful exercises. So if you have to do something (like assembling the whole tree), I would argue that it is arguing for having a DB graph because you are using it in your area of ​​fundamental strength.



If nested trees are fully directed graphs and have no cycles (i.e. directed acyclic graph = DAG), then you might consider transitive closure methods in a relational database. They allow very fast queries through many levels of nesting and find multiple intersections. They have an n-square problem, so there are many rows, but with bigint indexes, queries are fast.



All Articles