Limiting nodes per label
I have a graph that currently has a few thousand nodes, each with between two and ten relationships. If we look at one node and its connections, they look something like this:
Nodes named with letters are category nodes. All other nodes are content nodes associated with these category nodes, and their color indicates which label is attached to them. For simplicity, each node has one label, and each node connects to only one other node:
- Blue: categories
- Green: scientific publications
- Orange: general articles
- Purple: blog posts
Now the simplest thing I am trying to do is get a certain number of related content nodes for a given node. The following returns all twenty related nodes:
START n = node(1)
MATCH (n)-->(category)<--(m)
RETURN m
However, I would like to limit this to 2 nodes per label for each category (and then play with the ordering of nodes that have multiple categories overlapping with the starting node).
I currently do this by fetching the results of the above query and then manually iterating over them, but that seems like overkill to me.
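For reference, the manual iteration could look roughly like the sketch below. It assumes the query results have already been fetched as (category, label, node) tuples; the function name limit_per_label and the sample rows are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def limit_per_label(rows, per_group=2):
    """Keep at most `per_group` nodes for each (category, label) pair.

    `rows` is an iterable of (category, label, node) tuples, a
    hypothetical shape for the fetched query results.
    """
    seen = defaultdict(int)  # how many nodes were kept per (category, label)
    kept = []
    for category, label, node in rows:
        key = (category, label)
        if seen[key] < per_group:
            seen[key] += 1
            kept.append((category, label, node))
    return kept


# Made-up sample data: node ids stand in for the content nodes.
rows = [
    ("A", "Publication", 11),
    ("A", "Publication", 12),
    ("A", "Publication", 13),  # third publication in category A, dropped
    ("A", "Blog", 21),
    ("B", "Publication", 31),
]
limited = limit_per_label(rows)
```

This works, but it pulls every related node over the wire before discarding most of them, which is why doing the limiting inside the query would be preferable.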
Is there a way to do this with the Neo4j Cypher query language?
This answer expands on @Stefan's original answer to return results for all categories, not just one of them.
START p = node(1)
MATCH (p)-->(category)<--(m)
WITH category, labels(m) as label, collect(m)[0..2] as nodes
UNWIND label as lbl
UNWIND nodes AS n
RETURN category, lbl, n
To make it easier to check the results by hand, you can also add this line at the end to sort them. (This sorting probably shouldn't be in your final code unless you really want sorted results and are willing to spend the extra computation time.)
ORDER BY id(category), lbl
Cypher has a labels function that returns an array with all the labels for a given node. Assuming you only have one label per m node, the following approach might work:
START n = node(1)
MATCH (n)-->(category)<--(m)
WITH labels(m)[0] as label, collect(m)[0..2] as nodes
UNWIND nodes as n
RETURN n
The WITH clause builds a separate collection of all nodes that share the same label. With the slice operator [0..2], each collection keeps only its first two elements. UNWIND then converts the collections back into separate rows for the result. From there you can apply further steps such as ordering.
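To make the three stages concrete, here is a small Python model of what the query does: group by label, slice each group, then flatten. It is only an analogy under assumed data; the function name first_two_per_label and the (label, node) tuples are hypothetical, and Cypher does not guarantee the group or row order that a Python dict happens to preserve.

```python
def first_two_per_label(matches):
    """Model the group / slice / flatten stages of the query.

    `matches` is a list of (label, node) pairs, a hypothetical
    stand-in for the rows produced by the MATCH clause.
    """
    # Stage 1, like WITH label, collect(m): one list of nodes per label.
    groups = {}
    for label, node in matches:
        groups.setdefault(label, []).append(node)

    # Stage 2, like [0..2]: keep indices 0 and 1 (the end index is excluded).
    sliced = {label: nodes[0:2] for label, nodes in groups.items()}

    # Stage 3, like UNWIND: turn the per-label lists back into single rows.
    return [node for nodes in sliced.values() for node in nodes]


# Made-up sample data: three publications, one blog post, one article.
matches = [
    ("Publication", 11),
    ("Blog", 21),
    ("Publication", 12),
    ("Publication", 13),
    ("Article", 41),
]
result = first_two_per_label(matches)
```

Note that this groups by label only, so the limit of two applies across all categories combined; adding the category to the grouping key is what makes the limit apply per category.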