Neo4J - find nodes where related nodes are a subset

Question

I am very new to Neo4J and graphs.

I have a very simple graph in which each A node requires one or more B nodes.

Is there an efficient way to find the A nodes whose associated Bs are all contained in (that is, are a subset of) a given list of Bs?

For example, given this dataset:

typeA,rel,typeB

A1,REQUIRES,B1
A1,REQUIRES,B2
A1,REQUIRES,B3

A2,REQUIRES,B1
A2,REQUIRES,B4

A3,REQUIRES,B4

A4,REQUIRES,B5

I want to ask which of the As are fully covered by a given list of Bs.

Examples:

 given B1,B2,B3 -> A1
 given B1,B3,B4 -> A2, A3
 given B1,B3,B4,B5 -> A2, A3, A4

If the specified list of Bs does not contain all the Bs that an A is associated with, then that A should be excluded.

If there is an answer, will it scale to large numbers?

Thanks.

neo4j

prule 09 dec. 14 at 10:36

1 answer

cybersam · Answer 1 · 2014-12-13T08:52:16+0000

In this answer I am assuming that:

- Nodes are labeled :A and :B and have the property id.
  - For example, "A1" will be (:A {id: 1}).
- You are passing the collection of :B ids of interest in a parameter {ids}.
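To make these assumptions concrete, here is a minimal sketch of how the sample data from the question could be created. The integer id values are simply my reading of "A1", "B1" and so on; nothing else about the schema is implied.

```cypher
// Sample data from the question, modelled as (:A {id}) and (:B {id}).
// Assumption: "A1" maps to (:A {id: 1}), "B4" to (:B {id: 4}), etc.
CREATE (a1:A {id: 1}), (a2:A {id: 2}), (a3:A {id: 3}), (a4:A {id: 4}),
       (b1:B {id: 1}), (b2:B {id: 2}), (b3:B {id: 3}),
       (b4:B {id: 4}), (b5:B {id: 5}),
       (a1)-[:REQUIRES]->(b1),
       (a1)-[:REQUIRES]->(b2),
       (a1)-[:REQUIRES]->(b3),
       (a2)-[:REQUIRES]->(b1),
       (a2)-[:REQUIRES]->(b4),
       (a3)-[:REQUIRES]->(b4),
       (a4)-[:REQUIRES]->(b5)
```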

The following query should do what you want.

MATCH (a:A)-[:REQUIRES]->(b:B)
WHERE b.id IN {ids}
WITH DISTINCT a
MATCH (a)-[:REQUIRES]->(bb:B)
WITH a, COLLECT(bb) AS bbs
WHERE ALL(x IN bbs WHERE x.id IN {ids})
RETURN a.id

Here is a console that shows the results when the collection of :B ids of interest is [1, 3, 4, 5], which matches your last example. (Since the console does not support parameter passing, I hardcoded the id collection in the query.)
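For reference, a hard-coded variant of the query for that last example might look like the sketch below. It only swaps the {ids} parameter for a literal list. Against the question's sample data (with ids modelled as integers) it should return the ids 2, 3 and 4, in no guaranteed order.

```cypher
// Same query as above, with the id list inlined instead of passed as {ids}.
MATCH (a:A)-[:REQUIRES]->(b:B)
WHERE b.id IN [1, 3, 4, 5]
WITH DISTINCT a
MATCH (a)-[:REQUIRES]->(bb:B)
WITH a, COLLECT(bb) AS bbs
WHERE ALL(x IN bbs WHERE x.id IN [1, 3, 4, 5])
RETURN a.id
```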

Explanation of the query, in order:

- (First 2 lines) Find all :A nodes that require a :B node whose id is in the {ids} collection.
- Remove duplicate :A nodes, so that we get the distinct :A nodes (each of which requires one or more :B nodes of interest).
- Find all the :B nodes required by each of those :A nodes. (Some of these :B nodes may not be of interest.)
- Associate with each of those :A nodes the collection of all its required :B nodes.
- Filter out all :A nodes that require any :B node that is not of interest.
- Return the ids of the :A nodes that only require :B nodes of interest.
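The steps above can be mapped onto the query line by line. The sketch below is the same query with comments added. It uses the $ids parameter syntax, which newer Neo4j releases expect in place of {ids}; on the version this answer was written for, keep {ids}.

```cypher
// Find every :A that requires at least one :B of interest.
MATCH (a:A)-[:REQUIRES]->(b:B)
WHERE b.id IN $ids
// An :A matched through several :B nodes appears several times; de-duplicate.
WITH DISTINCT a
// Fetch all :B nodes each candidate requires, of interest or not.
MATCH (a)-[:REQUIRES]->(bb:B)
// Gather them into one collection per :A.
WITH a, COLLECT(bb) AS bbs
// Keep only the :A nodes whose every requirement is in the list.
WHERE ALL(x IN bbs WHERE x.id IN $ids)
RETURN a.id
```

The first MATCH narrows the candidates to :A nodes that touch the list at all, so the more expensive "collect everything" step runs only on those candidates.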

Assuming you create an index on :B(id), this query should scale.
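A minimal sketch of creating that index, assuming the label and property names used in this answer. The statement uses the syntax of the Neo4j 2.x/3.x releases; the comment shows the form that later releases use instead.

```cypher
// Index :B nodes by id so the first MATCH can start from an index lookup.
CREATE INDEX ON :B(id)

// Newer releases (4.x and later) use this form instead:
// CREATE INDEX FOR (b:B) ON (b.id)
```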
