Neo4J - find nodes where related nodes are a subset

I am very new to Neo4J and graphs.

I have a very simple graph in which each A node requires one or more B nodes.

Is there an efficient way to find the A nodes whose associated Bs are a subset of a given list of Bs?

For example, given this dataset:

typeA,rel,typeB

A1,REQUIRES,B1
A1,REQUIRES,B2
A1,REQUIRES,B3

A2,REQUIRES,B1
A2,REQUIRES,B4

A3,REQUIRES,B4

A4,REQUIRES,B5

      

I want to ask which of the As are fully covered by a given list of Bs.

Examples:

 given B1,B2,B3 -> A1
 given B1,B3,B4 -> A2, A3
 given B1,B3,B4,B5 -> A2, A3, A4

      

If the specified list of Bs does not contain all the Bs that an A is associated with, that A should be excluded.

If there is an answer, will it scale to large numbers?

Thanks.


1 answer


In this answer I am assuming that:

  • Nodes are labeled :A and :B and have an id property.
    • For example, "A1" would be (:A {id: 1}).
  • You are passing the collection of :B ids of interest in a parameter, {ids}.

The following query should do what you want.

MATCH (a:A)-[:REQUIRES]->(b:B)
WHERE b.id IN {ids}
WITH DISTINCT a
MATCH (a)-[:REQUIRES]->(bb:B)
WITH a, COLLECT(bb) AS bbs
WHERE ALL(x IN bbs WHERE x.id IN {ids})
RETURN a.id

      



Here is a console that shows the results when the collection of :B ids of interest is [1, 3, 4, 5], which matches your last example. (Since the console does not support parameter passing, I hardcoded the id collection in the query.)
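To try this without the console, the question's dataset can be loaded with something like the sketch below. It relies only on the assumptions stated above (labels :A and :B, an integer id property); the variable names are illustrative.

```cypher
// Sample data from the question, using the assumed labels and id property.
CREATE (a1:A {id: 1}), (a2:A {id: 2}), (a3:A {id: 3}), (a4:A {id: 4}),
       (b1:B {id: 1}), (b2:B {id: 2}), (b3:B {id: 3}), (b4:B {id: 4}), (b5:B {id: 5}),
       (a1)-[:REQUIRES]->(b1), (a1)-[:REQUIRES]->(b2), (a1)-[:REQUIRES]->(b3),
       (a2)-[:REQUIRES]->(b1), (a2)-[:REQUIRES]->(b4),
       (a3)-[:REQUIRES]->(b4),
       (a4)-[:REQUIRES]->(b5);

// Same query as above, with the id collection hardcoded instead of {ids}.
MATCH (a:A)-[:REQUIRES]->(b:B)
WHERE b.id IN [1, 3, 4, 5]
WITH DISTINCT a
MATCH (a)-[:REQUIRES]->(bb:B)
WITH a, COLLECT(bb) AS bbs
WHERE ALL(x IN bbs WHERE x.id IN [1, 3, 4, 5])
RETURN a.id;
```

Run these as two separate statements. Going by the question's last example, the second one should return the ids for A2, A3 and A4 (row order is not guaranteed).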

Explanation of the query, in order:

  • (First 2 lines) Find all :A nodes that require a :B node whose id is in the {ids} collection.
  • Remove duplicate :A nodes, so that we get distinct :A nodes (each of which requires one or more :B nodes of interest).
  • Find all the :B nodes required by each of those :A nodes. (Some of these :B nodes may not be of interest.)
  • Associate with each of those :A nodes the collection of all its required :B nodes.
  • Filter out all :A nodes that require :B nodes that are not of interest.
  • Return the ids of the :A nodes that require only :B nodes of interest.
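The steps above can be mapped onto the query with comments. The sketch below is the same logic with two cosmetic changes that are not in the original query: it uses the $ids parameter form, since Neo4j 4.0 and later no longer accept the {ids} form, and it collects the ids rather than the nodes (requiredIds is an illustrative name).

```cypher
// Steps 1-2: :A nodes that require at least one :B of interest, de-duplicated.
MATCH (a:A)-[:REQUIRES]->(b:B)
WHERE b.id IN $ids
WITH DISTINCT a
// Steps 3-4: collect the ids of every :B that each remaining :A requires.
MATCH (a)-[:REQUIRES]->(bb:B)
WITH a, collect(bb.id) AS requiredIds
// Step 5: keep only the :A nodes whose required ids all appear in $ids.
WHERE all(x IN requiredIds WHERE x IN $ids)
// Step 6: return the ids of the fully covered :A nodes.
RETURN a.id
```

The first MATCH exists only to narrow the candidate :A nodes before the more expensive collect step, so an :A that requires none of the Bs of interest is never expanded.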

Assuming you create an index on :B(id), this query should scale well.
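As a sketch, the index could be created as follows. The exact statement depends on the Neo4j version, so treat the choice between the two forms as an assumption to check against your installation.

```cypher
// Neo4j 3.x syntax (the same era as the {ids} parameter style used above):
CREATE INDEX ON :B(id);

// Neo4j 4.x and later use this form instead:
// CREATE INDEX FOR (b:B) ON (b.id);
```

With the index in place, the initial lookup of :B nodes by id can use an index seek rather than scanning every :B node.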
