Cypher - Neo4j Query Profiling

I have some questions about Neo4j query profiling. Consider the simple Cypher query below:

PROFILE 
MATCH (n:Consumer {mobileNumber: "yyyyyyyyy"}),
      (m:Consumer {mobileNumber: "xxxxxxxxxxx"}) 
WITH n,m 
MATCH (n)-[r:HAS_CONTACT]->(m) 
RETURN n,m,r;

      

and the output is:

[Image: PROFILE output of the query above]

So, according to Neo4j Documentation :

3.7.2.2. Expand Into

When both the start and end node have already been found, Expand Into is used to find all connecting relationships between the two nodes.

Query.

MATCH (p:Person { name: 'me' })-[:FRIENDS_WITH]->(fof)-->(p) RETURN fof

      

So in my query I would expect the planner to find both the start node and the end node first, and only then look for a relationship between them. Instead, it finds only the start node and then expands all of its :HAS_CONTACT relationships, so the Expand Into operator is not used. Why does it work like this? There is only one :HAS_CONTACT relationship between the two nodes, and there is no unique index constraint on :Consumer{mobileNumber}. Why does the query above expand all 7 relationships?

Another question is about the Filter operator: why does it need 12 db hits even though all the nodes and relationships have already been retrieved? Why does this operation need 12 db hits for only 6 rows?

Edited

This is the complete graph I'm asking about: Graph data

I also tested different versions of the same query, but the same query profile result is returned:

1

PROFILE
 MATCH (n:Consumer{mobileNumber: "yyyyyyyyy"})
 MATCH (m:Consumer{mobileNumber: "xxxxxxxxxxx"}) 
 WITH n,m 
 MATCH (n)-[r:HAS_CONTACT]->(m) 
 RETURN n,m,r;

      

2

PROFILE
 MATCH (n:Consumer{mobileNumber: "yyyyyyyyy"}), (m:Consumer{mobileNumber: "xxxxxxxxxxx"}) 
 WITH n,m 
 MATCH (n)-[r:HAS_CONTACT]->(m) 
 RETURN n,m,r;

      

3

PROFILE 
MATCH (n:Consumer{mobileNumber: "yyyyyyyyy"}) 
WITH n 
MATCH (n)-[r:HAS_CONTACT]->(m:Consumer{mobileNumber: "xxxxxxxxxxx"}) 
RETURN n,m,r;

      

+3




2 answers


The query you are running and the example given in the Neo4j documentation for Expand Into are not the same. The example query starts and ends at the same node.

If you want the planner to find both nodes first and then check whether there is a relationship between them, you can use shortestPath with a length of 1 to minimize db hits.



PROFILE 
MATCH (n:Consumer {mobileNumber: "yyyyyyyyy"}),
  (m:Consumer {mobileNumber: "xxxxxxxxxxx"}) 
WITH n,m 
MATCH Path=shortestPath((n)-[r:HAS_CONTACT*1]->(m))
RETURN n,m,r;

      

+2




Why is this being done?

It looks like this behavior comes from the way the query planner searches the database in response to your Cypher query. Cypher provides an interface for searching and performing operations on the graph (alternatives include the Java API, etc.). Queries are handled by the query planner and then turned into graph operations by Neo4j's internals. It makes sense that the query planner will look for what is likely to be the most efficient way to search the graph (which is why we love Neo), so just because a Cypher query is written one way, it will not necessarily search the graph the way we imagine it in our head.

The documentation on this point is a little sparse (or rather, I couldn't find it); any links or further explanations would be much appreciated.

Looking at your query, I think you are trying to say this:

"Find two nodes n and m, each labeled :Consumer, with mobile numbers x and y respectively, using the index on mobileNumber. If you find them, try to find a -[:HAS_CONTACT]-> relationship from n to m. If you find the relationship, return both nodes and the relationship; otherwise return nothing."

Running the query this way requires creating a Cartesian product (i.e. a small table of all combinations of n and m; in this case only one row, but potentially many more for other queries) and then searching for relationships for each of those rows.
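To make the size of that table concrete, here is a minimal Python sketch. It is not Neo4j code: the node names and the `cartesian_rows` helper are made up for illustration, and it only shows how the number of (n, m) rows grows when each lookup matches more than one node.

```python
from itertools import product


def cartesian_rows(n_matches, m_matches):
    """Pair every candidate for n with every candidate for m."""
    return list(product(n_matches, m_matches))


# One node per lookup, as in the question: a single (n, m) row.
single = cartesian_rows(["n1"], ["m1"])

# Hypothetical case where each lookup matches three nodes: 3 * 3 = 9 rows,
# and the relationship check would then have to run once per row.
many = cartesian_rows(["n1", "n2", "n3"], ["m1", "m2", "m3"])

print(len(single), len(many))
```

With one match on each side the table is trivial, which is why the Cartesian product is cheap for this particular query but not in general.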

Instead, since a MATCH clause must be satisfied for the query to continue, Neo knows that the two nodes n and m must be connected through a -[:HAS_CONTACT]-> relationship if the query is to return anything. So the most efficient way to run the query (and avoid the Cartesian product) is the one below, which is what your query gets simplified to.

"Find a node n with the label :Consumer and value x for the index on mobileNumber that is connected through a -[:HAS_CONTACT]-> relationship to a node m with the label :Consumer and value y for its property mobileNumber. Return both nodes and the relationship; otherwise return nothing."



So instead of performing two index lookups, a Cartesian product and a set of expand-into operations, Neo performs only one index lookup, an expand-all, and a filter.
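The two strategies can be compared with a small toy model in Python. This is only a sketch under stated assumptions: the graph, the extra nodes "a" and "b", and both function names are invented for illustration, and the counts are intermediate rows in this toy, not Neo4j's real db-hit accounting.

```python
# Toy in-memory graph; the extra nodes and their numbers are made up.
mobile = {"n": "yyyyyyyyy", "m": "xxxxxxxxxxx", "a": "111", "b": "222"}
# Outgoing HAS_CONTACT relationships, keyed by start node.
has_contact = {"n": ["m", "a", "b"], "m": ["a"], "a": [], "b": []}
# Stand-in for an index on mobileNumber.
index = {number: node for node, number in mobile.items()}


def expand_all_then_filter(start_number, end_number):
    """One index lookup, expand every relationship of n, then filter on m."""
    n = index[start_number]
    expanded = [(n, end) for end in has_contact[n]]
    matches = [(s, e) for s, e in expanded if mobile[e] == end_number]
    return matches, len(expanded)  # rows produced before filtering


def lookup_both_then_expand_into(start_number, end_number):
    """Two index lookups, then look only for relationships from n to m."""
    n = index[start_number]
    m = index[end_number]
    pair_rows = [(n, m)]  # the one-row Cartesian product
    matches = [(s, e) for s, e in pair_rows if e in has_contact[s]]
    return matches, len(pair_rows)


filtered, rows_all = expand_all_then_filter("yyyyyyyyy", "xxxxxxxxxxx")
into, rows_into = lookup_both_then_expand_into("yyyyyyyyy", "xxxxxxxxxxx")
print(filtered, rows_all)
print(into, rows_into)
```

Both functions return the same match; they differ only in how many intermediate rows they build on the way, which is the trade-off the planner is weighing.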

You can see the effect of this simplification by the query planner in the AUTOSTRING parameters in your query profile.

How to change the query to implement the search as desired

If you want to change the query so that it has to use an expand-into for the relationship, make the relationship optional, or use explicit iteration. Both of the queries below give the originally expected query profile.

Optional example:

PROFILE
 MATCH (n:Consumer{mobileNumber: "xxx"})
 MATCH (m:Consumer{mobileNumber: "yyy"}) 
 WITH n,m 
 OPTIONAL MATCH (n)-[r:HAS_CONTACT]->(m) 
 RETURN n,m,r;
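Keep in mind that OPTIONAL MATCH changes the result as well as the plan: when no relationship exists, the row is still returned with r set to null, whereas a plain MATCH drops the row. A rough Python sketch of that difference follows; the helper names and the second pair ("n", "q") are hypothetical and only there to show an unconnected pair.

```python
def match_rows(pairs, related):
    """Like MATCH: keep only the (n, m) pairs that have a relationship."""
    return [(n, m, related[(n, m)]) for n, m in pairs if (n, m) in related]


def optional_match_rows(pairs, related):
    """Like OPTIONAL MATCH: keep every pair, with None when nothing matches."""
    return [(n, m, related.get((n, m))) for n, m in pairs]


pairs = [("n", "m"), ("n", "q")]        # hypothetical (n, m) rows
related = {("n", "m"): "HAS_CONTACT"}   # only n -> m is connected

print(match_rows(pairs, related))
print(optional_match_rows(pairs, related))
```

If you only want connected pairs back, you would need to filter out the rows where r is null afterwards.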

      

Iterative example:

PROFILE
 MATCH (n1:Consumer{mobileNumber: "xxx"})
 MATCH (m:Consumer{mobileNumber: "yyy"}) 
 UNWIND COLLECT(n1) AS n
 MATCH (n)-[r:HAS_CONTACT]->(m) 
 RETURN n,m,r;

      

+2

