Neo4j query profiling: db hits for Filter

I am curious how filters work in Neo4j queries. They result in db hits (according to PROFILE), and it seems that they shouldn't.

Example query:

PROFILE MATCH (a:act)<-[r:relationship]-(n)
WHERE a.chapter='13' and a.year='2009'
RETURN r, n

      

  • NodeIndexSeek (I created an index on the label act for the property chapter): returns 6 rows.
  • Filter: a.year == {AUTOSTRING1}, which results in 12 db hits.

Why would it need any db hits if it has already fetched the 6 matching instances of a in earlier db reads? Shouldn't it just filter them without going back to do more db reads?

I realize that I am equating "db hits" with "db reads" here, which may not be accurate. If it isn't, what exactly are "db hits"?

Finally, the number of db hits caused by the filter seems to roughly match:

<number of filtering elements> * 2 * <number of already queried nodes to filter on>

      

where "number of filtering elements" is the number of predicates in the WHERE clause; for example,

WHERE a.year='2009' and a.property_x='thing'

has two elements.

Thanks for any help.

EDIT: Here are the PROFILE and EXPLAIN results for the query. This is just an example query. I found that the relationship

filter db hits = <number of filtering elements> * 2 * <number of already queried nodes to filter on>

holds in general for the queries I have run.

PROFILE MATCH (a:act)<-[r:CHILD_OF]-(n) WHERE a.chapter = '13' AND a.year = '2009' RETURN r, n

8 rows
55 ms

Compiler CYPHER 2.2

Planner COST

Projection
  |
  +Expand(All)
    |
    +Filter
      |
      +NodeIndexSeek

+---------------+---------------+------+--------+-------------+---------------------------+
|      Operator | EstimatedRows | Rows | DbHits | Identifiers |                     Other |
+---------------+---------------+------+--------+-------------+---------------------------+
|    Projection |             1 |    8 |      0 |     a, n, r |                      r; n |
|   Expand(All) |             1 |    8 |      9 |     a, n, r |     (a)<-[r:CHILD_OF]-(n) |
|        Filter |             0 |    1 |     12 |           a | a.year == {  AUTOSTRING1} |
| NodeIndexSeek |             1 |    6 |      7 |           a |             :act(chapter) |
+---------------+---------------+------+--------+-------------+---------------------------+

Total database accesses: 28
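
As a quick arithmetic check of the numbers in the PROFILE table above, here is a small sketch. The variable names are my own, and the formula is only the empirical pattern I described, not a documented Neo4j rule.

```python
# DbHits per operator, copied from the PROFILE table above.
db_hits = {"Projection": 0, "Expand(All)": 9, "Filter": 12, "NodeIndexSeek": 7}

# Empirical pattern from this question (an observation, not a documented rule):
# filter hits = predicates * 2 * rows entering the Filter operator.
predicates = 1         # a.year = '2009'
rows_into_filter = 6   # rows produced by NodeIndexSeek
predicted_filter_hits = predicates * 2 * rows_into_filter

# The per-operator hits should add up to the reported total.
total_hits = sum(db_hits.values())

print(predicted_filter_hits, total_hits)
```

The predicted Filter value agrees with the table, and the per-operator values sum to the reported total of 28.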

      

EXPLAIN MATCH (a:act)<-[r:CHILD_OF]-(n) WHERE a.chapter = '13' AND a.year = '2009' RETURN r, n

4 ms

Compiler CYPHER 2.2

Planner COST

Projection
  |
  +Expand(All)
    |
    +Filter
      |
      +NodeIndexSeek

+---------------+---------------+-------------+---------------------------+
|      Operator | EstimatedRows | Identifiers |                     Other |
+---------------+---------------+-------------+---------------------------+
|    Projection |             1 |     a, n, r |                      r; n |
|   Expand(All) |             1 |     a, n, r |     (a)<-[r:CHILD_OF]-(n) |
|        Filter |             0 |           a | a.year == {  AUTOSTRING1} |
| NodeIndexSeek |             1 |           a |             :act(chapter) |
+---------------+---------------+-------------+---------------------------+

Total database accesses: ?

      



1 answer


Because reading a node (record) and reading a property (record) are not the same db operation.

You are right, though, that the Filter should need at most 6 db hits. Neo4j usually pulls filters and predicates to the earliest possible point in the plan, so it should filter directly after the index lookup.



In some situations (depending on the predicate) it can only filter after finding the paths; in that case the number of db hits can equal the number of paths checked.
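
To illustrate the first point (node reads and property reads being counted separately), here is a toy model in Python. It is not Neo4j's storage code: the `CountingStore` class, the one-hit-per-record costs, and the sample year values are all assumptions made up for illustration. It only shows one way a Filter over 6 rows could end up at 12 hits.

```python
class CountingStore:
    """Toy record store that counts each record access as one 'db hit'."""

    def __init__(self, nodes):
        self.nodes = nodes  # node id -> dict of properties
        self.hits = 0

    def read_node(self, node_id):
        self.hits += 1      # assumption: one hit for the node record
        return node_id

    def read_property(self, node_id, key):
        self.hits += 1      # assumption: one hit for the property record
        return self.nodes[node_id].get(key)


def filter_nodes(store, node_ids, key, value):
    """Keep the nodes whose property `key` equals `value`."""
    kept = []
    for node_id in node_ids:
        store.read_node(node_id)
        if store.read_property(node_id, key) == value:
            kept.append(node_id)
    return kept


# Six hypothetical nodes, standing in for the rows from the index seek on chapter.
nodes = {i: {"chapter": "13", "year": "2009" if i == 0 else "2008"} for i in range(6)}
store = CountingStore(nodes)
matching = filter_nodes(store, list(nodes), "year", "2009")

print(len(matching), store.hits)
```

Under these assumptions the filter keeps 1 row and counts 12 hits, the same shape as the Filter line in your PROFILE output. Whether Neo4j 2.2 really splits the cost this way is something the full plan and version would help confirm.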

Which version of Neo4j are you using? Can you share your full query plan?
