Optional merge relationship

I'm new to Cypher and I'm trying to learn it through a small project I'm setting up.

I have the following data model:

For each Thought I create, I connect Tags through Categories. A Category serves only as an intermediate link between Tags and Thoughts; this is done to improve queries, avoid duplicate Tags, and reduce the number of links between objects.

To prevent a new Tag from being created with the same value, I came up with the following query:

CREATE (t: Thought {moment:timestamp(), message:'Testing new Thought'})
MERGE (t1: Tag{value: 'work'})
MERGE (t2: Tag{value: 'tasks'})
MERGE (t3: Tag{value: 'administration'})
MERGE (c: Category)
MERGE (t1)<-[u:CONSISTS_OF{index:0}]-(c)
MERGE (t2)<-[v:CONSISTS_OF{index:1}]-(c)
MERGE (t3)<-[w:CONSISTS_OF{index:2}]-(c)
MERGE (t)-[x:CATEGORIZED_AS{index: 0}]->(c)

      

This works great, except for one thing: the Thought gets connected to every Category that has ever been created. I understand why: I don't define any restrictions in the MERGE query.

However, I don't know how to apply constraints to the CATEGORIZED_AS relationship.

I tried adding this to the bottom of the query, but it doesn't work:

WHERE (t)-[x]->(c)

      

Any idea how to apply the constraint as I need it in my case?

EDIT:

I forgot to mention what makes a Category unique: a Category connects to a fixed set of Tags in a specific order.

For example, I have three tags:

  • work
  • tasks
  • administration

The only way a Category matches the Thought is if the Category has the following relationships with the Tags:

  • (work)<-[:CONSISTS_OF {index: 0}]-(category)
  • (tasks)<-[:CONSISTS_OF {index: 1}]-(category)
  • (administration)<-[:CONSISTS_OF {index: 2}]-(category)

Any other order of relationships is invalid, and a new Category must be created.



2 answers


Problem: using MERGE

MERGE will try to find the pattern in the graph. If it finds the pattern, it returns it; otherwise it creates the whole pattern. This happens individually for each MERGE clause. So this works fine and as expected for the (n:Tag) nodes, since you only want one tag node per word in the graph, but the problem is with the later clauses in your query, where you try to merge the category.

What you want to do is find the one (c:Category) that is related to these three (t:Tag) nodes with these r.index properties on the (:Tag)-[r:CONSISTS_OF]-() relationships. However, you are running four MERGE clauses that do the following:

MERGE (c: Category)

      

Find or create any node c labeled Category.

MERGE (t1)<-[u:CONSISTS_OF{index:0}]-(c)
MERGE (t2)<-[v:CONSISTS_OF{index:1}]-(c)
MERGE (t3)<-[w:CONSISTS_OF{index:2}]-(c)

      

Find or create a relationship between this node and t1, then t2, then t3.

If you were to run this query, then change one of the tags to something else such as "rest" and run the query again, you would expect a new category to appear. But it won't with the current query: it will just create the new tag, then find the existing (c:Category) node in that first MERGE clause and create a relationship between it and the new tag. So instead of having two categories, each associated with three tags (with two of the tags shared between them), you will have four tags in total, all associated with one category, with duplicate indices on your relationships.
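
One way to see this effect on your own data is a small diagnostic query. This is only a sketch that assumes the labels, relationship type and properties from the question (Category, Tag, CONSISTS_OF, index, value); the returned column names are made up for illustration.

```cypher
// For each category, list the attached tags and the index values used.
MATCH (c:Category)-[r:CONSISTS_OF]->(tag:Tag)
WITH c, r, tag
ORDER BY r.index
RETURN c,
       count(tag) AS tag_count,
       collect(r.index) AS indices,
       collect(tag.value) AS tag_values
```

A category that has absorbed tags from several runs would show up with more than three tags and repeated index values.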

So what you really want is to use MERGE to find a complex pattern like the one below.

MERGE (t1)<-[:CONSISTS_OF {index:0}]-(c:Category)-[:CONSISTS_OF {index:1}]->(t2),
  (t3)<-[:CONSISTS_OF {index:2}]-(c)

      

Unfortunately, this will give you a syntax error, as Cypher cannot currently merge complex patterns like this one. So here's the creative bit.



Solution 1: Conditional execution with CASE and FOREACH (easy)

This is quite handy for these situations; see the comments in the query below. You will essentially split the merge: use OPTIONAL MATCH to try to find the pattern, and then use a little trick in the Cypher syntax to CREATE the pattern if we don't find it.

CREATE (t:Thought {moment: timestamp(), message: 'Testing new Thought'})
MERGE (t1:Tag {value: 'work'})
MERGE (t2:Tag {value: 'abayo'})
MERGE (t3:Tag {value: 'rest'})
WITH *
// we can't merge this category because it is a complex pattern
// so, can we find it in the db?
OPTIONAL MATCH (t1)<-[:CONSISTS_OF {index:0}]-(c:Category)-[:CONSISTS_OF {index:1}]->(t2),
  (t3)<-[:CONSISTS_OF {index:2}]-(c)
// the CASE here works in conjunction with the FOREACH to
// conditionally execute the CREATE clause
// (use "c IS NULL": the simple form "CASE c WHEN NULL" compares with =,
// and an equality test against null never matches)
WITH t, t1, t2, t3, c, CASE WHEN c IS NULL THEN [1] ELSE [] END AS make_cat
FOREACH (i IN make_cat |
  // if no such category exists, this code will run as c is null
  // if a category does exist, c will not be null, and so this won't run
  CREATE (t1)<-[:CONSISTS_OF {index:0}]-(new_cat:Category)-[:CONSISTS_OF {index:1}]->(t2),
    (t3)<-[:CONSISTS_OF {index:2}]-(new_cat)
)
// now we're not sure if we're referring to new_cat or c
// remove variable c from scope
WITH t, t1, t2, t3
// and now match it, we know for sure now we'll find it
// (keep the index properties so a category with a different tag order is not matched)
// alternatively, use conditional execution again here
MATCH (t1)<-[:CONSISTS_OF {index:0}]-(c:Category)-[:CONSISTS_OF {index:1}]->(t2),
  (t3)<-[:CONSISTS_OF {index:2}]-(c)
// now we have the category, we definitely want
// to create the relationship between the thought and the category
CREATE (t)-[:CATEGORIZED_AS]->(c)
RETURN *
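
To check that the find-or-create logic behaves as intended, you could run the query above more than once with the same three tags and then count the thoughts per category. This is a hedged sketch using only the labels and relationship types already shown; the column names are illustrative.

```cypher
// Each distinct ordered tag combination should end up as one category,
// with every thought that used that combination attached to it.
MATCH (c:Category)<-[:CATEGORIZED_AS]-(t:Thought)
RETURN c,
       count(t) AS thoughts,
       collect(t.message) AS messages
```

If repeated runs with identical tags produce several categories with one thought each, the OPTIONAL MATCH is not finding the existing category and the conditional CREATE is firing every time.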

      

Solution 2: Refactor your graph (hard)

I have not included the full query here, although I can on request. The alternative would be to refactor your graph so that the tags are attached to the category in a ring (or in a chain with an end marker), which lets you merge the whole pattern in one go without splitting it up.

Since the tags in a category are ordered, you can express the data as below in a single MERGE clause.

MERGE (c:Category)-[:CONSISTS_OF_TAG_SEQUENCE]->(t1)-[:NEXT_TAG_IN_SEQUENCE]->(t2)-[:NEXT_TAG_IN_SEQUENCE]->(t3)-[:NEXT_TAG_IN_SEQUENCE]->(c)

      

This may seem like a neat solution at first, but the problem is that tags will belong to multiple categories. Once tags are shared between categories, you need to do one of the following:

  • create a composite index to identify categories and store it as a property on the sequence relationships, so you know which relationship to follow along your path (i.e. you can always find one and only one tag sequence for a category)
  • still bind each tag to the categories it is in and query that pattern (so you can find that single path, as in solution 1)
  • use an intermediate node to achieve the same as the first two options
  • all of the above and more.

As you might have guessed, this will quickly make your query a lot harder than it needs to be. It might be fun to try and might be fine for some use cases, but for now I'll stick with the simple solution!



My solution to your problem is to ensure that each category has a unique, consistently reproducible identifier. In your case, add a field cid or id whose value is a string of the form tag1<_>tag2<_>tag3<_>. (<_> is used because the probability that it is part of a tag is practically zero. If _ is an invalid tag character, replacing <_> with _ would be just fine.)

This way you can locate a category node without knowing anything about the nodes it is connected to. Essentially, the unique id is your merge logic. It can even be built dynamically in Cypher using reduce. Usually I also keep a field value holding a printable form of the id.



When running the final Cypher, you should MERGE each node on its identifying id only, use SET for the properties that do not define the node, and then use CREATE UNIQUE to make sure there is one and only one relationship between the nodes.
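
As a rough illustration of this idea, the sketch below builds the identifier with reduce and merges the category on it. The tag list literal, the property name cid and the <_> separator are assumptions taken from the description above, not a tested schema. It uses MERGE for the relationships rather than CREATE UNIQUE, since CREATE UNIQUE has been removed from recent Neo4j versions.

```cypher
// Hypothetical ordered tag list; in practice this could be a query parameter.
WITH ['work', 'tasks', 'administration'] AS tags
// Build the reproducible id, e.g. 'work<_>tasks<_>administration<_>'
WITH tags, reduce(acc = '', x IN tags | acc + x + '<_>') AS cid
CREATE (t:Thought {moment: timestamp(), message: 'Testing new Thought'})
// The category is identified by cid alone, so the same ordered tag list
// always resolves to the same Category node.
MERGE (c:Category {cid: cid})
MERGE (t)-[:CATEGORIZED_AS]->(c)
WITH c, tags
UNWIND range(0, size(tags) - 1) AS i
MERGE (tag:Tag {value: tags[i]})
MERGE (c)-[:CONSISTS_OF {index: i}]->(tag)
```

Because the order of the tags is baked into cid, the same tags in a different order produce a different identifier and therefore a new category, which matches the requirement in the question's edit.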
