How can I model multiple many-many relationships in Cassandra?

I read in Cassandra, I did some tutorials and played with CQL, but now that it was time for me to design the schema, I was having difficulties.

I am trying to create a schema that will handle the following use case. I need to keep track of the workers who attend the meetings and the topics they discuss in those meetings. Thus, a meeting can have multiple employees, each meeting discusses multiple topics, and each worker can create multiple topics. These are the data fields:

Worker : Worker ID, Worker Name

Meeting : meeting ID, meeting name, meeting time

Topic : topic id, topic name, creator

I need queries to see:

  • who is attending the meeting?
  • What meetings have the employee attended in the past?
  • what topics did the worker create?
  • what meetings discussed a specific topic?

So how should the circuit look like with this? I feel like it shouldn't be that hard, but I can't figure out when I start making tables.

+3


source to share


1 answer


It's important to remember that Cassandra data modeling is a query driven exercise. Since you have four queries to complete above, you can create four tables, one for each query you require.

I want you to be able to learn, so I won't do everything for you. But this is how I would solve queries # 1 and # 2. For # 1, I would create a table like this:

CREATE TABLE meetingAttendance (
  meetingID uuid,
  meetingName text,
  meetingTime timestamp,
  workerID uuid,
  workerName text,
  PRIMARY KEY ((meetingID),workerName));

      

I will go with meetingID

as the partition key and I will put on workerName

so they come back in order.

For query # 2, I'll create a query table like this:

CREATE TABLE meetingsByWorker (
  workerID uuid,
  workerName text,
  meetingID uuid,
  meetingName text,
  meetingTime timestamp,
  topicID uuid,
  topicName text,
  PRIMARY KEY ((workerID),meetingTime))
WITH CLUSTERING ORDER BY (meetingtime DESC);

      



When we request meetings that a particular employee attended, I break into workerID

. Since meetings are time-based, it makes sense to sort them by meetingTime

. By default they sort the ASC

final order, but the historical data usually makes sense to look at the end DESC

, so I define a specific CLUSTERING ORDER and sort ( DESC

).

After INSERTING some rows in both tables, I can query the attendance for a specific meeting like this:

aploetz@cqlsh:stackoverflow2> SELECT * FROM meetingattendance 
    WHERE meetingid=031e457b-2660-448b-a1d5-68c6cce3a820;

 meetingid                            | workername    | meetingname        | meetingtime              | workerid
--------------------------------------+---------------+--------------------+--------------------------+--------------------------------------
 031e457b-2660-448b-a1d5-68c6cce3a820 |         David | Project Prometheus | 2093-12-25 08:08:00-0600 | b83cbec4-95e5-4457-b037-c28c51d00418
 031e457b-2660-448b-a1d5-68c6cce3a820 | Holloway, Dr. | Project Prometheus | 2093-12-25 08:08:00-0600 | d28b4ee8-b1b9-401a-88d4-bc6b9727d712
 031e457b-2660-448b-a1d5-68c6cce3a820 |  Janek, Capt. | Project Prometheus | 2093-12-25 08:08:00-0600 | ebccf3ba-c1d2-4503-b717-897c7e89d968
 031e457b-2660-448b-a1d5-68c6cce3a820 |     Shaw, Dr. | Project Prometheus | 2093-12-25 08:08:00-0600 | c0e3e560-2332-4a46-9fdf-68bdb31abcb2
 031e457b-2660-448b-a1d5-68c6cce3a820 |       Vickers | Project Prometheus | 2093-12-25 08:08:00-0600 | 77cb9f64-3cb8-43f9-ab0c-b907b01c4404

(5 rows)
aploetz@cqlsh:stackoverflow2> SELECT * FROM meetingattendance
    WHERE meetingid=c7cea773-4c99-445f-928d-5b8a511c843b;

 meetingid                            | workername | meetingname      | meetingtime              | workerid
--------------------------------------+------------+------------------+--------------------------+--------------------------------------
 c7cea773-4c99-445f-928d-5b8a511c843b |      David | Wake Mr. Weyland | 2093-12-29 13:01:00-0600 | b83cbec4-95e5-4457-b037-c28c51d00418
 c7cea773-4c99-445f-928d-5b8a511c843b |  Ford, Dr. | Wake Mr. Weyland | 2093-12-29 13:01:00-0600 | 939657c2-e0cb-4a61-87d8-2a1739161d2a
 c7cea773-4c99-445f-928d-5b8a511c843b |    Vickers | Wake Mr. Weyland | 2093-12-29 13:01:00-0600 | 77cb9f64-3cb8-43f9-ab0c-b907b01c4404
 c7cea773-4c99-445f-928d-5b8a511c843b |    Weyland | Wake Mr. Weyland | 2093-12-29 13:01:00-0600 | 306955b8-c7ee-4350-8aa4-4c5d64487d74

(4 rows)

      

Now, if I want to see what the meeting visited a particular work, I can ask this workerID

:

aploetz@cqlsh:stackoverflow2> SELECT workername, meetingtime, meetingid, meetingname
    FROM meetingsbyworker WHERE workerid=77cb9f64-3cb8-43f9-ab0c-b907b01c4404;

 workername | meetingtime              | meetingid                            | meetingname
------------+--------------------------+--------------------------------------+--------------------
    Vickers | 2093-12-29 13:01:00-0600 | c7cea773-4c99-445f-928d-5b8a511c843b |   Wake Mr. Weyland
    Vickers | 2093-12-26 18:22:00-0600 | 3ea1282b-a465-4626-bd76-c65dd17b9f26 |   Head Examination
    Vickers | 2093-12-25 08:08:00-0600 | 031e457b-2660-448b-a1d5-68c6cce3a820 | Project Prometheus

(3 rows)
aploetz@cqlsh:stackoverflow2> SELECT workername, meetingtime, meetingid, meetingname
    FROM meetingsbyworker WHERE workerid=939657c2-e0cb-4a61-87d8-2a1739161d2a;

 workername | meetingtime              | meetingid                            | meetingname
------------+--------------------------+--------------------------------------+------------------
  Ford, Dr. | 2093-12-29 13:01:00-0600 | c7cea773-4c99-445f-928d-5b8a511c843b | Wake Mr. Weyland
  Ford, Dr. | 2093-12-26 18:22:00-0600 | 3ea1282b-a465-4626-bd76-c65dd17b9f26 | Head Examination

(2 rows)

      

Note that the data has been denormalized and that some of the column values ​​look redundant. If you decide that you still need entity tables for things like worker, that's fine too. But ask yourself again how often and exactly how you plan to query these tables. The last two solutions should be easily resolved using a similar approach.

+4


source







All Articles