Cassandra + MySQL

Hi, I'm new to Cassandra and a bit confused about the DB design for the scenario below.

I currently have 3 tables: Post, User, PostLike.

Post: stores the post information

User: stores the user information

PostLike:

CREATE TABLE PostLike (
    like_time timestamp,
    post_id bigint,
    user_id bigint,
    PRIMARY KEY (like_time,post_id,user_id)
);

      

like_time: used to keep the likes in time order. Cassandra provides this through the OrderPreservingPartitioner.

Requirements:

  • Get the ids of all users who liked a given post, ordered by like_time: select * from PostLike where post_id = ?

  • Get all posts liked by a given user: select * from PostLike where user_id = ? This one gives an error:

[Invalid query] message="PRIMARY KEY column "post_id" cannot be restricted (preceding column "ColumnDefinition{name=user_id, type=org.apache.cassandra.db.marshal.LongType, kind=CLUSTERING_COLUMN, componentIndex=0, indexName=null, indexType=null}" is either not restricted or by a non-EQ relation)"

Please suggest what I need to do here:

  • Use MySQL alongside Cassandra for this relationship

    OR

  • Create 2 separate tables in Cassandra:

CREATE TABLE PostLike (
    like_time timestamp,
    post_id bigint,
    PRIMARY KEY (like_time,post_id)
);

CREATE TABLE UserLike (
    like_time timestamp,
    user_id bigint,
    PRIMARY KEY (like_time,user_id)
);

      

or any other solution. Please help.

2 answers


First of all, you are getting this error because you restricted a later part of the primary key without restricting the parts before it. When querying Cassandra with a compound primary key, you cannot skip parts of the key. You can leave off parts from the end of the key (as in a query by partition key only, see below), but it won't work if you skip a part at the start or in the middle.
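As a rough illustration of that rule, here is a minimal Python sketch. It is a simplified model, not how Cassandra validates queries: it only handles equality restrictions and ignores secondary indexes, range restrictions, and ALLOW FILTERING. The function name `restricts_key_prefix` is made up for this example.

```python
def restricts_key_prefix(primary_key, restricted):
    """Return True if the restricted columns form a prefix of the primary key.

    primary_key: key columns in declaration order.
    restricted:  set of columns that appear as `col = ?` in the WHERE clause.
    Simplified model: equality restrictions only.
    """
    prefix = primary_key[:len(restricted)]
    return set(prefix) == set(restricted)


# Primary key from the question: PRIMARY KEY (like_time, post_id, user_id)
question_key = ["like_time", "post_id", "user_id"]

by_post = restricts_key_prefix(question_key, {"post_id"})                     # skips like_time
by_time = restricts_key_prefix(question_key, {"like_time"})                   # leaves off the end
by_time_and_user = restricts_key_prefix(question_key, {"like_time", "user_id"})  # skips post_id
```

In this model only `by_time` passes: the other two skip a key part that comes earlier in the declaration.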

Next, secondary indexes don't work the same way in Cassandra as they do in MySQL. In Cassandra, they are provided for convenience, not for performance. The cardinality of post_id and user_id will probably be too high for an index on them to be effective. Especially in a large cluster with millions of rows, query performance degrades significantly with a high-cardinality secondary index.

The correct way to solve this problem is to use the second option (as etherbunny recommended), but with the primary key columns reordered:

CREATE TABLE PostLike (
    like_date timestamp,
    post_id bigint,
    PRIMARY KEY (post_id,like_date)
);

CREATE TABLE UserLike (
    like_date timestamp,
    user_id bigint,
    PRIMARY KEY (user_id,like_date)
);

      



The first column of a Cassandra primary key is known as the partition key. This key determines in which token range your row will be stored.

The remaining columns of the primary key are known as clustering columns. Clustering columns determine the sort order on disk within a partition key.
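To make the token range idea concrete, here is a toy Python sketch of mapping a partition key to a node. Everything in it is an assumption for illustration: the three node names, the evenly split ring, and the use of MD5 as the hash (Cassandra's default partitioner uses a different hash function and token space).

```python
import bisect
import hashlib


def token_for(partition_key):
    """Hash a partition key to a 64-bit token (toy hash, for illustration only)."""
    digest = hashlib.md5(str(partition_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


# Hypothetical ring: each node owns the tokens up to and including its own token.
RING = [
    (2**64 // 3, "node-a"),
    (2 * 2**64 // 3, "node-b"),
    (2**64 - 1, "node-c"),
]
TOKENS = [token for token, _ in RING]


def owner_for(partition_key):
    """Return the node whose token range contains the key's token."""
    index = bisect.bisect_left(TOKENS, token_for(partition_key))
    return RING[index % len(RING)][1]


# Every row with the same partition key (e.g. user_id) lands on the same node.
owner = owner_for(34574398)
```

The point of the sketch is only that placement depends on the partition key alone, so all rows sharing a partition key are stored together.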

This last part is important because it (the clustering order, as well as the ORDER BY keyword) behaves very differently than in MySQL or any other RDBMS. If you run SELECT * FROM UserLike WHERE user_id=34574398 ORDER BY like_date, you should see the rows for that user_id, ordered by like_date. In fact, even without the ORDER BY clause, they should still come back sorted by like_date. However, if you run SELECT * FROM UserLike ORDER BY like_date, your data will not be sorted in the expected order, because ordering only works when you specify the partition key.
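The ordering behaviour can be mimicked with a small in-memory model in Python. This is only a sketch: each "table" is a dict from partition key to a list kept sorted by like_date, and it assumes each table also stores the other id as a value, which the CREATE TABLE statements above do not show. The helper `record_like` and the sample values are made up.

```python
import bisect
from collections import defaultdict

# Toy stand-ins for the two tables: partition key -> rows sorted by like_date.
post_like = defaultdict(list)  # post_id -> [(like_date, user_id), ...]
user_like = defaultdict(list)  # user_id -> [(like_date, post_id), ...]


def record_like(user_id, post_id, like_date):
    """Write the same like to both tables, keeping each partition sorted."""
    bisect.insort(post_like[post_id], (like_date, user_id))
    bisect.insort(user_like[user_id], (like_date, post_id))


record_like(user_id=2, post_id=10, like_date=5)
record_like(user_id=1, post_id=10, like_date=3)
record_like(user_id=2, post_id=11, like_date=1)

# Query by partition key: rows come back in like_date order.
likes_of_user_2 = user_like[2]
users_who_liked_10 = [user_id for _, user_id in post_like[10]]

# Scan without a partition key: partitions are visited one after another,
# so the dates are sorted inside each partition but not across them.
full_scan_dates = [like_date for rows in user_like.values() for like_date, _ in rows]
```

With this sample data, the rows for user 2 are in date order, while the scan across all partitions is not.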


The error below goes away if I create indexes:

CREATE INDEX post_id_PostLike_indx ON PostLike (post_id);
CREATE INDEX user_id_PostLike_indx ON PostLike (user_id);

      



[Invalid query] message="PRIMARY KEY column "post_id" cannot be restricted (preceding column "ColumnDefinition{name=user_id, type=org.apache.cassandra.db.marshal.LongType, kind=CLUSTERING_COLUMN, componentIndex=0, indexName=null, indexType=null}" is either not restricted or by a non-EQ relation)"
