Wide rows vs. collections in Cassandra

I am trying to model a many-to-many relationship in Cassandra, something like an Item-User relationship. A user can like many items, and an item can be bought by many users. Let's also assume that the order in which the "like" events occur is not a concern, and that the most common query simply returns the "likes" for a given item as well as for a given user.

There are several posts describing data modeling for this kind of case, for example: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/

An alternative would be to store a collection of ItemIDs in the Users table to denote the items that user liked, and do something similar in the Items table, using CQL3 collections.

Questions

  • Are there any performance hits with using collections? I think they translate to composite columns, so the read pattern, caching and other factors should be similar?

  • Are collections less efficient for write-heavy applications? Is updating a collection less efficient?



1 answer


There are several advantages of using wide rows over collections that I can think of:

  • The number of items allowed in a collection is 65535 (an unsigned short). If your collection could hold more than that many entries, wide rows are probably better, since their limit is much higher (2 billion cells (rows * columns) per partition).
  • When reading a collection column, the entire collection is read every time. Compare this to a wide row, where you can limit the number of rows read by your query, or restrict the query on the clustering key (for example, date > '2015-07-01').
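To make the second point concrete, here is a toy in-memory model in Python. It is not Cassandra itself, only a sketch of the two read patterns; the names `read_collection`, `read_wide_row`, `liked_items` and `liked_on` are made up for illustration, and it assumes the wide row is clustered by a date column.

```python
from bisect import bisect_right
from datetime import date


def read_collection(user_row):
    """Collection model: reading the column always returns every element."""
    return sorted(user_row["liked_items"])


def read_wide_row(partition, after, limit=None):
    """Wide-row model: slice a partition that is sorted by its clustering key.

    `partition` is a list of (liked_on, item_id) tuples kept in sorted order,
    standing in for rows ordered by a clustering column.
    """
    dates = [liked_on for liked_on, _ in partition]
    start = bisect_right(dates, after)  # index of first row with liked_on > after
    rows = partition[start:]
    return rows if limit is None else rows[:limit]


# Hypothetical sample data.
user_row = {"liked_items": {"item-a", "item-b", "item-c", "item-d"}}
partition = [
    (date(2015, 6, 20), "item-a"),
    (date(2015, 7, 1), "item-b"),
    (date(2015, 7, 2), "item-c"),
    (date(2015, 7, 9), "item-d"),
]

everything = read_collection(user_row)                      # always all four items
recent = read_wide_row(partition, date(2015, 7, 1))         # only rows after the date
first_recent = read_wide_row(partition, date(2015, 7, 1), limit=1)
```

The collection read has no way to return less than the full set, while the wide-row read can skip ahead by clustering key and stop early.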


For your specific use case, I think that modeling an "items_by_user" table would be better than a list<item> column in the "users" table.
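As a rough sketch of what that could look like, the Python snippet below only builds the CQL text and parameters; it does not talk to a cluster. The column names and `uuid` types are assumptions, and the second table, `users_by_item`, is my own addition to cover the "likes by item" query from the question. The `%s` placeholders assume a driver that accepts that style for simple statements, such as the DataStax Python driver.

```python
# Hypothetical schema: one table per query direction.
# Partition key = the thing you look up by, clustering key = the other side.
CREATE_ITEMS_BY_USER = """
CREATE TABLE items_by_user (
    user_id uuid,
    item_id uuid,
    PRIMARY KEY (user_id, item_id)
)
"""

CREATE_USERS_BY_ITEM = """
CREATE TABLE users_by_item (
    item_id uuid,
    user_id uuid,
    PRIMARY KEY (item_id, user_id)
)
"""

# One single-partition read per query.
SELECT_ITEMS_FOR_USER = "SELECT item_id FROM items_by_user WHERE user_id = %s"
SELECT_USERS_FOR_ITEM = "SELECT user_id FROM users_by_item WHERE item_id = %s"


def like_statements(user_id, item_id):
    """Return (cql, params) pairs that record one like in both tables."""
    return [
        (
            "INSERT INTO items_by_user (user_id, item_id) VALUES (%s, %s)",
            (user_id, item_id),
        ),
        (
            "INSERT INTO users_by_item (item_id, user_id) VALUES (%s, %s)",
            (item_id, user_id),
        ),
    ]
```

Each like is written twice, once per table, which is the usual trade in Cassandra: extra writes in exchange for each read hitting a single partition.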
