Are unrestricted Cassandra CQL SELECT queries guaranteed to be grouped by partition key?

I have a Cassandra table as shown below:

CREATE TABLE example
(
    result_id INT,
    evaluator_id INT,
    score DOUBLE,
    PRIMARY KEY (result_id, evaluator_id)
);

      

And the following query:

SELECT result_id, evaluator_id, score FROM example;

      

I understand that when querying a single partition key, the results are sorted by the clustering key. However, to support my data model, I am assuming that in the unrestricted query above, the results will be grouped by the partition key "result_id", that is:

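lastResultId = None
scores = []  # (evaluator_id, score) pairs for the current result_id
for row in queryResults:
    resultId = row['result_id']
    if resultId != lastResultId and scores:
        # do something with the data structure, assuming we've now
        # received all scores for lastResultId
        scores = []
    # append the score and evaluator id to the data structure
    scores.append((row['evaluator_id'], row['score']))
    lastResultId = resultId
# after the loop, handle the scores collected for the final result_id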

      

Is this a valid assumption? It makes sense given how the data is stored, and it works in my prototype, but it doesn't seem to be guaranteed anywhere. For example, if the data is fetched from multiple nodes, could rows with different result IDs hypothetically be interleaved?



1 answer


Is this a valid assumption?



Yes, results will always be grouped by partition key. This is because all the CQL rows for a particular partition are stored together on disk. CQL rows with the same partition key hash to the same token value and are stored together on the nodes responsible for that token range. An unrestricted SELECT returns partitions in token order, not in result_id order, with each partition's rows contiguous and sorted by the clustering key.
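
Given that guarantee, the grouping loop from the question can be written with itertools.groupby, which merges only consecutive rows sharing a key. The following is a minimal sketch, not a definitive implementation: the helper name group_by_partition and the sample rows are hypothetical, and the rows are assumed to be dict-like, standing in for whatever the driver returns for the query.

```python
from itertools import groupby
from operator import itemgetter


def group_by_partition(rows):
    """Yield (result_id, [(evaluator_id, score), ...]) per run of consecutive rows.

    Hypothetical helper: relies on rows of one partition arriving contiguously.
    """
    for result_id, run in groupby(rows, key=itemgetter("result_id")):
        yield result_id, [(r["evaluator_id"], r["score"]) for r in run]


# Stand-in data (assumption): dict-like rows in the order a full scan might
# return them, i.e. grouped by partition but not sorted by result_id.
rows = [
    {"result_id": 7, "evaluator_id": 1, "score": 0.5},
    {"result_id": 7, "evaluator_id": 2, "score": 0.9},
    {"result_id": 3, "evaluator_id": 1, "score": 0.1},
]

grouped = list(group_by_partition(rows))
for result_id, scores in grouped:
    print(result_id, scores)
```

Because groupby never looks back, a result_id that reappeared later in the stream would produce a second group rather than being merged, so this approach depends on the contiguity described above.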
