Cassandra: range select returns an incomplete result

Question

I have a problem with range selects in Cassandra: for some ranges, the query does not return all the data. The cluster runs Cassandra 2.1.0, using the binaries from Apache.

This is my table:

CREATE TABLE metrics.main_cnt (
    gran ascii,
    ctx ascii,
    io ascii,
    eid uuid,
    dt bigint,
    apdex_s counter,
    apdex_t counter,
    "count" counter,
    error counter,
    time counter,
    PRIMARY KEY ((gran, ctx, io, eid), dt)
);

I have many rows in this table and if I execute this query:

SELECT * FROM main_cnt WHERE gran = 'min' AND ctx = 'A' AND io = 'i' AND eid = 4379eec6-ba09-4f70-8862-1c864595c371 AND dt IN (1420644000000, 1420640400000);

I am getting this result:

 gran | ctx | io | eid                                  | dt            | apdex_s | apdex_t | count | error | time
------+-----+----+--------------------------------------+---------------+---------+---------+-------+-------+--------
  min |   A |  i | 4379eec6-ba09-4f70-8862-1c864595c371 | 1420640400000 |     671 |       4 |   677 |     0 | 168253
  min |   A |  i | 4379eec6-ba09-4f70-8862-1c864595c371 | 1420644000000 |     554 |      10 |   566 |     0 | 192666

But if I use a range select instead:

SELECT * FROM main_cnt WHERE gran = 'min' AND ctx = 'A' AND io = 'i' AND eid = 4379eec6-ba09-4f70-8862-1c864595c371 AND dt >= 1420640400000 AND dt <= 1420644000000;

I only get one row:

 gran | ctx | io | eid                                  | dt            | apdex_s | apdex_t | count | error | time
------+-----+----+--------------------------------------+---------------+---------+---------+-------+-------+--------
  min |   A |  i | 4379eec6-ba09-4f70-8862-1c864595c371 | 1420640400000 |     671 |       4 |   677 |     0 | 168253

I also tried widening the range, with no better result. This is not the only case where it happens, but for other dt values the range select returns the correct result with multiple rows.
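One way to pin down exactly which rows the range select drops is to compare the dt values returned by the two queries. The following is only a minimal sketch in Python, assuming the dt values have already been copied out of the two result sets into plain lists; the function name missing_from_range is hypothetical and not part of any Cassandra driver.

```python
def missing_from_range(in_dts, range_dts, lower, upper):
    """Return the dt values that the IN query returned and that lie
    inside [lower, upper], but that the range query did not return."""
    returned = set(range_dts)
    return sorted(
        dt for dt in set(in_dts)
        if lower <= dt <= upper and dt not in returned
    )


# dt values copied by hand from the two result sets shown above
in_query_dts = [1420640400000, 1420644000000]
range_query_dts = [1420640400000]

missing = missing_from_range(
    in_query_dts, range_query_dts, 1420640400000, 1420644000000
)
print(missing)  # [1420644000000]
```

With the results above, the only missing value is the upper bound of the range, 1420644000000.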

Running nodetool repair does not fix the problem.

I have not found any ticket in Jira for such an issue. Does anyone know about this issue? Thanks for any help.

Edit: more info:

The replication factor is 3. The cluster has 8 or 9 nodes most of the time. The increments are done with Java driver 2.1.5 and prepared statements, using this command:

UPDATE main_cnt SET time = time + ?, "count" = "count" + ?, error = error + ?, apdex_s = apdex_s + ?, apdex_t = apdex_t + ? WHERE gran = ? AND dt = ? AND ctx = ? AND eid = ? AND io = ?

Trace for the normal (IN) select: trace1.log
Trace for the faulty range select: trace2.log

cassandra

Thomas arnaud 08 june 15 at 15:50

1 answer

Thomas arnaud · Accepted Answer · 2015-07-26T12:39:05+0000

Not sure why, but the problem was fixed after upgrading the cluster to Cassandra 2.1.8.
