Cassandra - INSERT is no longer possible after a while

We are running a 3-node Cassandra 2.1.5 cluster, with each node on a separate Debian Wheezy VM. After several days of working without problems, inserts into the table suddenly stop working, and we receive the following error message:

com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency QUORUM (2 responses were required but only 1 replica responded)
at com.datastax.driver.core.exceptions.ReadTimeoutException.copy(ReadTimeoutException.java:69)
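
As a sanity check on the numbers in that message: QUORUM requires floor(RF / 2) + 1 replicas to respond, so "2 responses were required" is consistent with a replication factor of 3 (or 2). The replication factor is not stated above, so RF = 3 is an assumption here, and the class and method names are made up purely for illustration. A minimal sketch of the arithmetic:

```java
class QuorumCheck {
    // Number of replicas that must respond at QUORUM: floor(RF / 2) + 1.
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be >= 1");
        }
        return replicationFactor / 2 + 1; // integer division acts as floor
    }

    // True if enough replicas responded to satisfy QUORUM.
    static boolean quorumReached(int replicationFactor, int responded) {
        return responded >= quorum(replicationFactor);
    }

    public static void main(String[] args) {
        int rf = 3; // ASSUMPTION: replication factor of the keyspace, not given in the question
        System.out.println("required at QUORUM: " + quorum(rf));
        System.out.println("satisfied by 1 response: " + quorumReached(rf, 1));
    }
}
```

With RF = 3, a single responding replica cannot satisfy QUORUM, which matches the "only 1 replica responded" part of the error even though every node is reported as up.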

      

When I run nodetool status on each node, it shows all nodes with Status = Up and State = Normal.

If I run nodetool repair on a node, I see thousands of exceptions like this in the log:

2015-06-03 16:40:58,023 ERROR [AntiEntropySessions:17] RepairSession.java:303 - [repair #858c8470-09fe-11e5-930b-d16ee278cb3a] session completed with the following error
java.io.IOException: Failed during snapshot creation.
    at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.5.jar:2.1.5]
    at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146) ~[apache-cassandra-2.1.5.jar:2.1.5]
    at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]

      

To restore Cassandra, I have to restart the Cassandra daemon on each node and then run nodetool repair on each node (which works without exceptions after the restart). Everything then works for 2-3 days until the same problem appears again.

Is this a known issue, or what could be causing this behavior? To me it looks like the nodes can no longer talk to each other once the error occurs, but if so, why does nodetool status show UN (Up/Normal) for all nodes?
