Cassandra 2.0.9 to 2.1.2 upgradesstables, compaction problems

We upgraded Cassandra (5 + 5 nodes) from 2.0.9 to 2.1.2 (binaries) and then ran a one-time nodetool upgradesstables (via a bash script), after which we see some problems:

  • On each node we see about 50 "Pending Tasks", and on one of them over 500; this has persisted for 5 days. When we run nodetool upgradesstables, Cassandra never runs more than 3-4 compactions at the same time, even though concurrent_compactors is set to 8. The node with more than 500 pending tasks has about 11k files in the column family directory. We have 2 SSD drives, but during compaction we see at most 10 MB/s read and 5 MB/s write, even if compaction_throughput_mb_per_sec is set to 32, 64, or 256.

  • When upgrading sstables, for some tables we see in the log:

      WARN [RMI TCP Connection(100)-10.64.72.34] 2014-12-21 23:53:18,953 ColumnFamilyStore.java:2492 - Unable to cancel in-progress compactions for reco_active_items_v1. Perhaps there is an unusually large row in progress somewhere, or the system is simply overloaded.
      INFO [RMI TCP Connection(100)-10.64.72.34] 2014-12-21 23:53:18,953 CompactionManager.java:247 - Aborting operation on reco_prod.reco_active_items_v1 after failing to interrupt other compaction operations

    and nodetool fails with "Aborted upgrading sstables for atleast one column family in keyspace reco_prod, check server logs for more information."

  • On some nodes nodetool upgradesstables succeeds, but we can still see jb files in the column family directory.

  • On some nodes nodetool upgradesstables returns:

      Error: null
      -- StackTrace --
      java.lang.NullPointerException
          at org.apache.cassandra.io.sstable.SSTableReader.cloneWithNewStart(SSTableReader.java:952)
          at org.apache.cassandra.io.sstable.SSTableRewriter.moveStarts(SSTableRewriter.java:250)
          at org.apache.cassandra.io.sstable.SSTableRewriter.switchWriter(SSTableRewriter.java:300)
          at org.apache.cassandra.io.sstable.SSTableRewriter.…(SSTableRewriter.java:186)
          at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:204)
          at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
          at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
          at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:75)
          at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
          at org.apache.cassandra.db.compaction.CompactionManager$4.execute(CompactionManager.java:340)
          at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:267)
          at java.util.concurrent.FutureTask.run(FutureTask.java:262)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          at java.lang.Thread.run(Thread.java:745)
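For anyone who wants to check the leftover jb files on their own nodes, here is a minimal sketch of how they can be counted per column family directory. It assumes the 2.0/2.1 on-disk naming `<keyspace>-<table>-<version>-<generation>-Data.db`, where the version is a two-letter code such as jb (2.0.x) or ka (2.1.x). The function name `count_sstable_versions` is made up for this example, and later Cassandra versions name their files differently.

```python
import os
import re
from collections import Counter

# Assumed 2.0/2.1 naming: <keyspace>-<table>-<version>-<generation>-Data.db
# The version is two lowercase letters, e.g. "jb" (2.0.x) or "ka" (2.1.x).
DATA_FILE = re.compile(r"^.+-([a-z]{2})-(\d+)-Data\.db$")


def count_sstable_versions(cf_dir):
    """Count the Data.db files in cf_dir per SSTable format version."""
    counts = Counter()
    for name in os.listdir(cf_dir):
        match = DATA_FILE.match(name)
        if match:
            # Only the Data component is counted, so each SSTable counts once.
            counts[match.group(1)] += 1
    return counts
```

Calling `count_sstable_versions(path)` on a column family directory returns a mapping from version to file count; any non-zero count for "jb" means that directory still holds SSTables that were not rewritten.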

This is our production environment (running 24 hours a day), and we are seeing higher load on the nodes and higher read latency, sometimes over 1 second.

Any tips ...?
