Upgrading from Cassandra 2.1.4 to 2.1.5
All,
A few days ago I upgraded our 6-node EC2 cluster from Cassandra 2.1.4 to 2.1.5.
Since then, all my nodes have exploded in their CPU usage - they sit at 100% CPU most of the time, and their load average is 100-300 (!!!).
It didn't start immediately after the upgrade. It started a few hours later on one of the nodes, and slowly more and more nodes began to exhibit the same behavior. It seems to correlate with the compaction of our largest column family, and after the compaction completes (24 hours after it starts) the node seems to go back to normal. It's only been 2 days, so I hope it doesn't happen again, but I'm still keeping an eye on it.
Here are my questions:
- Is this a bug or expected behavior?
If this is expected behavior -
- What is the explanation for this problem?
- Is it documented somewhere? What am I missing?
- Should I be doing upgrades differently? Maybe 1 or 2 nodes every 24 hours or so? What's the best practice?
If this is a bug -
- Is it a known issue?
- Where can I report it? What data should I include?
- Will downgrading to 2.1.4 work?
Any feedback on this would be great.
Thanks,
Amir
Update:
This is the structure of the table in question.
CREATE TABLE tbl1 (
key text PRIMARY KEY,
created_at timestamp,
customer_id bigint,
device_id bigint,
event text,
fail_count bigint,
generation bigint,
gr_id text,
imei text,
raw_post text,
"timestamp" timestamp
) WITH COMPACT STORAGE
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = 'NONE';
The logs don't show much (at least to me). Here is a snippet of what the logs look like:
INFO [WRITE-/10.0.1.142] 2015-05-23 05:43:42,577 YamlConfigurationLoader.java:92 - Loading settings from file: /etc/cassandra/cassandra.yaml
INFO [WRITE-/10.0.1.142] 2015-05-23 05:43:42,580 YamlConfigurationLoader.java:135 - Node configuration: [authenticator=AllowAllAuthenticator; authorizer=AllowAllAuthorizer; auto_snapshot=true; batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; broadcast_rpc_address=10.0.2.145; cas_contention_timeout_in_ms=1000; client_encryption_options=; cluster_name=Gryphonet21Cluster; column_index_size_in_kb=64; commit_failure_policy=stop; commitlog_directory=/data/cassandra/commitlog; commitlog_segment_size_in_mb=32; commitlog_sync=periodic; commitlog_sync_period_in_ms=10000; compaction_throughput_mb_per_sec=16; concurrent_counter_writes=32; concurrent_reads=32; concurrent_writes=32; counter_cache_save_period=7200; counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=5000; cross_node_timeout=false; data_file_directories=[/data/data/cassandra]; disk_failure_policy=stop; dynamic_snitch_badness_threshold=0.1; dynamic_snitch_reset_interval_in_ms=600000; dynamic_snitch_update_interval_in_ms=100; endpoint_snitch=GossipingPropertyFileSnitch; hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024; incremental_backups=false; index_summary_capacity_in_mb=null; index_summary_resize_interval_in_minutes=60; inter_dc_tcp_nodelay=false; internode_compression=all; key_cache_save_period=14400; key_cache_size_in_mb=null; max_hint_window_in_ms=10800000; max_hints_delivery_threads=2; memtable_allocation_type=heap_buffers; native_transport_port=9042; num_tokens=16; partitioner=RandomPartitioner; permissions_validity_in_ms=2000; range_request_timeout_in_ms=10000; read_request_timeout_in_ms=5000; request_scheduler=org.apache.cassandra.scheduler.NoScheduler; request_timeout_in_ms=10000; row_cache_save_period=0; row_cache_size_in_mb=0; rpc_address=0.0.0.0; rpc_keepalive=true; rpc_port=9160; rpc_server_type=sync; saved_caches_directory=/data/cassandra/saved_caches; seed_provider=[{class_name=org.apache.cassandra.locator.SimpleSeedProvider, parameters=[{seeds=10.0.1.141,10.0.2.145,10.0.3.149}]}]; server_encryption_options=; snapshot_before_compaction=false; ssl_storage_port=7001; sstable_preemptive_open_interval_in_mb=50; start_native_transport=true; start_rpc=true; storage_port=7000; thrift_framed_transport_size_in_mb=15; tombstone_failure_threshold=100000; tombstone_warn_threshold=1000; trickle_fsync=false; trickle_fsync_interval_in_kb=10240; truncate_request_timeout_in_ms=60000; write_request_timeout_in_ms=2000]
INFO [HANDSHAKE-/10.0.1.142] 2015-05-23 05:43:42,591 OutboundTcpConnection.java:494 - Cannot handshake version with /10.0.1.142
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,713 MessagingService.java:887 - 135 MUTATION messages dropped in the last 5000ms
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,713 StatusLogger.java:51 - Pool name Active Pending Completed Blocked All Time Blocked
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,714 StatusLogger.java:66 - CounterMutationStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,714 StatusLogger.java:66 - ReadStage 5 1 5702809 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,715 StatusLogger.java:66 - RequestResponseStage 0 45 29528010 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,715 StatusLogger.java:66 - ReadRepairStage 0 0 997 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,715 StatusLogger.java:66 - MutationStage 0 31 43404309 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,716 StatusLogger.java:66 - GossipStage 0 0 569931 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,716 StatusLogger.java:66 - AntiEntropyStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,716 StatusLogger.java:66 - CacheCleanupExecutor 0 0 0 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,717 StatusLogger.java:66 - MigrationStage 0 0 9 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,829 StatusLogger.java:66 - ValidationExecutor 0 0 0 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,830 StatusLogger.java:66 - Sampler 0 0 0 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,830 StatusLogger.java:66 - MiscStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,831 StatusLogger.java:66 - CommitLogArchiver 0 0 0 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,831 StatusLogger.java:66 - MemtableFlushWriter 1 1 1756 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,831 StatusLogger.java:66 - PendingRangeCalculator 0 0 11 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,832 StatusLogger.java:66 - MemtableReclaimMemory 0 0 1756 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,832 StatusLogger.java:66 - MemtablePostFlush 1 2 3819 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,832 StatusLogger.java:66 - CompactionExecutor 2 32 742 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,833 StatusLogger.java:66 - InternalResponseStage 0 0 0 0 0
INFO [HANDSHAKE-/10.0.1.142] 2015-05-23 05:43:45,086 OutboundTcpConnection.java:485 - Handshaking version with /10.0.1.142
UPDATE:
The problem persists. I thought that once the compaction on each node finished, the node would go back to normal, but it doesn't. After a few hours the CPU jumps to 100% and the load average is in the 100-300 range.
I am downgrading back to 2.1.4.
UPDATE:
Used the phact dumpThreads script to get the stack traces. Also tried using jvmtop, but it just seemed to hang.
The output is too big to be pasted here, but you can find it at http://downloads.gryphonet.com/cassandra/ .
Username: cassandra Password: cassandra
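For anyone triaging a similar CPU spin, a quick way to make sense of a large thread dump is to count which stack frames dominate among RUNNABLE threads. This is a rough sketch, assuming a jstack-style dump saved to a text file; the frame regex is a simplification and dump formats vary slightly between JVMs:

```python
import re
from collections import Counter

def top_frames(dump_text, state="RUNNABLE", limit=5):
    """Count the most frequent stack frames among threads in the given state."""
    frames = Counter()
    current_state = None
    for line in dump_text.splitlines():
        # Thread state lines look like: "   java.lang.Thread.State: RUNNABLE"
        m = re.search(r"java\.lang\.Thread\.State: (\w+)", line)
        if m:
            current_state = m.group(1)
            continue
        # Frame lines look like: "\tat com.example.Foo.bar(Foo.java:42)"
        m = re.match(r"\s+at ([\w.$]+)\(", line)
        if m and current_state == state:
            frames[m.group(1)] += 1
    return frames.most_common(limit)

# Example: feed it the contents of a dump file, e.g.
# print(top_frames(open("dump.txt").read()))
```

Whatever frame shows up near the top across many RUNNABLE threads is a good candidate for the hot code path.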
Answering my own question -
We are using a very specific API - describe_splits_ex - and this seems to be causing the problem. It is evident when looking at the stack traces of all the different threads while CPU utilization is at 100%. It was easy for us to fix, because we use this API as an optimization, not a necessity, so we simply stopped using it and the problem went away.
However, this API is also used by the cassandra-hadoop connector (at least it was in earlier versions), so if you use the connector, I would test before upgrading to 2.1.5.
I don't know what change in 2.1.5 caused the problem, but I do know it didn't happen in 2.1.4 and was consistently reproducible in 2.1.5.
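Since describe_splits_ex only provides size-aware split points as an optimization, one crude stand-in is to cut the token ring into equal-width ranges client-side. This is a hypothetical sketch (not what the cassandra-hadoop connector does), assuming RandomPartitioner, whose tokens span 0 to 2^127 - 1:

```python
# Hypothetical fallback for describe_splits_ex: divide the full token ring
# into equal-width ranges instead of asking Cassandra for size-aware splits.
# Assumes RandomPartitioner, whose tokens span 0 .. 2**127 - 1.
RP_MAX_TOKEN = 2 ** 127 - 1

def even_splits(num_splits, max_token=RP_MAX_TOKEN):
    """Return num_splits contiguous (start, end] token ranges covering the ring."""
    step = max_token // num_splits
    bounds = [i * step for i in range(num_splits)] + [max_token]
    return list(zip(bounds[:-1], bounds[1:]))
```

Unlike describe_splits_ex, this ignores the actual data distribution, so the splits may be very uneven in row count - acceptable as a stopgap when the server-side call is off the table.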