DSE - Cassandra : Commit Log Disk Impact on Perfor

Posted 2019-06-07 16:50

Question:

I'm running a DSE 4.6.5 cluster (Cassandra 2.0.14.352). Following DataStax's guidelines, on every machine I separated the data directory from the commitlog/saved caches directories:

  • data is on blazing fast drives
  • commit log and saved caches are on the system drives: 2 HDDs in RAID 1

Monitoring the disks with OpsCenter while performing intensive writes, I see no issue with the former. However, the queue size on the latter (commit log) averages around 300 to 400, with spikes up to 700 requests. Of course, the latency is also fairly high on these drives.

Is this affecting the performance of my cluster? Would you recommend putting the commit log and saved caches on an SSD, separate from the system disks?

Thanks.

Edit - Adding tpstats from one of the nodes:

[root@dbc4 ~]# nodetool tpstats
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0          15938         0                 0
RequestResponseStage              0         0      154745533         0                 0
MutationStage                     1         0      306973172         0                 0
ReadRepairStage                   0         0            253         0                 0
ReplicateOnWriteStage             0         0              0         0                 0
GossipStage                       0         0         340298         0                 0
CacheCleanupExecutor              0         0              0         0                 0
MigrationStage                    0         0              0         0                 0
MemoryMeter                       1         1          36284         0                 0
FlushWriter                       0         0          23419         0               996
ValidationExecutor                0         0              0         0                 0
InternalResponseStage             0         0              0         0                 0
AntiEntropyStage                  0         0              0         0                 0
MemtablePostFlusher               0         0          27007         0                 0
MiscStage                         0         0              0         0                 0
PendingRangeCalculator            0         0              7         0                 0
CompactionExecutor                8        10           7400         0                 0
commitlog_archiver                0         0              0         0                 0
HintedHandoff                     0         1            222         0                 0

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
PAGED_RANGE                  0
BINARY                       0
READ                         0
MUTATION                 49547
_TRACE                       0
REQUEST_RESPONSE             0
COUNTER_MUTATION             0

Edit 2 - sar output:

04:10:02 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
04:10:02 PM     all     22.25     26.33      1.93      0.48      0.00     49.02
04:20:01 PM     all     23.23     26.19      1.90      0.49      0.00     48.19
04:30:01 PM     all     23.71     26.44      1.90      0.49      0.00     47.45
04:40:01 PM     all     23.89     26.22      1.86      0.47      0.00     47.55
04:50:01 PM     all     23.58     26.13      1.88      0.53      0.00     47.88
Average:        all     21.60     26.12      1.71      0.56      0.00     50.01

Answer 1:

"Monitoring disks with OpsCenter while performing intensive writes, I see no issue with the first,"

Cassandra persists writes both in memory (the memtable) and on disk (the commitlog).

When a memtable grows past a threshold, or when you trigger it manually, Cassandra writes its contents to disk (it flushes the memtable).

To make sure your setup can handle your workload, try manually flushing all memtables on one node:

nodetool flush

Or flush just a specific keyspace (and optionally a column family) with:

nodetool flush [keyspace] [columnfamily]

At the same time, monitor your disks' I/O.
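If you want numbers outside OpsCenter, here is a minimal Python sketch of how average queue size and write latency can be derived from two samples of /proc/diskstats on Linux. Assumptions on my side: the device name `sda` and the two sample lines are made up for illustration, the helper names are hypothetical, and I am assuming the OpsCenter "queue size" metric corresponds to the same kernel counter that iostat reports as average queue size.

```python
# Sketch: derive average queue size and write latency from two
# /proc/diskstats samples. Sample values below are synthetic.

def read_diskstats(text, device):
    """Extract cumulative counters for one device from /proc/diskstats text."""
    for line in text.splitlines():
        f = line.split()
        # Layout: major minor name, then the per-device counters.
        if len(f) >= 14 and f[2] == device:
            return {
                "writes": int(f[7]),        # writes completed
                "write_ms": int(f[10]),     # total ms spent writing
                "in_flight": int(f[11]),    # I/Os currently in progress
                "weighted_ms": int(f[13]),  # weighted ms spent doing I/O
            }
    raise KeyError(device)


def queue_stats(before, after, interval_s):
    """Return (average queue size, average write latency in ms)."""
    writes = after["writes"] - before["writes"]
    weighted = after["weighted_ms"] - before["weighted_ms"]
    avg_queue = weighted / (interval_s * 1000.0)
    write_await_ms = (after["write_ms"] - before["write_ms"]) / writes if writes else 0.0
    return avg_queue, write_await_ms


# Two synthetic samples of the same (hypothetical) device, taken 5 seconds apart.
SAMPLE_BEFORE = "   8       0 sda 1000 0 8000 500 2000 0 16000 40000 0 3000 40500"
SAMPLE_AFTER = "   8       0 sda 1000 0 8000 500 3000 0 24000 90000 350 8000 1790500"

before = read_diskstats(SAMPLE_BEFORE, "sda")
after = read_diskstats(SAMPLE_AFTER, "sda")
avg_queue, write_await_ms = queue_stats(before, after, interval_s=5)
print("avg queue size:", avg_queue, "write await (ms):", write_await_ms)
```

On a real node you would read /proc/diskstats twice (for example 5 seconds apart, while the flush is running) and pass the device that holds the commitlog instead of the sample strings.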

If you see high I/O wait, you can either spread the workload by adding more nodes or switch the data drives to better ones with higher throughput.

Keep an eye on dropped mutations (these can be writes or hints sent by other nodes) and on blocked FlushWriter tasks.
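To watch those counters without reading the table by eye, a small Python sketch like the one below can parse the text printed by nodetool tpstats and report only the non-zero ones. The function names are hypothetical, the sample is a shortened copy of the output posted in the question, and the parser assumes the column layout shown there (it may differ in other Cassandra versions).

```python
# Sketch: flag blocked thread pools and dropped messages in
# `nodetool tpstats` text output (layout as shown in the question).

SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     1         0      306973172         0                 0
FlushWriter                       0         0          23419         0               996

Message type           Dropped
READ                         0
MUTATION                 49547
"""


def parse_tpstats(text):
    """Return (pools, dropped) dictionaries parsed from tpstats text."""
    pools = {}    # pool name -> counters
    dropped = {}  # message type -> dropped count
    section = None
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "Pool":       # header of the thread pool table
            section = "pools"
        elif parts[0] == "Message":  # header of the dropped message table
            section = "dropped"
        elif section == "pools" and len(parts) == 6:
            name = parts[0]
            active, pending, completed, blocked, all_blocked = map(int, parts[1:])
            pools[name] = {
                "active": active,
                "pending": pending,
                "completed": completed,
                "blocked": blocked,
                "all_time_blocked": all_blocked,
            }
        elif section == "dropped" and len(parts) == 2:
            dropped[parts[0]] = int(parts[1])
    return pools, dropped


def report(pools, dropped):
    """List pools that were ever blocked and message types that were dropped."""
    lines = []
    for name, stats in pools.items():
        if stats["all_time_blocked"] > 0:
            lines.append(f"pool {name}: {stats['all_time_blocked']} tasks blocked (all time)")
    for kind, count in dropped.items():
        if count > 0:
            lines.append(f"message {kind}: {count} dropped")
    return lines


pools, dropped = parse_tpstats(SAMPLE)
for entry in report(pools, dropped):
    print(entry)
```

In practice you would feed it the captured output of nodetool tpstats from each node and compare the counters over time, since they are cumulative since the node started.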

"I see the queue size from the latter (commit log) averaging around 300 to 400 with spikes up to 700 requests."

These are probably your writes to the commitlog. Is the hardware serving anything else? Is it software RAID? Do you have swap disabled?

Cassandra works best alone :) So yes, put at least the commitlog on a separate disk (it can be a smaller one).