Cassandra v. 3, write performance issue

Published 2019-09-06 23:54

Question:

I have write performance issues with Cassandra 3.

I'm trying out Cassandra 3.3 using the official Docker image from here: https://github.com/docker-library/cassandra

I start it as follows:

docker run --net=host --rm cassandra:3.3

Then run cassandra-stress against it:

cassandra-stress write

This gives me the following results with four client threads (latencies in ms):

op rate                   : 1913 [WRITE:1913]
partition rate            : 1913 [WRITE:1913]
row rate                  : 1913 [WRITE:1913]
latency mean              : 2.1 [WRITE:2.1]
latency median            : 1.6 [WRITE:1.6]
latency 95th percentile   : 4.1 [WRITE:4.1]
latency 99th percentile   : 8.4 [WRITE:8.4]
latency 99.9th percentile : 20.5 [WRITE:20.5]
latency max               : 155.4 [WRITE:155.4]
Total partitions          : 154607 [WRITE:154607]
Total errors              : 0 [WRITE:0]
total gc count            : 13
total gc mb               : 1951
total gc time (s)         : 1
avg gc time(ms)           : 59
stdev gc time(ms)         : 28
Total operation time      : 00:01:20
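
Note that by default cassandra-stress picks the iteration count itself and ramps the client thread count automatically, which is why the four-thread figures are reported above. For a tighter apples-to-apples comparison, the op count and thread count can be pinned explicitly; the syntax below is standard cassandra-stress, but the numbers are just an example:

cassandra-stress write n=500000 -rate threads=4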

Doing the exact same thing for Cassandra 2.2 using the official image:

docker run --net=host --rm cassandra:2.2

This gives me the following results, again with four threads:

op rate                   : 2248 [WRITE:2248]
partition rate            : 2248 [WRITE:2248]
row rate                  : 2248 [WRITE:2248]
latency mean              : 1.8 [WRITE:1.8]
latency median            : 1.4 [WRITE:1.4]
latency 95th percentile   : 3.5 [WRITE:3.5]
latency 99th percentile   : 7.2 [WRITE:7.2]
latency 99.9th percentile : 16.4 [WRITE:16.4]
latency max               : 129.5 [WRITE:129.5]
Total partitions          : 195461 [WRITE:195461]
Total errors              : 0 [WRITE:0]
total gc count            : 11
total gc mb               : 1612
total gc time (s)         : 1
avg gc time(ms)           : 62
stdev gc time(ms)         : 21
Total operation time      : 00:01:26

The number of writes/s is almost 15% lower on 3.3 than on 2.2. What could the reason be? I've tried changing various parameters, including running Cassandra on the Oracle JVM instead of OpenJDK, with no significant difference. I've also tried several versions of Cassandra 3 without spotting any real difference. I know this is a single node and that the results of this basic test cannot be transferred to a production setting. Still, I'm curious whether anybody has an explanation or can reproduce the behaviour.

Any input is welcome!

Update 2016-04-13, JVM parameter differences:

diff jvm-param22-sorted jvm-param33-sorted

> -XX:+AlwaysPreTouch
# Before removing this from the 3.3 config:
# USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
# cassand+     1 19.0 30.3 2876304 1229136 ?     Ssl  04:18   0:17 java
#
# After removing it:
# USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
# cassand+   177 17.0  9.4 2885596 383972 ?      Sl+  04:59   0:17 java -
#
# The above could actually explain some of the issues I've observed in
# low-memory environments with multiple containers running: the flag makes
# the JVM touch every heap page at startup, so the whole heap is resident
# from the start regardless of how much of it is actually needed.

# This was present twice in the 2.2 config. Should not matter.
< -XX:CMSWaitDuration=10000

# Removing this from the 3.3 config did not have any significant impact
> -XX:+ResizeTLAB

# Removing from the 3.3 config did not have any significant impact
> -XX:-UseBiasedLocking

# Adding this one to the 3.3 config did not have a significant impact
< -XX:+UseCondCardMark
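
For anyone who wants to reproduce the before/after comparison above: one way to drop a single default flag from the official image, assuming the 3.3 image keeps its JVM flags in /etc/cassandra/jvm.options (check your image if the path differs), is to comment the flag out in a copy of the file and bind-mount the copy over the original:

# Extract the stock jvm.options and comment out the pre-touch flag
docker run --rm cassandra:3.3 cat /etc/cassandra/jvm.options \
  | sed 's|^-XX:+AlwaysPreTouch|#-XX:+AlwaysPreTouch|' > jvm.options
# Start the node with the patched file mounted read-only
docker run --net=host --rm -v "$PWD/jvm.options:/etc/cassandra/jvm.options:ro" cassandra:3.3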

Answer 1:

I'll just answer myself for future reference. The culprit in my case was the JVM parameter -XX:+AlwaysPreTouch, which was added to the default options in the 3.0 release of Cassandra. Removing it puts the performance back on par with the 2.2 release.
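
A quick way to confirm which variant is actually running is to read the java process command line inside the container (here "some-cassandra" is a hypothetical container name, and the java process is assumed to be PID 1, as in the ps output above):

docker exec some-cassandra sh -c 'tr "\0" "\n" < /proc/1/cmdline | grep PreTouch'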

I've mainly experimented in environments with a fairly limited amount of RAM. I have not yet done any experiments on more powerful hardware to see what effect this flag has there.
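
Related to the low-RAM case: rather than letting cassandra-env.sh size the heap from the host's total memory, the heap can be capped explicitly. To my knowledge cassandra-env.sh honours the MAX_HEAP_SIZE and HEAP_NEWSIZE environment variables (they must be set together), and docker run passes them through; the sizes here are only an example:

docker run --net=host --rm -e MAX_HEAP_SIZE=512M -e HEAP_NEWSIZE=128M cassandra:3.3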

Answer 2:

How many times did you run the test?

Keep in mind that a single run of under a minute and a half is far from comprehensive for a performance test.
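
A minimal sketch for collecting several data points per configuration (the op count and thread count are arbitrary; the grep just pulls the "op rate" line out of the summary that cassandra-stress prints):

for i in 1 2 3 4 5; do
  cassandra-stress write n=500000 -rate threads=4 | grep "op rate"
done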