Stress Cassandra instance on EC2 from local

Posted 2019-06-14 13:33

Question:

I would appreciate some help on how to stress a Cassandra instance running on EC2 from my local machine (using the cassandra-stress utility).

  • Cluster on EC2: Five nodes running DSE 4.6.
  • Local machine: cassandra-stress as included in Cassandra 2.1.2.

After I changed the Security Group, the stress utility invoked from my local machine is able to connect to the given instance on EC2.

I allowed inbound TCP connections on ports 9160 and 9042 from my local machine's IP.

sh cassandra-stress write -node 54.xxx.197.xxx

Output is:

ec0007:bin planger$ sh cassandra-stress write -node 54.xxx.197.xxx
Unable to create stress keyspace: Keyspace names must be case-insensitively unique ("Keyspace1" conflicts with "Keyspace1")
Warming up WRITE with 50000 iterations...
WARN  13:45:52 Found host with 0.0.0.0 as rpc_address, using listen_address (/172.31.33.33) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN  13:45:52 Found host with 0.0.0.0 as rpc_address, using listen_address (/172.31.33.34) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN  13:45:52 Found host with 0.0.0.0 as rpc_address, using listen_address (/172.31.33.35) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN  13:45:52 Found host with 0.0.0.0 as rpc_address, using listen_address (/172.31.33.32) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN  13:45:53 Found host with 0.0.0.0 as rpc_address, using listen_address (/172.31.33.33) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN  13:45:53 Found host with 0.0.0.0 as rpc_address, using listen_address (/172.31.33.34) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN  13:45:53 Found host with 0.0.0.0 as rpc_address, using listen_address (/172.31.33.35) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN  13:45:53 Found host with 0.0.0.0 as rpc_address, using listen_address (/172.31.33.32) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
INFO  13:45:53 Using data-center name 'Cassandra' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
INFO  13:45:53 New Cassandra host /172.31.33.33:9042 added
INFO  13:45:53 New Cassandra host /172.31.33.32:9042 added
Connected to cluster: esentriDSEcluster
Datatacenter: Cassandra; Host: /54.xxx.197.xxx; Rack: rack1
INFO  13:45:53 New Cassandra host /172.31.33.35:9042 added
Datatacenter: Analytics; Host: /172.31.33.35; Rack: rack1
Datatacenter: Solr; Host: /172.31.33.34; Rack: rack1
Datatacenter: Cassandra; Host: /172.31.33.33; Rack: rack1
Datatacenter: Cassandra; Host: /172.31.33.32; Rack: rack1
INFO  13:45:53 New Cassandra host /172.31.33.34:9042 added
INFO  13:45:53 New Cassandra host /54.171.197.133:9042 added
ERROR 13:45:58 Error creating pool to /172.31.33.33:9042
com.datastax.driver.core.TransportException: [/172.31.33.33:9042] Cannot connect
    at com.datastax.driver.core.Connection.<init>(Connection.java:104) ~[cassandra-driver-core-2.0.5.jar:na]
    at com.datastax.driver.core.PooledConnection.<init>(PooledConnection.java:28) ~[cassandra-driver-core-2.0.5.jar:na]
    at com.datastax.driver.core.Connection$Factory.open(Connection.java:458) ~[cassandra-driver-core-2.0.5.jar:na]
    at com.datastax.driver.core.HostConnectionPool.<init>(HostConnectionPool.java:85) ~[cassandra-driver-core-2.0.5.jar:na]
    at com.datastax.driver.core.SessionManager.replacePool(SessionManager.java:241) ~[cassandra-driver-core-2.0.5.jar:na]
    at com.datastax.driver.core.SessionManager.access$400(SessionManager.java:42) ~[cassandra-driver-core-2.0.5.jar:na]
    at com.datastax.driver.core.SessionManager$3.call(SessionManager.java:273) [cassandra-driver-core-2.0.5.jar:na]
    at com.datastax.driver.core.SessionManager$3.call(SessionManager.java:265) [cassandra-driver-core-2.0.5.jar:na]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_55]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
Caused by: org.jboss.netty.channel.ConnectTimeoutException: connection timed out: /172.31.33.33:9042
    at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:137) ~[netty-3.9.0.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83) ~[netty-3.9.0.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) ~[netty-3.9.0.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42) ~[netty-3.9.0.Final.jar:na]
    ... 3 common frames omitted

The problem seems to be that the contacted EC2 instance returns the private IPs of the other four instances. How can I change that? I think it should return the public IPs.

The cassandra.yaml configuration on the EC2 nodes looks like this:

seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "172.31.33.33,172.31.33.35,172.31.33.34"

listen_address: 172.31.33.32  # (different on every node)

# broadcast_address: 

start_rpc: true
rpc_address: 0.0.0.0
rpc_port: 9160

Changing seeds or specifying a broadcast_address breaks the cluster.

Thanks in advance for any help.

Update 1

I changed the rpc_address on every node, setting its value to the node's public IP address, and then restarted each node:

sudo service dse stop
sudo service dse start

But this breaks the cluster. Shortly after DSE starts up, a node shuts down again.

From /var/log/cassandra/system.log:

 INFO [main] 2015-02-06 09:54:06,128 CassandraDaemon.java (line 135) Logging initialized
 INFO [main] 2015-02-06 09:54:06,156 DseDaemon.java (line 382) DSE version: 4.6.0
 INFO [main] 2015-02-06 09:54:06,157 DseDaemon.java (line 383) Hadoop version: 1.0.4.13
 INFO [main] 2015-02-06 09:54:06,157 DseDaemon.java (line 384) Hive version: 0.12.0.5
 INFO [main] 2015-02-06 09:54:06,158 DseDaemon.java (line 385) Pig version: 0.10.1
 INFO [main] 2015-02-06 09:54:06,159 DseDaemon.java (line 386) Solr version: 4.6.0.3.3
 INFO [main] 2015-02-06 09:54:06,159 DseDaemon.java (line 387) Sqoop version: 1.4.4.14.2
 INFO [main] 2015-02-06 09:54:06,160 DseDaemon.java (line 388) Mahout version: 0.8
 INFO [main] 2015-02-06 09:54:06,160 DseDaemon.java (line 389) Appender version: 3.1.0
 INFO [main] 2015-02-06 09:54:06,161 DseDaemon.java (line 390) Spark version: 1.1.0.2
 INFO [main] 2015-02-06 09:54:06,161 DseDaemon.java (line 391) Shark version: 1.1.0
 INFO [main] 2015-02-06 09:54:06,416 DseConfig.java (line 345) Loading settings from file:/etc/dse/dse.yaml
 INFO [main] 2015-02-06 09:54:06,605 DseConfig.java (line 385) Load of settings is done.
 INFO [main] 2015-02-06 09:54:06,611 DseConfig.java (line 409) CQL slow log is enabled
 INFO [main] 2015-02-06 09:54:06,611 DseConfig.java (line 410) CQL system info tables are not enabled
 INFO [main] 2015-02-06 09:54:06,611 DseConfig.java (line 411) Resource level latency tracking is not enabled
 INFO [main] 2015-02-06 09:54:06,613 DseConfig.java (line 412) Database summary stats are not enabled
 INFO [main] 2015-02-06 09:54:06,614 DseConfig.java (line 413) Cluster summary stats are not enabled
 INFO [main] 2015-02-06 09:54:06,614 DseConfig.java (line 414) Histogram data tables are not enabled
 INFO [main] 2015-02-06 09:54:06,614 DseConfig.java (line 415) User level latency tracking is not enabled
 INFO [main] 2015-02-06 09:54:06,614 DseConfig.java (line 416) Solr latency snapshots are not enabled
 INFO [main] 2015-02-06 09:54:06,615 DseConfig.java (line 417) Solr slow sub-query log is not enabled
 INFO [main] 2015-02-06 09:54:06,615 DseConfig.java (line 418) Solr indexing error log is not enabled
 INFO [main] 2015-02-06 09:54:06,615 DseConfig.java (line 419) Solr update handler metrics are not enabled
 INFO [main] 2015-02-06 09:54:06,615 DseConfig.java (line 420) Solr request handler metrics are not enabled
 INFO [main] 2015-02-06 09:54:06,615 DseConfig.java (line 421) Solr index statistics reporting is not enabled
 INFO [main] 2015-02-06 09:54:06,616 DseConfig.java (line 422) Solr cache statistics reporting is not enabled
 INFO [main] 2015-02-06 09:54:06,629 YamlConfigurationLoader.java (line 80) Loading settings from file:/etc/dse/cassandra/cassandra.yaml
 INFO [main] 2015-02-06 09:54:06,683 DatabaseDescriptor.java (line 143) Data files directories: [/raid0/cassandra/data]
 INFO [main] 2015-02-06 09:54:06,683 DatabaseDescriptor.java (line 144) Commit log directory: /raid0/cassandra/commitlog
 INFO [main] 2015-02-06 09:54:06,683 DatabaseDescriptor.java (line 184) DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
 INFO [main] 2015-02-06 09:54:06,683 DatabaseDescriptor.java (line 198) disk_failure_policy is stop
 INFO [main] 2015-02-06 09:54:06,684 DatabaseDescriptor.java (line 199) commit_failure_policy is stop
 INFO [main] 2015-02-06 09:54:06,690 DatabaseDescriptor.java (line 269) Global memtable threshold is enabled at 930MB
 INFO [main] 2015-02-06 09:54:06,692 Workload.java (line 80) Setting my workload to Cassandra
 INFO [main] 2015-02-06 09:54:06,693 DseDelegateSnitch.java (line 43) Initialized DseDelegateSnitch with workload Cassandra, delegating to com.datastax.bdp.snitch.DseSimpleSnitch
 INFO [main] 2015-02-06 09:54:06,830 DatabaseDescriptor.java (line 408) Not using multi-threaded compaction
 INFO [main] 2015-02-06 09:54:07,075 YamlConfigurationLoader.java (line 80) Loading settings from file:/etc/dse/cassandra/cassandra.yaml
 INFO [main] 2015-02-06 09:54:07,090 YamlConfigurationLoader.java (line 80) Loading settings from file:/etc/dse/cassandra/cassandra.yaml
 INFO [main] 2015-02-06 09:54:07,150 PerformanceObjectsController.java (line 321) ClusterSummaryStats plugin using 4 async writers
 INFO [main] 2015-02-06 09:54:07,150 PerformanceObjectsController.java (line 355) ClusterSummaryStats refresh rate set to 10000 (was 0)
 INFO [main] 2015-02-06 09:54:07,151 PerformanceObjectsController.java (line 321) CqlSystemInfo plugin using 1 async writers
 INFO [main] 2015-02-06 09:54:07,151 PerformanceObjectsController.java (line 355) CqlSystemInfo refresh rate set to 10000 (was 0)
 INFO [main] 2015-02-06 09:54:07,152 PerformanceObjectsController.java (line 321) DbSummaryStats plugin using 4 async writers
 INFO [main] 2015-02-06 09:54:07,152 PerformanceObjectsController.java (line 355) DbSummaryStats refresh rate set to 10000 (was 0)
 INFO [main] 2015-02-06 09:54:07,152 PerformanceObjectsController.java (line 321) HistogramDataTables plugin using 4 async writers
 INFO [main] 2015-02-06 09:54:07,153 PerformanceObjectsController.java (line 355) HistogramDataTables refresh rate set to 10000 (was 0)
 INFO [main] 2015-02-06 09:54:07,153 PerformanceObjectsController.java (line 321) ResourceLatencyTracking plugin using 4 async writers
 INFO [main] 2015-02-06 09:54:07,153 PerformanceObjectsController.java (line 355) ResourceLatencyTracking refresh rate set to 10000 (was 0)
 INFO [main] 2015-02-06 09:54:07,154 PerformanceObjectsController.java (line 321) UserLatencyTracking plugin using 1 async writers
 INFO [main] 2015-02-06 09:54:07,154 PerformanceObjectsController.java (line 355) UserLatencyTracking refresh rate set to 10000 (was 0)
(…) 
 INFO [main] 2015-02-06 09:54:14,717 StorageService.java (line 514) Cassandra version: 2.0.11.83
 INFO [main] 2015-02-06 09:54:14,718 StorageService.java (line 515) Thrift API version: 19.39.0
 INFO [main] 2015-02-06 09:54:14,722 StorageService.java (line 516) CQL supported versions: 2.0.0,3.1.7 (default: 3.1.7)
 INFO [main] 2015-02-06 09:54:14,745 StorageService.java (line 539) Loading persisted ring state
 INFO [main] 2015-02-06 09:54:14,758 StorageService.java (line 677) Starting up server gossip
(…)
 INFO [main] 2015-02-06 09:54:14,885 MessagingService.java (line 473) Starting Messaging Service on port 7000
 INFO [main] 2015-02-06 09:54:14,917 YamlConfigurationLoader.java (line 80) Loading settings from file:/etc/dse/cassandra/cassandra.yaml
(…)
 INFO [main] 2015-02-06 09:54:14,977 StorageService.java (line 1521) Node /172.31.33.33 state jump to normal
 INFO [main] 2015-02-06 09:54:14,987 CassandraDaemon.java (line 543) Waiting for gossip to settle before accepting client requests...
 INFO [main] 2015-02-06 09:54:22,988 CassandraDaemon.java (line 575) No gossip backlog; proceeding
 INFO [main] 2015-02-06 09:54:23,011 AuditLogger.java (line 32) Audit logging is disabled
 INFO [main] 2015-02-06 09:54:23,031 EndpointStatePersister.java (line 56) EndpointStatePersister started
 WARN [main] 2015-02-06 09:54:23,031 Workload.java (line 100) Couldn't determine workload for /172.31.33.35 from value NULL
 WARN [main] 2015-02-06 09:54:23,036 Workload.java (line 100) Couldn't determine workload for /172.31.33.32 from value NULL
 WARN [main] 2015-02-06 09:54:23,036 Workload.java (line 100) Couldn't determine workload for /172.31.33.36 from value NULL
 WARN [main] 2015-02-06 09:54:23,037 Workload.java (line 100) Couldn't determine workload for /172.31.33.34 from value NULL
 INFO [main] 2015-02-06 09:54:23,038 EndpointStateTracker.java (line 80) EndpointStateTracker started
 INFO [main] 2015-02-06 09:54:23,041 DseDaemon.java (line 441) Waiting for other nodes to become alive...
 WARN [main] 2015-02-06 09:54:33,827 DseDaemon.java (line 444) The following nodes seems to be down: [/172.31.33.35, /172.31.33.32, /172.31.33.36, /172.31.33.34]. Some Cassandra operations may fail with UnavailableException.
 INFO [main] 2015-02-06 09:54:33,827 DseDaemon.java (line 454) Wait for nodes completed
 INFO [main] 2015-02-06 09:54:33,836 PluginManager.java (line 262) Activating plugin: com.datastax.bdp.plugin.DseSystemPlugin
 INFO [main] 2015-02-06 09:54:33,837 PluginManager.java (line 344) Plugin activated: com.datastax.bdp.plugin.DseSystemPlugin
 INFO [main] 2015-02-06 09:54:33,837 PluginManager.java (line 262) Activating plugin: com.datastax.bdp.leases.PeriodicTaskOwnershipPlugin
 INFO [main] 2015-02-06 09:54:33,842 PluginManager.java (line 344) Plugin activated: com.datastax.bdp.leases.PeriodicTaskOwnershipPlugin
 INFO [main] 2015-02-06 09:54:33,919 Server.java (line 156) Starting listening for CQL clients on /54.xxx.180.xxx:9042...
ERROR [main] 2015-02-06 09:54:33,934 DseDaemon.java (line 492) Unable to start DSE server.
org.jboss.netty.channel.ChannelException: Failed to bind to: /54.xxx.180.xxx:9042
    at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
    at org.apache.cassandra.transport.Server.run(Server.java:157)
    at org.apache.cassandra.transport.Server.start(Server.java:108)
    at org.apache.cassandra.service.CassandraDaemon.start(CassandraDaemon.java:443)
    at com.datastax.bdp.server.DseDaemon.start(DseDaemon.java:486)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:509)
    at com.datastax.bdp.server.DseDaemon.main(DseDaemon.java:659)
Caused by: java.net.BindException: Cannot assign requested address
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:444)
    at sun.nio.ch.Net.bind(Net.java:436)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
    at org.jboss.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:366)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:290)
    at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
 INFO [Thread-2] 2015-02-06 09:54:33,937 DseDaemon.java (line 505) DSE shutting down...
 INFO [StorageServiceShutdownHook] 2015-02-06 09:54:33,941 Gossiper.java (line 1307) Announcing shutdown
 INFO [Thread-2] 2015-02-06 09:54:33,954 PluginManager.java (line 304) Deactivating plugin: com.datastax.bdp.leases.PeriodicTaskOwnershipPlugin
 INFO [Thread-2] 2015-02-06 09:54:33,954 PluginManager.java (line 304) Deactivating plugin: com.datastax.bdp.plugin.DseSystemPlugin
 INFO [Thread-2] 2015-02-06 09:54:33,954 PluginManager.java (line 356) All plugins are stopped.
 INFO [Thread-2] 2015-02-06 09:54:33,954 CassandraDaemon.java (line 463) Cassandra shutting down...
 INFO [StorageServiceShutdownHook] 2015-02-06 09:54:35,945 MessagingService.java (line 701) Waiting for messaging service to quiesce
 INFO [ACCEPT-/172.31.33.33] 2015-02-06 09:54:35,945 MessagingService.java (line 941) MessagingService has terminated the accept() thread

Answer 1:

I believe stress acts like any of the DataStax drivers and uses the RPC address of each node to communicate. Currently you have 0.0.0.0 configured, which may be why it's picking your internal AWS address.

There are three addresses available in cassandra.yaml:

  1. Listen Address - This is the IP address other Cassandra nodes use to talk to this node. You want this to be your internal AWS IP address for performance.

  2. RPC Address - This is the address your clients connect to. It is probably the one you want to set to your external AWS address if your client is not sitting in AWS or is not in the same AWS region. This also applies to stress.

  3. Broadcast Address - Use this if you have multiple data centers or AWS regions where not all nodes can reach each other via internal IP. You can specify the external IP address here so that nodes in different data centers can still talk to each other. In many cases you don't need this setting at all; it defaults to your Listen Address.
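
As a rough sketch only, the three settings above might sit in cassandra.yaml like this. The listen_address value is taken from the question; the angle-bracket values are placeholders, not real addresses. Note the caveat from Update 1 in the question: rpc_address has to be an address the node can actually bind to, and the bind failed there when the EC2 public IP was used.

```yaml
# Sketch with placeholder values; adjust per node.

# Address other Cassandra nodes use to reach this node: the internal AWS IP.
listen_address: 172.31.33.32

# Address clients (including cassandra-stress) connect to.
# Placeholder: it must be an address this node can bind to. Update 1 in the
# question shows "Cannot assign requested address" when the EC2 public IP
# was set here.
rpc_address: <client-reachable address>

# Only needed when nodes in other data centers or regions cannot reach this
# node on its internal IP. Left commented out, it defaults to listen_address.
# broadcast_address: <external IP>
```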

Let me know if this helps.