I am inserting time series data, using the timestamp (T) as the column name, into a wide column family that stores 24 hours' worth of data in a single row. Streaming data is written by a data generator (4 instances, each with 256 threads) that inserts into multiple rows in parallel.
CF2 (Wide column family):
RowKey1 (T1, V1) (T2, V3) (T4, V4) ......
RowKey2 (T1, V1) (T3, V3) .....
:
:
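To make the layout concrete, here is a minimal sketch of how one data point maps to a cell. The `<source>:<YYYYMMDD>` row key format, the millisecond timestamps, and the function name `to_wide_row_cell` are assumptions for illustration only; the actual row key scheme is not shown here.

```python
import time


def to_wide_row_cell(source_id, ts_millis, value):
    """Map one data point to (row_key, column_name, column_value).

    Assumption: one row per source per UTC day, so a row holds 24 hours
    of data, and the timestamp itself is the column name.
    """
    # Bucket the timestamp into a UTC day, e.g. "19700101".
    day = time.strftime("%Y%m%d", time.gmtime(ts_millis // 1000))
    row_key = "%s:%s" % (source_id, day)
    return row_key, ts_millis, value
```

With this scheme, every point from the same source and day lands in the same row, which is why a single row grows with the data density.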
I am now attempting to read this data back from Cassandra using multiget. The client is written in Python and uses pycassa.
When I query 24 hours' worth of data for a single row key, over 2 million data points are returned. With 4 row keys in parallel using multiget, I get the following error:
  File "c:\Python27-64bit\lib\site-packages\pycassa-1.9.1-py2.7.egg\pycassa\columnfamily.py", line 772, in multiget
    packed_keys[offset:offset + buffer_size], cp, sp, consistency)
  File "c:\Python27-64bit\lib\site-packages\pycassa-1.9.1-py2.7.egg\pycassa\pool.py", line 576, in execute
    return getattr(conn, f)(*args, **kwargs)
  File "c:\Python27-64bit\lib\site-packages\pycassa-1.9.1-py2.7.egg\pycassa\pool.py", line 153, in new_f
    return new_f(self, *args, **kwargs)
  File "c:\Python27-64bit\lib\site-packages\pycassa-1.9.1-py2.7.egg\pycassa\pool.py", line 153, in new_f
    return new_f(self, *args, **kwargs)
  File "c:\Python27-64bit\lib\site-packages\pycassa-1.9.1-py2.7.egg\pycassa\pool.py", line 153, in new_f
    return new_f(self, *args, **kwargs)
  File "c:\Python27-64bit\lib\site-packages\pycassa-1.9.1-py2.7.egg\pycassa\pool.py", line 153, in new_f
    return new_f(self, *args, **kwargs)
  File "c:\Python27-64bit\lib\site-packages\pycassa-1.9.1-py2.7.egg\pycassa\pool.py", line 153, in new_f
    return new_f(self, *args, **kwargs)
  File "c:\Python27-64bit\lib\site-packages\pycassa-1.9.1-py2.7.egg\pycassa\pool.py", line 148, in new_f
    (self._retry_count, exc.__class__.__name__, exc))
pycassa.pool.MaximumRetryException: Retried 6 times. Last failure was timeout: timed out
Previously, however, I was able to fetch data for 256 row keys in parallel. Since then I have increased the density of the data (i.e. the number of data points within a given time range), and the queries now fail with this error.
We tweaked the buffer_size in multiget and found 8 to be the sweet spot.
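For reference, the read looks roughly like the sketch below. The helper name `read_rows`, the `keys_per_call` grouping, and the `column_count` value are assumptions for illustration; `cf` is expected to be a `pycassa.ColumnFamily` for CF2, and only its `multiget()` method with the `column_count` and `buffer_size` arguments is used.

```python
def read_rows(cf, row_keys, keys_per_call=4, buffer_size=8, column_count=3000000):
    """Read wide rows a few keys at a time and merge the results.

    cf: a pycassa.ColumnFamily (or any object with a compatible multiget()).
    keys_per_call: how many row keys are passed to each multiget() call.
    buffer_size: how many of those keys pycassa sends per Thrift request.
    column_count: upper bound on columns returned per row; pycassa's
        default is 100, so it must be raised to read a full day
        (the value here is an assumed placeholder).
    """
    results = {}
    for start in range(0, len(row_keys), keys_per_call):
        batch = row_keys[start:start + keys_per_call]
        # multiget() returns a mapping of row key -> {column name: value}.
        rows = cf.multiget(batch, column_count=column_count, buffer_size=buffer_size)
        results.update(rows)
    return results
```

Each call still pulls every requested column of every key in the batch in one response, so the amount of data per request grows with the data density.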
HeapSize: 6GB
RAM: 16GB
Read Consistency: ONE
Cluster: 5 node
The CPU utilization is less than 20%. Also, I do not see any abnormal patterns in Read throughput, read latency, disk throughput & disk latency as reported by OpsCenter.
I also tried increasing read_request_timeout_in_ms in cassandra.yaml to 60000, but to no avail. Are there any other pointers as to why we are getting this exception? I would expect the query to take a long time to return its results, but not to fail.
Thanks,
VS