I am looking for a good way to store application logs in Cassandra.
I have a three-node setup (Node 1, Node 2 and Node 3) in which my web application runs as a load-balanced cluster across all three nodes, so logs are generated on every node.
Cassandra also runs on all three nodes, and each web application instance writes its logs into the Cassandra cluster, which is partitioned by day.
Problems with this approach:
1) My web application itself writes the log data to Cassandra.
2) The amount of data in each daily partition is very high.
So is there a better approach for this?
Is this a good design approach?
The choice of storing logs in Cassandra is debatable, as analyzing that data becomes difficult (though doable). ELK (Elasticsearch-Logstash-Kibana) or Splunk are more popular choices for log analysis because of their native full-text search support and dashboards.
Having said that, let's look at the problems at hand.
The suggestions that come to my mind here are:
Revisit the design by listing all the log-analysis questions (queries) that this C* database would have to answer. With Cassandra's query-first modeling, the table design should then fall out naturally.
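For the oversized daily partitions specifically, a common Cassandra technique is to split each day into a fixed number of buckets, making the bucket part of the partition key. Below is a minimal sketch of that idea; the bucket count, table layout in the docstring, and `source_id` attribute are illustrative assumptions, not part of your current design:

```python
# Sketch: split one oversized daily partition into fixed-size buckets by
# making the bucket number part of the partition key, e.g. (hypothetical CQL):
#
#   CREATE TABLE logs (
#       log_date text, bucket int, ts timestamp, source_id text, message text,
#       PRIMARY KEY ((log_date, bucket), ts)
#   );
import zlib
from datetime import datetime, timezone

# Tune from measured daily volume / target partition size (assumed value).
BUCKETS_PER_DAY = 16

def partition_key(event_time: datetime, source_id: str) -> tuple:
    """Return a composite partition key (day, bucket) for a log event.

    Hashing a stable attribute (here, an assumed node or request id) with a
    deterministic hash spreads one day's logs across BUCKETS_PER_DAY
    partitions instead of concentrating them in a single daily partition.
    """
    day = event_time.astimezone(timezone.utc).strftime("%Y-%m-%d")
    bucket = zlib.crc32(source_id.encode("utf-8")) % BUCKETS_PER_DAY
    return day, bucket
```

At read time, a query for one day fans out over the `BUCKETS_PER_DAY` partitions (one query per bucket), which is the usual trade-off of this pattern: bounded partition size in exchange for a small, fixed fan-out.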