How to get the last 6 months of data by comparing with a timestamp column using a Cassandra query?
I need to get all account statements that belong to the last 3 or 6 months by comparing updatedTime (a timestamp column) with the current time.
For example, in SQL we use the DateAdd() function for this. I don't know how to do this in Cassandra.
If anyone knows, please reply. Thanks in advance.
Cassandra 2.2 and later allows users to define functions (UDFs) that can be applied to data stored in a table as part of a query result.
If you use Cassandra 2.2 or later, you can create your own function as a UDF.
This function receives two parameters.
It returns the date as a timestamp.
Here is how you can use it:
Here the monthAdd function subtracts 1 month from the current timestamp, so this query will return the data of the last month.
Note: User-defined functions are disabled by default. Set enable_user_defined_functions: true in cassandra.yaml to enable them, if you are aware of the security risks.
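If enabling UDFs is not an option, the same month arithmetic can be done in the application and the result bound as a query parameter on updatedTime. The following is only a minimal Python sketch of that idea, assuming timestamps are handled as UTC datetimes; `months_ago` is a hypothetical helper name, not part of Cassandra or any driver.

```python
import calendar
from datetime import datetime, timezone


def months_ago(now, months):
    """Return `now` shifted back by `months` calendar months.

    The day is clamped to the last day of the target month, so
    31 August minus 6 months becomes 28 or 29 February.
    """
    # Use a zero-based month index so year rollover is plain arithmetic.
    index = now.year * 12 + (now.month - 1) - months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return now.replace(year=year, month=month, day=min(now.day, last_day))


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Cutoffs for the "last 3 months" and "last 6 months" cases.
    print(months_ago(now, 3).isoformat())
    print(months_ago(now, 6).isoformat())
```

The returned value can then be passed as the lower bound of a range condition on the timestamp column, provided the table's primary key allows such a range query.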
In Cassandra you have to design your queries upfront.
Also be aware that you will probably have to bucket the data, depending on the number of accounts you have within some period of time.
If your whole database doesn't contain more than, let's say, 100k entries, you are fine with defining a single generic partition, let's say with the name 'all'. But usually people have a lot of data, which goes into buckets that carry the name of a month, week, or hour. The bucket size depends on the number of inserts you get.
The reason for creating buckets is that every node can find a partition by its partition key. This is the first part of the primary key definition. Then, on every node, the data is sorted by the second piece of information that you pass in to the primary key. Having the data sorted enables you to "scan" over it, i.e. you will be able to retrieve it by giving a timestamp parameter.
Let's say you want to retrieve accounts from the last 6 months and that you are saving all the accounts from one month in the same bucket.
The schema might be something along the lines of:
Usually you will do this at the application level. Merging queries is an anti-pattern, but it is OK for a small number of queries:
Output:
and so on.
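The per-bucket approach described above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the original answer's code: the table name `accounts_by_month`, its columns `bucket` and `updated_time`, the `YYYY-MM` bucket naming, and the `%s` placeholder style are all hypothetical choices made for the example.

```python
from datetime import datetime, timezone

# Hypothetical table with PRIMARY KEY ((bucket), updated_time): rows in one
# bucket live in one partition and are sorted by updated_time.
QUERY = "SELECT * FROM accounts_by_month WHERE bucket = %s AND updated_time >= %s"


def month_buckets(start, end):
    """Return 'YYYY-MM' bucket names from end's month back to start's month."""
    first = start.year * 12 + start.month - 1
    last = end.year * 12 + end.month - 1
    return ["%04d-%02d" % (i // 12, i % 12 + 1) for i in range(last, first - 1, -1)]


def build_queries(cutoff, now):
    """Pair the statement with one (bucket, cutoff) parameter tuple per bucket."""
    return [(QUERY, (bucket, cutoff)) for bucket in month_buckets(cutoff, now)]


if __name__ == "__main__":
    cutoff = datetime(2016, 2, 29, tzinfo=timezone.utc)
    now = datetime(2016, 8, 31, tzinfo=timezone.utc)
    for statement, params in build_queries(cutoff, now):
        # Each pair would be executed separately and the results merged
        # in the application.
        print(statement, params)
```

Note that a six-month window usually touches seven monthly buckets, because the oldest and newest months are both only partially covered.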
If you have something really simple with, let's say, an expected 100,000 entries, then you could use the above schema and just do something like:
Once more, as a wrap-up: with Cassandra you always have to prepare the structures for the access pattern you are going to use.
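For the single-partition case, one query against the 'all' bucket is enough. A minimal Python sketch, again assuming a hypothetical `accounts` table with columns `bucket` and `updated_time`, and approximating the window as a number of days rather than calendar months:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical table with PRIMARY KEY ((bucket), updated_time); every row
# is stored under the single bucket value 'all'.
QUERY = "SELECT * FROM accounts WHERE bucket = 'all' AND updated_time >= %s"


def last_days_params(now, days):
    """Return the bound parameters: a cutoff `days` before `now`."""
    return (now - timedelta(days=days),)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # 180 days is only an approximation of six months.
    print(QUERY, last_days_params(now, 180))
```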