I intend to use Apache Flink to read data from and write data to Cassandra. I was hoping to use flink-connector-cassandra, but I can't find good documentation or examples for the connector.
Can you please point me to the right way to read and write Cassandra data with Apache Flink? I see only sink examples, which are purely for writing. Is Apache Flink also meant for reading data from Cassandra, similar to Apache Spark?
You can write a class that extends RichFlatMapFunction. Basically, the open() function executes once per worker and flatMap() executes once per record. The example is for Mongo, but it can be used similarly for Cassandra.
In your case, as I understand it, the first step of your pipeline is reading data from Cassandra, so rather than writing a RichFlatMapFunction you should write your own RichSourceFunction.
As a reference, you can have a look at the simple implementation of WikipediaEditsSource.
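The lifecycle described above (open once per worker, flatMap once per record) can be sketched in plain Java without any Flink dependency. This is only an illustration of why the Cassandra session would be created in open() rather than in flatMap(); RichFn, LookupFn, and runWorker are hypothetical stand-ins, not Flink classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class LifecycleSketch {

    /** Hypothetical stand-in for a rich function: open once, flatMap per record, close once. */
    interface RichFn<I, O> {
        void open();
        void flatMap(I value, Consumer<O> out);
        void close();
    }

    /** Hypothetical lookup function that counts how often each hook is invoked. */
    static class LookupFn implements RichFn<String, String> {
        int opens = 0;
        int calls = 0;

        @Override
        public void open() {
            // A real function would build its Cassandra session here, once per worker.
            opens++;
        }

        @Override
        public void flatMap(String key, Consumer<String> out) {
            // Called for every record; may emit zero or more results.
            calls++;
            if (!key.isEmpty()) {
                out.accept(key + ":row");
            }
        }

        @Override
        public void close() {
            // A real function would close the session here.
        }
    }

    /** Simulates a single worker driving the function over its share of the records. */
    static <I, O> List<O> runWorker(RichFn<I, O> fn, List<I> records) {
        List<O> out = new ArrayList<>();
        fn.open();
        try {
            for (I record : records) {
                fn.flatMap(record, out::add);
            }
        } finally {
            fn.close();
        }
        return out;
    }

    public static void main(String[] args) {
        LookupFn fn = new LookupFn();
        List<String> result = runWorker(fn, Arrays.asList("a", "", "b"));
        // Three records are processed, but the "connection" is opened only once.
        System.out.println(result + " opens=" + fn.opens + " calls=" + fn.calls);
    }
}
```

A source function follows the same idea, except that instead of receiving records it produces them in its own loop until it is cancelled.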
I had the same question, and this is what I was looking for. I don't know if it is oversimplified for what you need, but I figured I should show it nonetheless.
I figured this out by finding the code for the "CassandraInputFormat" class and seeing how it worked (http://www.javatips.net/api/flink-master/flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormat.java). Based on the name, I honestly expected it to be just a format and not the full class for reading from Cassandra, and I have a feeling others might be thinking the same thing.
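For anyone landing here, a minimal sketch of a batch read with CassandraInputFormat might look like the following. It assumes an older Flink release where the DataSet API and the batch CassandraInputFormat ship with flink-connector-cassandra, plus the DataStax Java driver on the classpath. The contact point 127.0.0.1, the keyspace demo, the table users, and its id/name columns are hypothetical, as is the selectQuery helper.

```java
import com.datastax.driver.core.Cluster;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.batch.connectors.cassandra.CassandraInputFormat;
import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;

public class CassandraReadSketch {

    /** Hypothetical helper: builds the CQL query for an assumed (id int, name text) table. */
    static String selectQuery(String keyspace, String table) {
        return "SELECT id, name FROM " + keyspace + "." + table + ";";
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Tells the connector how to reach the cluster; the address is an assumption.
        ClusterBuilder cluster = new ClusterBuilder() {
            @Override
            protected Cluster buildCluster(Cluster.Builder builder) {
                return builder.addContactPoint("127.0.0.1").build();
            }
        };

        // The input format runs the query and maps each row to a Flink tuple.
        DataSet<Tuple2<Integer, String>> rows = env.createInput(
                new CassandraInputFormat<Tuple2<Integer, String>>(
                        selectQuery("demo", "users"), cluster),
                TypeInformation.of(new TypeHint<Tuple2<Integer, String>>() {}));

        // print() triggers execution of the batch job.
        rows.print();
    }
}
```

The tuple type has to match the selected columns in order and type, so a different query needs a different Tuple arity and field types.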