In Spark Streaming, should we offload the saving part to another layer, because the Spark Streaming context is not available when we use the Spark Cassandra Connector (our database is Cassandra)? Moreover, even if we use some other database to save our data, we need to create a connection on the worker every time we process a batch of RDDs, because connection objects are not serializable.
Is it recommended to create/close connections at workers?
It would also make our system tightly coupled with the existing database; tomorrow we may change the database.
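For reference, the kind of direct write the Spark Cassandra Connector offers looks roughly like this; a minimal sketch, where the socket source, contact point, keyspace and table names are all placeholders:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import com.datastax.spark.connector.streaming._

object StreamToCassandra {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("stream-to-cassandra")
      .set("spark.cassandra.connection.host", "127.0.0.1") // placeholder contact point

    val ssc = new StreamingContext(conf, Seconds(5))

    // Placeholder source: lines of "word,count" arriving on a local socket.
    val counts = ssc.socketTextStream("localhost", 9999)
      .map(_.split(","))
      .map(fields => (fields(0), fields(1).toInt))

    // The connector opens and caches Cassandra sessions on the workers itself,
    // so no connection object has to be serialized from the driver.
    // Keyspace and table names are placeholders.
    counts.saveToCassandra("my_keyspace", "word_counts")

    ssc.start()
    ssc.awaitTermination()
  }
}
```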
To answer your questions:
Possible duplicate of: Handle database connection inside spark streaming
Read this link; it should clarify some of your questions: Design Patterns for using foreachRDD
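In short, the pattern described there is to create (or borrow) the connection inside foreachPartition, so it lives entirely on the worker and its cost is amortized over a whole partition rather than paid per record. A minimal sketch, assuming a generic JDBC sink and a hypothetical ConnectionPool helper:

```scala
import java.sql.{Connection, DriverManager}
import org.apache.spark.streaming.dstream.DStream

// Hypothetical helper. A real implementation would be a lazily initialized,
// bounded pool (e.g. HikariCP) living as a singleton on each executor; here
// it just opens and closes a JDBC connection so the sketch stays self-contained.
object ConnectionPool {
  def getConnection(): Connection =
    DriverManager.getConnection("jdbc:postgresql://localhost/streamdb") // placeholder URL
  def returnConnection(conn: Connection): Unit = conn.close()
}

def saveStream(records: DStream[String]): Unit = {
  records.foreachRDD { rdd =>
    rdd.foreachPartition { partitionOfRecords =>
      // The connection is obtained on the worker, once per partition,
      // so nothing non-serializable ever leaves the driver.
      val connection = ConnectionPool.getConnection()
      partitionOfRecords.foreach { record =>
        val stmt = connection.prepareStatement(
          "INSERT INTO events (payload) VALUES (?)") // placeholder table/schema
        stmt.setString(1, record)
        stmt.executeUpdate()
        stmt.close()
      }
      ConnectionPool.returnConnection(connection)
    }
  }
}
```

Creating connections at the workers is the recommended approach; the middle ground is one connection per partition (or a pooled connection reused across batches), since a per-record connection is too expensive and a single driver-side connection cannot be serialized to the workers.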
Hope this helps!