I need a solution for getting the whole table from one DB instance (DB1) and creation the same on another DB instance (DB2). Earlier I used Spring Integration tightly, but I heard that Spring Batch can fit better for such a case and I would like to try it.
So, is it possible/have a sence to use Spring Batch job with the following steps:
- Create an empty table on DB2 having the same schema as a source table from DB1.
- Select from DB1 table -> update DB2 table. In case smth wrong during the step - rollback and delete table on DB2.
Using Spring Integration I see possible solution with JdbcInboundChannelAdapter -> QueueChannel -> OutboundGateway
, but maybe Spring Batch can be a better choice? Any suggestions much appreciated.
It depends on the requirements, which you have.
We used spring-batch to load the DWH initially. But this was not a simple one to one copy. The data in the production DB were stored in a vertical table design (aka entity-attribute-value-model). Moreover, each entry was also bitemporal historized. All in all, these were around 5'000'000'000 rows that had to be grouped and converted to a horizontal database design with respect to the correct time dimensions. And of course, this was from DB2 to Oracle.
So we had to convert the structure significantly, we had to convert the datatypes if necessary, and we had 5 billion rows to process.
If you just have a plain 1:1 copy within the same DB system, then the simplest approach would be to use DB tools.
If you have to transfer between different DB systems, spring batch could be an option.
If you have to convert data, either structure or types, spring batch could be an option.
If you have a lot of rows, you could use partitioning features of spring batch to parallelise or simply start the same job with different id-ranges a couple of time.
If you have several tables to copy, you could use a splitflow to load the data from the various tables in parallel.
At the company I'm working with at the moment, we have completely generic File/Db reader and writer, which are solely based on the DB metamodel. If I want to do a simple copy job, I just have to define the source and the target datasource together with the desired tables and initialize the readers and writers appropriately. If the names match, I don't even have to add configuration to map the data between the tables.
This leads us to the following basic questions
I have solved my task using configuration described in my other post: How should I use .tasklet() / .chunk() to finish job succesfully?. Thanks to all!