Spring RESTful web services - High volume data pro

2019-04-01 23:46发布

问题:

I'm trying to build a Spring/Spring Boot- RESTful web service,

  1. Which accepts a CSV file with 1 million rows/40 columns in each row as input (From a Angular based front end) and will be a synchronous call. User have to be notified on the upload status before proceeding to other screens. So, wait time can't be more than few mins(say 5 mins).

  2. Each of these rows has to be validated against what is in DB and if found to be valid, same will be inserted into DB. In short, each row can be a separate independent transaction.

Can you please suggest what would be the best approach to implement this?

Current legacy system implements the same functionality in Stored procedures, which made the solution closely coupled with the DB, which will be an issue if the RDBMS needs to be changed down the line.

  1. Any approach on processing these 1 million data in chunks (say 20k) in asynchronous web service calls?

  2. Spring batch?

  3. Can stored procedure be more suitable and better performing than above two options by any chance(guessing no!) ?

Can you please help with some approach which is at least as good as stored procedure and how to horizontally scale based on the suggested solution.

回答1:

You are on the right track with your three suggested options. The answer to your question is unfortunately, it depends.

Any of the above approaches could work for you. I personally prefer Spring Batch as I find the programming model simple and intuitive.

Spring Batch Processing Guide


Another approach would be to use Messaging to parallelize the processing of rows:

  1. Controller receives CSV file containing large amount of data
  2. Split the data into smaller pieces and send to a temporary Message Queue
  3. Multiple Worker Nodes receive Messages and process them
  4. Monitor the size of the temporary Queue and update the user accordingly (% completed)

In short, your knowledge of your own domain will ultimately guide you towards the best solution for your business.