What is the best way to iterate over a large resul

2019-06-06 01:22发布

问题:

We're trying to iterate over a large number of rows from the database and convert those into objects. Behavior will be as follows:

  • Result will be sorted by sequence id, a new object will be created when sequence id changes. The object created will be sent to an external service and will sometimes have to wait before sending another one (which means the next set of data will not be immediately used)
  • We already have invested code in iBatis 3 so an iBatis solution will be the best approach for us (we've tried using RowBounds but haven't seen how it does the iteration under the hood).
  • We'd like to balance minimizing memory usage and reducing number of DB trips.
  • We're also open to pure JDBC approach but we'd like the solution to work on different databases.

UPDATE 1:

  • We need to make as few calls to DB as possible (1 call would be the ideal scenario) while also preventing the application to use too much memory. Are there any other solutions out there for this type of problem may it be pure JDBC or any other technology?

UPDATE 2

  • Query will be backend driven and will only have 1 instance executing at a given time.

Thanks and hope to hear your insights on this.

回答1:

It seems you need some sort of pagination. iBatis does that through the standard LIMIT/OFFSET parameters in the query (and RowBounds in iBatis 3 ).

But it seems (if I get it right) that you also are using the GROUP BY feature of iBatis, so that a select returning N records with N1 distint "idx" fields result in the creation of N1 "parent" objects each one having several children objects (with a total of N children objects created). Or something like that.

Unfortunately (and understandably) both things do not mix well.

I dont' see any silver bullet here, one can think of many approaches, each has its shortcomings - hard to evaluate without more information.

If the main objects are "big" (many records) and each one will be processed individually (with a trip to a remote server) you might even want to do an ad-hoc pagination, with a object per page, remembering internally the previosuly read id (something like SELECT ... FROM ... WHERE id = (SELECT MIN(id) FROM .... WHERE id > #lastid# ) )



回答2:

We need to make as few calls to DB as possible (1 call would be the ideal scenario) while also preventing the application to use too much memory. Are there any other solutions out there for this type of problem may it be pure JDBC or any other technology?

You should really not worry about the amount of DB calls. Just query exactly the data the enduser needs to see at once. It can't be done more efficiently. Google also doesn't query the entire database to show only the first 10. No, it queries exactly those 10 for display, nothing less or more. This is much, much faster and more efficient than hauling/duplicating the entire database into application's memory and working on that. Take benefit of the powers of the RDBMS. That's what it was invented/intended for.



标签: java jdbc ibatis