I'm using pg-promise and I want to make multiple inserts into one table. I've seen some solutions like Multi-row insert with pg-promise and How do I properly insert multiple rows into PG with node-postgres?, and I know I could use pgp.helpers.concat in order to concatenate multiple insert queries.
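For context, a simplified sketch of what I mean (the measurements table, its columns, the two batches and the connection details are just placeholders):

```js
const pgp = require('pg-promise')();
const db = pgp('postgres://user:password@localhost:5432/mydb'); // placeholder connection

// placeholder table and columns
const cs = new pgp.helpers.ColumnSet(['time', 'value'], {table: 'measurements'});

const batch1 = [{time: new Date(), value: 1}];
const batch2 = [{time: new Date(), value: 2}];

// each helpers.insert() generates one multi-row INSERT statement;
// helpers.concat() joins them into a single multi-statement query
const query = pgp.helpers.concat([
    pgp.helpers.insert(batch1, cs),
    pgp.helpers.insert(batch2, cs)
]);

db.none(query)
    .then(() => { /* success */ })
    .catch(error => { /* error */ });
```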
But now I need to insert a lot of measurements into a table, more than 10,000 records, and the Performance Boost article (https://github.com/vitaly-t/pg-promise/wiki/Performance-Boost) says: "How many records you can concatenate like this - depends on the size of the records, but I would never go over 10,000 records with this approach. So if you have to insert many more records, you would want to split them into such concatenated batches and then execute them one by one."
I read the whole article, but I can't figure out how to "split" my inserts into batches and then execute them one by one.
Thanks!
I think the naive approach would work.
Try splitting your data into chunks of 10,000 records or fewer; you could split the array using the solution from this post.
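For example, the splitting could be done with a small helper like this (just a sketch; 10,000 is the upper limit from the article quoted above):

```js
// split an array into chunks of at most `size` elements
function chunk(arr, size) {
    const result = [];
    for (let i = 0; i < arr.length; i += size) {
        result.push(arr.slice(i, i + size));
    }
    return result;
}

// e.g. chunk(records, 10000) turns 25,000 records into 3 arrays
```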
Then, run a multi-row insert for each chunk with pg-promise and execute them one by one inside a transaction.
Edit: Thanks to @vitaly-t for the wonderful library and for improving my answer.
To do this, use the batch function from pg-promise to resolve all queries asynchronously:
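Something along these lines should work (a sketch only; the measurements table, its columns and the connection are placeholders, and data is assumed to be your full array of row objects):

```js
const pgp = require('pg-promise')();
const db = pgp('postgres://user:password@localhost:5432/mydb'); // placeholder connection

const cs = new pgp.helpers.ColumnSet(['time', 'value'], {table: 'measurements'});

const data = [/* your 10,000+ row objects, e.g. {time, value} */];

db.tx(t => {
    const queries = [];
    // one multi-row INSERT per chunk of at most 10,000 records
    for (let i = 0; i < data.length; i += 10000) {
        const insert = pgp.helpers.insert(data.slice(i, i + 10000), cs);
        queries.push(t.none(insert));
    }
    return t.batch(queries); // settles all inserts within the transaction
})
    .then(() => { /* success - transaction committed */ })
    .catch(error => { /* error - transaction rolled back */ });
```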
UPDATE
Best is to read the following article: Data Imports.
As the author of pg-promise I was compelled to finally provide the right answer to the question, as the one published earlier didn't really do it justice.
In order to insert a massive/infinite number of records, your approach should be based on method sequence, which is available within tasks and transactions.
This is the best approach to inserting a massive number of rows into the database, both from the performance point of view and for load throttling.
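A minimal sketch of that pattern, along the lines of the Data Imports article; the measurements table, its columns and the stand-in body of getData are placeholders for your own logic:

```js
const pgp = require('pg-promise')();
const db = pgp('postgres://user:password@localhost:5432/mydb'); // placeholder connection

const cs = new pgp.helpers.ColumnSet(['time', 'value'], {table: 'measurements'});

// must return a promise that resolves with the next 1,000 - 10,000 objects,
// or with null/undefined when there is no more data to insert;
// this stand-in returns one small batch and then stops
function getData(index) {
    if (index === 0) {
        return Promise.resolve([{time: new Date(), value: 123}]);
    }
    return Promise.resolve(null);
}

db.tx('massive-insert', t => {
    return t.sequence(index => {
        return getData(index)
            .then(data => {
                if (data && data.length) {
                    const insert = pgp.helpers.insert(data, cs);
                    return t.none(insert);
                }
                // resolving with undefined ends the sequence
            });
    });
})
    .then(() => {
        // success - all batches inserted, transaction committed
    })
    .catch(error => {
        // error - transaction rolled back
    });
```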
All you have to do is implement your function getData according to the logic of your app, i.e. where your large data is coming from, based on the index of the sequence, to return some 1,000 - 10,000 objects at a time, depending on the size of objects and data availability.
See also some API examples in the pg-promise documentation.
Related question: node-postgres with massive amount of queries.
And in cases where you need to acquire the generated id-s of all the inserted records, you would change two lines: append RETURNING id to the generated insert query, and use method map instead of none, to collect the id-s.
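A sketch of what those two changes might look like inside the sequence callback from the example above, assuming a serial column named id:

```js
if (data && data.length) {
    // 1) ask the generated insert to return the new id-s
    const insert = pgp.helpers.insert(data, cs) + ' RETURNING id';
    // 2) use method map instead of none, to collect the id-s as numbers
    return t.map(insert, [], row => +row.id);
}
```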
Just be careful, as keeping too many record id-s in memory can create an overload.