I am planning to run a large number of queries against Firebase, which could grow from a few hundred thousand to millions. I've been using Promise.all() to resolve most of my queries, but as the number of requests grows, Promise.all() seems to just stop running at a random point. I've looked into using Promise.map(), but I'm not sure whether limiting concurrency will solve the problem. Thank you for your help.
Below is a simplified example. As you can see, it appears to just time out without throwing any error:
var promises = [];
var i = 0;
for (var x = 0; x < 1000000; x++) {
  const promise = new Promise((resolve, reject) => {
    setTimeout(() => {
      i += 1;
      resolve(i);
    }, 10);
  });
  promises.push(promise);
}
Promise.all(promises).then((value) => {
  console.log(value);
}).catch((error) => {
  console.log("Error making js node work:" + error);
});
When I need to do something like this, I usually divide the queries into batches. The batches run one-by-one, but the queries in each batch run in parallel. Here's what that might look like.
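const _ = require('lodash');

// Assumed value; tune it to what the downstream system can handle.
const BATCH_SIZE = 100;

// `runQuery` is a placeholder for a function that runs one query and returns a promise.
async function runAllQueries(queries) {
  const batches = _.chunk(queries, BATCH_SIZE);
  const results = [];
  while (batches.length) {
    const batch = batches.shift();
    // Run this batch in parallel and wait for it before starting the next one.
    const result = await Promise.all(batch.map(runQuery));
    results.push(result);
  }
  return _.flatten(results);
}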
What you see here is similar to a map-reduce. That said, if you're running a large number of queries in a single node (e.g., a single process or virtual machine), you might consider distributing the queries across multiple nodes. If the number of queries is very large and the order in which the queries are processed is not important, this is probably a no-brainer. You should also be sure that the downstream system (i.e., the one you're querying) can handle the load you throw at it.
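One way to keep the load on the downstream system bounded, without waiting for the slowest query in each batch, is to cap the number of queries in flight at any moment. This is similar in spirit to the concurrency option of Bluebird's Promise.map(). The following is only a minimal sketch with no dependencies: runWithConcurrency, fakeQuery, and the limit of 2 are hypothetical names and values chosen for illustration, not part of Firebase or any library.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit` calls in flight.
async function runWithConcurrency(items, worker, limit) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to claim

  // Each runner claims the next unprocessed index until none remain.
  // JavaScript runs this code on a single thread, so `next++` cannot race.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Start no more runners than there are items, and at least one.
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  const runners = Array.from({ length: runnerCount }, runner);
  await Promise.all(runners);
  return results; // same order as `items`
}

// Stand-in for a real query (assumption): resolves after 10 ms.
const fakeQuery = (id) =>
  new Promise((resolve) => setTimeout(() => resolve(id * 2), 10));

runWithConcurrency([1, 2, 3, 4, 5], fakeQuery, 2).then((values) => {
  console.log(values);
});
```

Compared with fixed batches, a new query starts as soon as any earlier one finishes, so one slow query does not hold up the rest of its batch. If any query rejects, the returned promise rejects as well, so retries or per-query error handling would need to live inside the worker.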