Working with arrays in V8 (performance issue)

Posted 2019-03-08 11:04

Question:

I tried the following code (it shows similar results in Google Chrome and Node.js):

var t = new Array(200000); console.time('wtf'); for (var i = 0; i < 200000; ++i) {t.push(Math.random());} console.timeEnd('wtf');
wtf: 27839.499ms
undefined

I also ran the following tests:

var t = []; console.time('wtf'); for (var i = 0; i < 400000; ++i) {t.push(Math.random());} console.timeEnd('wtf');
wtf: 449.948ms
undefined
var t = []; console.time('wtf'); for (var i = 0; i < 400000; ++i) {t.push(undefined);} console.timeEnd('wtf');
wtf: 406.710ms
undefined

In Firefox, however, the first variant looks fine:

>>> var t = new Array(200000); console.time('wtf'); ...{t.push(Math.random());} console.timeEnd('wtf');
wtf: 602ms

What happens in V8?

UPD: *magically decreasing performance*

var t = new Array(99999); console.time('wtf'); for (var i = 0; i < 200000; ++i) {t.push(Math.random());} console.timeEnd('wtf');
wtf: 220.936ms
undefined
var t = new Array(100000); t[99999] = 1; console.time('wtf'); for (var i = 0; i < 200000; ++i) {t.push(Math.random());} console.timeEnd('wtf');
wtf: 1731.641ms
undefined
var t = new Array(100001); console.time('wtf'); for (var i = 0; i < 200000; ++i) {t.push(Math.random());} console.timeEnd('wtf');
wtf: 1703.336ms
undefined
var t = new Array(180000); console.time('wtf'); for (var i = 0; i < 200000; ++i) {t.push(Math.random());} console.timeEnd('wtf');
wtf: 1725.107ms
undefined
var t = new Array(181000); console.time('wtf'); for (var i = 0; i < 200000; ++i) {t.push(Math.random());} console.timeEnd('wtf');
wtf: 27587.669ms
undefined

Answer 1:

If you preallocate, do not use .push, because you will create a sparse array backed by a hash table. You can preallocate a sparse array of up to 99,999 elements and it will still be backed by a C array; beyond that, the backing store is a hash table.

With the second array you are adding elements contiguously starting from index 0, so it is backed by a real C array, not a hash table.
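
To make the pitfall concrete, here is a small sketch (my own illustration, not from the original answer): .push always appends at index length, so on a preallocated array it writes after the holes and the array stays sparse. Writing by index from 0 fills the preallocated slots instead:

var t = new Array(200000);   // length is 200000, every slot is a hole
t.push(1);                   // writes at index 200000; length becomes 200001
console.log(t.length);       // 200001 -- the 200000 holes are still there

// Fast alternative: fill the preallocated slots by index.
var u = new Array(200000);
for (var i = 0; i < 200000; ++i) {
  u[i] = Math.random();      // contiguous writes starting at index 0
}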

So roughly:

If your array indices run nicely from 0 to length - 1, with no holes, the array can be represented by a fast C array. If there are holes in the array, it will be represented by a much slower hash table. The exception is that if you preallocate an array of size < 100,000, you can have holes in the array and still be backed by a C array:

var a = new Array(N);

// If N < 100000, this will not turn the array into a hash table:
a[50000] = "sparse";

var b = []; // or new Array(N) with N >= 100000
// b will be backed by a hash table:
b[50000] = "sparse";
// b.push("sparse") has roughly the same effect as the line above if you
// created b with new Array(N), N > 0, since push appends after the holes.


Answer 2:

As you probably already know, if you preallocate an array with more than 100,000 elements in Chrome or Node (or more generally, in V8), it falls back to dictionary mode, making things uber-slow.
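
You can observe the transition directly in Node by enabling V8's natives syntax. This is a debugging sketch of mine, not part of the original answer: the %-prefixed functions are V8 internals, so they may change or disappear between versions.

// Run with: node --allow-natives-syntax check.js  (check.js is a placeholder name)
var fast = new Array(99999);
var slow = new Array(100001);
console.log(%HasDictionaryElements(fast)); // false: still fast (holey) elements
console.log(%HasDictionaryElements(slow)); // true: fell back to dictionary mode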

Thanks to some of the comments in this thread, I was able to track this down to kInitialMaxFastElementArray in V8's objects.h (100,000 at the time of writing).

I then used that information to file an issue in the v8 repository, which is now starting to gain some traction, but it will still take a while. And I quote:

I hope we'll be able to do this work eventually. But it's still probably a ways away.