Why are two calls to string.charCodeAt() faster th

Posted 2019-05-10 21:26

Question:

I discovered a weird behavior in Node.js/Chrome/V8. It seems this code:

var x = str.charCodeAt(5);
x = str.charCodeAt(5);

is faster than this:

var x = str.charCodeAt(5); // x is not greater than 170
if (x > 170) {
  x = str.charCodeAt(5);
}

At first I thought the comparison might be more expensive than the actual second call, but when the body of the if block does not call str.charCodeAt(5), the performance is the same as with a single call.

Why is this? My best guess is that V8 is optimizing or deoptimizing something, but I have no idea how exactly to figure this out or how to prevent it from happening.

Here is a jsperf link that demonstrates this behavior pretty well, at least on my machine: https://jsperf.com/charcodeat-single-vs-ifstatment/1


Background: I discovered this while trying to optimize the token reading inside babel-parser.

In my tests, str.charCodeAt() was twice as fast as str.codePointAt(), so I thought I could replace this code:

var x = str.codePointAt(index);

with

var x = str.charCodeAt(index);
if (x >= 0xaa) {
  x = str.codePointAt(index);
}

But the second version does not give me any performance advantage because of the behavior described above.
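For reference, here is a minimal sketch of the proposed fast path wrapped in a function, so that it can be checked against plain str.codePointAt(). The name readCodePoint is hypothetical and not taken from babel-parser; the 0xaa threshold is the one used in the question. For in-range indices the two approaches should agree, since charCodeAt() and codePointAt() only differ for surrogate code units, which are far above 0xaa.

```javascript
// Hypothetical helper (not from babel-parser): try the cheaper charCodeAt()
// first and only fall back to codePointAt() for code units >= 0xaa.
function readCodePoint(str, index) {
  const unit = str.charCodeAt(index);
  if (unit >= 0xaa) {
    // May be the start of a surrogate pair, so ask for the full code point.
    return str.codePointAt(index);
  }
  return unit;
}

console.log(readCodePoint("a\u{1F600}", 0)); // 97
console.log(readCodePoint("a\u{1F600}", 1)); // 128512 (0x1F600)
```

One difference to keep in mind: for an out-of-range index, charCodeAt() returns NaN while codePointAt() returns undefined, so this helper returns NaN where the original code would have produced undefined.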

Answer 1:

V8 developer here. As Bergi points out: don't use microbenchmarks to inform such decisions, because they will mislead you.

Seeing a result of hundreds of millions of operations per second usually means that the optimizing compiler was able to eliminate all your code, and you're measuring empty loops. You'll have to look at generated machine code to see if that's what's happening.

When I copy the four snippets into a small stand-alone file for local investigation, I see vastly different performance results. Which of the two is closer to your real-world use case? No idea. And that kind of makes any further analysis of what's happening here meaningless.

As a general rule of thumb, branches are slower than straight-line code (on all CPUs, and with all programming languages). So (dead code elimination and other microbenchmarking pitfalls aside) I wouldn't be surprised if the "twice" case actually were faster than either of the two "if" cases. That said, calling String.charCodeAt could well be heavyweight enough to offset this effect.
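As a rough illustration of the dead-code-elimination concern raised above, here is a minimal sketch of a stand-alone Node.js file that makes each result observable by folding it into a running sum. The names timeIt, twice, and branch are hypothetical, and the harness is an assumption for illustration, not the setup used in the answer. Even with a sink like this, the optimizing compiler may still transform the loop, so the timings are indicative at best.

```javascript
// Hypothetical harness: timeIt, twice, and branch are illustrative names only.
const str = "abcdefghijklmnop"; // 16 characters, all with char codes below 170

// Runs fn `iterations` times and adds every result to `sink`,
// so the calls have an effect that is visible outside the loop.
function timeIt(label, fn, iterations) {
  let sink = 0;
  const start = process.hrtime.bigint(); // Node.js high-resolution timer
  for (let i = 0; i < iterations; i++) {
    sink += fn(i & 15); // vary the index so the call is not loop-invariant
  }
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { label, sink, elapsedMs };
}

// The "two calls" variant from the question.
function twice(index) {
  let x = str.charCodeAt(index);
  x = str.charCodeAt(index);
  return x;
}

// The "if" variant from the question; the branch is never taken for this input.
function branch(index) {
  let x = str.charCodeAt(index);
  if (x > 170) {
    x = str.charCodeAt(index);
  }
  return x;
}

const iterations = 1e6;
const results = [
  timeIt("twice", twice, iterations),
  timeIt("branch", branch, iterations),
];
for (const r of results) {
  console.log(`${r.label}: ${r.elapsedMs.toFixed(2)} ms (sink=${r.sink})`);
}
```

Both variants should report the same sink value, which confirms that they compute the same thing. Running the file with V8's --trace-opt and --trace-deopt flags (for example, node --trace-opt --trace-deopt file.js) is one way to see whether the functions are being optimized or deoptimized along the way.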