
Midpoint 'rounding' when dealing with large numbers

Published 2019-06-24 01:16

Question:

So I was trying to understand JavaScript's behavior when dealing with large numbers. Consider the following (tested in Firefox and Chrome):

console.log(9007199254740993) // 9007199254740992
console.log(9007199254740994) // 9007199254740994
console.log(9007199254740995) // 9007199254740996
console.log(9007199254740996) // 9007199254740996
console.log(9007199254740997) // 9007199254740996
console.log(9007199254740998) // 9007199254740998
console.log(9007199254740999) // 9007199254741000

Now, I'm aware of why it's outputting the 'wrong' numbers—it's trying to convert them to floating point representations and it's rounding off to the nearest possible floating point value—but I'm not entirely sure about why it picks these particular numbers. My guess is that it's trying to round to the nearest 'even' number, and since 9007199254740996 is divisible by 4 while 9007199254740994 is not, it considers 9007199254740996 to be more 'even'.

  • What algorithm is it using to determine the internal representation? My guess is that it's an extension of regular midpoint rounding (round to even is the default rounding mode in IEEE 754 functions).
  • Is this behavior specified as part of the ECMAScript standard, or is it implementation dependent?
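
For reference, the spacing of representable values in this range can be checked directly from the console (a minimal check, nothing engine-specific):

console.log(Number.MAX_SAFE_INTEGER)                // 9007199254740991, i.e. 2^53 - 1
console.log(Number.isSafeInteger(9007199254740993)) // false: above 2^53, adjacent doubles are 2 apart
console.log(9007199254740993 === 9007199254740992)  // true: the odd literal has already been rounded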

Answer 1:

As pointed out by Mark Dickinson in a comment on the question, the ECMA-262 ECMAScript Language Specification requires the use of IEEE 754 64-bit binary floating point to represent the Number Type. The relevant rounding rules are "Choose the member of this set that is closest in value to x. If two values of the set are equally close, then the one with an even significand is chosen...".

These rules are general, applying to rounding results of arithmetic as well as the values of literals.
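
For instance, the same ties-to-even choice shows up in arithmetic results (a small illustration using the same values as the question):

console.log(9007199254740992 + 1) // 9007199254740992: the exact sum is halfway between two doubles, and the even significand wins
console.log(9007199254740992 + 3) // 9007199254740996: again exactly halfway, rounded to the even significand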

The following are all the numbers in the relevant range for the question that are exactly representable in IEEE 754 64-bit binary floating point. Each is shown as its decimal value, and also as a hexadecimal representation of its bit pattern. A number with an even significand has an even rightmost hexadecimal digit in its bit pattern.

9007199254740992 bit pattern 0x4340000000000000
9007199254740994 bit pattern 0x4340000000000001
9007199254740996 bit pattern 0x4340000000000002
9007199254740998 bit pattern 0x4340000000000003
9007199254741000 bit pattern 0x4340000000000004

Each of the even inputs is one of these numbers, and rounds to that number. Each of the odd inputs is exactly half way between two of them, and rounds to the one with the even significand. This results in rounding the odd inputs to 9007199254740992, 9007199254740996, and 9007199254741000.
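
These bit patterns can be verified from JavaScript itself; here is a minimal sketch using a DataView to read back the bits actually stored (the helper name bits is just for illustration):

// Read back the 64-bit pattern stored for a Number, as big-endian hex.
function bits(x) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x); // big-endian by default
  return '0x' + view.getBigUint64(0).toString(16).padStart(16, '0');
}

console.log(bits(9007199254740993)) // 0x4340000000000000: the odd literal rounded to the even significand
console.log(bits(9007199254740995)) // 0x4340000000000002
console.log(bits(9007199254740999)) // 0x4340000000000004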



Answer 2:

Patricia Shanahan's answer helped a lot and explained my primary question. However, as to the second part of the question (whether or not this behavior is implementation-dependent), it turns out that yes, it is, but in a slightly different way than I originally thought. Quoting from ECMA-262 5.1 § 7.8.3:

… the rounded value must be the Number value for the MV (as specified in 8.5), unless the literal is a DecimalLiteral and the literal has more than 20 significant digits, in which case the Number value may be either the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit or the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit and then incrementing the literal at the 20th significant digit position.

In other words, an implementation may choose to ignore everything after the 20th digit. Consider this:

console.log(9007199254740993.00001)

Both Chrome and Firefox will output 9007199254740994; however, Internet Explorer will output 9007199254740992 because it chooses to ignore the digits after the 20th. Interestingly, this doesn't appear to be standards-compliant behavior (at least as I read the standard): it should interpret this the same as 9007199254740993.0001, but it does not.
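
For comparison, the two readings the specification permits for that literal can be written out as literals of their own (all engines should agree on these, since neither exceeds 20 significant digits):

console.log(9007199254740993.00000) // 9007199254740992: digits after the 20th replaced with 0 (trailing zeros are not significant, so this is just the tie case again)
console.log(9007199254740993.0001)  // 9007199254740994: the literal incremented at the 20th significant digit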



Answer 3:

JavaScript represents numbers as 64-bit floating point values. This is defined in the standard.

http://en.wikipedia.org/wiki/Double-precision_floating-point_format

So there's nothing related to midpoint rounding going on there.

As a hint, every 32-bit integer has an exact representation in the double-precision floating-point format.
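
A quick check of that hint (a double carries a 53-bit significand, so any integer up to 2^53 in magnitude is exact):

console.log(Number.isSafeInteger(2 ** 32 - 1)) // true: the largest unsigned 32-bit integer is exact
console.log(Number.isSafeInteger(-(2 ** 31)))  // true: the smallest signed 32-bit integer is exact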


Ok, since you're asking for the exact algorithm, I checked how Chrome's V8 engine does it. V8 defines a StringToDouble function, which calls InternalStringToDouble in the following file:

https://github.com/v8/v8/blob/master/src/conversions-inl.h#L415

And this, in turn, calls the Strtod function defined there:

https://github.com/v8/v8/blob/master/src/strtod.cc