Are there any 64-bit unsigned integer values which cannot be represented with a double-precision floating-point type? (As a double is also 64-bit wide there must be some.) If so, how can I calculate all of them? (In a not brute force way, maybe?)
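Every integer from 0 to 2^53 inclusive is representable exactly. From 2^53 to 2^54 only every even integer is representable (lowest significant bit of 0), from 2^54 to 2^55 only every fourth integer, and so on, with the spacing doubling at each power of two. The largest 64-bit unsigned value that is representable exactly is 2^64-2^11.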
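The counting argument can be sketched in a few lines of Python. This is an illustrative sketch, not code from the answer: it assumes the 53-bit significand of IEEE 754 binary64 (so the first gap appears just above 2^53), and the function names `spacing` and `count_unrepresentable` are made up for the example.

```python
def spacing(n: int) -> int:
    """Gap between neighbouring exactly representable integers at the magnitude of n."""
    # Below 2**53 every integer fits; above it the gap doubles per power of two.
    return 1 << max(n.bit_length() - 53, 0)


def count_unrepresentable(bits: int = 64) -> int:
    """Count the integers in [0, 2**bits) that no double holds exactly."""
    total = 0
    for k in range(53, bits):
        # The range [2**k, 2**(k+1)) contains 2**k integers,
        # but only 2**52 of them are doubles.
        total += (1 << k) - (1 << 52)
    return total


if __name__ == "__main__":
    print(spacing(2**53 - 1), spacing(2**53), spacing(2**63))
    print(count_unrepresentable())
```

No brute force is needed: the count comes from eleven power-of-two ranges, and within each range the unrepresentable values are exactly those that are not a multiple of `spacing(n)`.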
We could generalise with a bit of code,
taking m=52 :
produces :
Example:
Assigning 0x0020000000000000 to a double gives 9007199254740992.0 (0x4340000000000000 in IEEE754)
Assigning 0x0020000000000001 to a double gives 9007199254740992.0 (same value)
Assigning 0x0020000000000002 to a double gives 9007199254740994.0 (0x4340000000000001, which is the next representable value)
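These three assignments can be reproduced with a short Python sketch. It assumes Python's `float` is an IEEE 754 binary64 double (true on mainstream platforms); `double_bits` is a helper name invented for this example.

```python
import struct


def double_bits(n: int) -> str:
    """Hex digits of the IEEE 754 binary64 encoding of float(n), big-endian."""
    return struct.pack(">d", float(n)).hex()


if __name__ == "__main__":
    for n in (0x0020000000000000, 0x0020000000000001, 0x0020000000000002):
        # Print the integer, the double it converts to, and the double's raw bits.
        print(hex(n), float(n), "0x" + double_bits(n))
```

The first two integers should print the same double and the same bit pattern, which is the loss of the lowest bit described above.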
If a 64-bit number has the following form, reading from the most significant bit (after any leading zeros):
a single "1" bit, followed by 52 A bits, followed by at least 1 B bit,
where A is any bit and at least one of the B bits is non-zero, then it cannot be represented as a double. (I am relying on the way bits are used for a double, as shown in http://en.wikipedia.org/wiki/Double-precision_floating-point_format)
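One way to read this pattern as code is to mask off everything below the top 53 bits and check whether anything is left. The sketch below is an interpretation, not the answerer's code: it treats the leading 1 bit plus the 52 A bits as the significand and the B bits as whatever lies below them. The names `lost_bits` and `is_unrepresentable` are hypothetical.

```python
def lost_bits(n: int) -> int:
    """Return the low-order bits of n that fall below the 53-bit significand."""
    extra = n.bit_length() - 53  # how many "B" positions n has
    if extra <= 0:
        return 0  # n fits entirely in the leading 1 bit plus 52 A bits
    return n & ((1 << extra) - 1)


def is_unrepresentable(n: int) -> bool:
    """True if at least one B bit is set, so conversion to double must round."""
    return lost_bits(n) != 0


if __name__ == "__main__":
    for n in (2**53, 2**53 + 1, 2**64 - 2**11, 2**64 - 1):
        print(hex(n), is_unrepresentable(n))
```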
An IEEE754 double-precision value has 53 bits of significand, so any 64-bit unsigned integer that needs more than 53 significant bits (i.e., the span from its highest 1 bit to its lowest 1 bit, inclusive, is more than 53 bits long) cannot be losslessly converted to double.
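The span rule can be checked against an actual round trip through a double. This is a minimal sketch, assuming Python's `float` is an IEEE 754 binary64 double; `bit_span` and `fits_in_double` are names chosen for the example.

```python
def bit_span(n: int) -> int:
    """Number of bit positions from the highest set bit down to the lowest set bit."""
    if n == 0:
        return 0
    lowest = (n & -n).bit_length() - 1  # index of the lowest set bit
    return n.bit_length() - lowest


def fits_in_double(n: int) -> bool:
    """True if n converts to a double without loss (span of at most 53 bits)."""
    return bit_span(n) <= 53


if __name__ == "__main__":
    for k in range(52, 64):
        for delta in (-1, 0, 1, 2):
            n = 2**k + delta
            # The span rule should agree with converting to float and back.
            assert fits_in_double(n) == (int(float(n)) == n)
    print("span rule matches the round trip on all sampled values")
```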