x86-64 long double precision

What is the actual precision of long double on Intel 64-bit platforms? Is it 80 bits padded to 128, or an actual 128 bits?

If the former, is there another option besides GMP for achieving true 128-bit precision?

Tags: c++ c floating-point precision long-double

5 answers

Reply #2 · 2020-07-12 08:00

I would recommend using MPFR. It is a more sophisticated multiple-precision floating point library that is built on top of GMP.


淡お忘

Reply #3 · 2020-07-12 08:06

There is a good chance that it's 64 bits for both (depending on the compiler and OS), because the compiler emits scalar SSE2 rather than x87 instructions.

x86 doesn't support precision higher than 80 bits. If you really need more than 64 bits for an FP algorithm, you should most likely check your numerics instead of solving the problem with brute force.


beautiful°

Reply #4 · 2020-07-12 08:07

x86-64 precision is the same as regular x86. Extended double is 80 bits, using the x87 ISA, with 6 padding bytes added. There is no 128-bit FP hardware.

A software implementation of quad or extended quad precision might benefit from the x86-64 64x64 => 128 integer multiply instruction, though.


神经病院院长

Reply #5 · 2020-07-12 08:17

I recommend the Boost wrappers over MPFR or GMP:

Boost 1.70: cpp_bin_float.

As well as arbitrary types to any desired precision, the following types are provided:

cpp_bin_float_single           (24-bit mantissa, 32 bits total)
cpp_bin_float_double           (53-bit mantissa, 64 bits total)
cpp_bin_float_double_extended  (64-bit mantissa)
cpp_bin_float_quad             (113-bit mantissa, 128 bits total)
cpp_bin_float_oct              (237-bit mantissa, 256 bits total)

Boost works almost out of the box. Once it is compiled, all that remains is to point the Visual Studio project at the Boost include and library directories.

Tested with Visual Studio 2017 + Boost v1.70.

See the instructions for compiling Boost.


再贱就再见

Reply #6 · 2020-07-12 08:21

There are a few options:

1. Use double-double to represent quad precision. For example, see http://www.codeproject.com/Articles/884606/The-double-double-type. However, the type does not conform to the IEEE standard. You can tell from its epsilon, which is larger (less precise) than the 1.926E-34 of an IEEE standard 128-bit float.
2. Use true IEEE standard 128-bit floats. The Microsoft VC++ compiler does not provide such a type. The Intel C++ compiler does provide a type, _Quad, although its implementation is not complete (no I/O operations) at this time.
3. Use a third-party library. I recently created a library called double128 that is based on Intel C++ _Quad but adds I/O operations. It works with Microsoft VC++. Visit http://www.cg-inc.com/Product/Double128 for more information.

