Using GCC 6.3, the following C++ code:
#include <cmath>
#include <iostream>
void norm(double r, double i)
{
double n = std::sqrt(r * r + i * i);
std::cout << "norm = " << n;
}
generates the following x86-64 assembly:
norm(double, double):
mulsd %xmm1, %xmm1
subq $24, %rsp
mulsd %xmm0, %xmm0
addsd %xmm1, %xmm0
pxor %xmm1, %xmm1
ucomisd %xmm0, %xmm1
sqrtsd %xmm0, %xmm2
movsd %xmm2, 8(%rsp)
jbe .L2
call sqrt
.L2:
movl std::cout, %edi
movl $7, %edx
movl $.LC1, %esi
call std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)
movsd 8(%rsp), %xmm0
movl std::cout, %edi
addq $24, %rsp
jmp std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>(double)
For the call to std::sqrt
, GCC first does it using sqrtsd
and saves the result on to the stack. If it overflows, it then calls the libc sqrt
function. But it never saves the xmm0
after that and before the second call to operator<<
, it restores the value from the stack (because xmm0
was lost with the first call to operator<<
).
With a simpler std::cout << n;
, it's even more obvious:
subq $24, %rsp
movsd %xmm1, 8(%rsp)
call sqrt
movsd 8(%rsp), %xmm1
movl std::cout, %edi
addq $24, %rsp
movapd %xmm1, %xmm0
jmp std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>(double)
Why is GCC not using the xmm0
value computed by libc sqrt
?