Using FPU return values in c++ code

2019-03-06 19:03发布

问题:

I have an x86 NASM program which seems to work perfectly. I have problems using the values returned from it. This is 32-Bit Windows using MSVC++. I expect the return value in ST0.

A minimal example demonstrating the problem with the returned values can be seen in this C++ and NASM assembly code:

#include <iostream>
extern "C" float arsinh(float);

int main()
{
    float test = arsinh(5.0);
    printf("%f\n", test);                    
    printf("%f\n", arsinh(5.0));             
    std::cout << test << std::endl;          
    std::cout << arsinh(5.0) << std::endl;   
}

Assembly code:

section .data
value: dq 1.0
section .text
global _arsinh
_arsinh:
    fld dword[esi-8]      ;loads the given value into st0
    ret

I can't figure out how to use the return value though, as I always get the wrong value no matter which data type I use. In this example the value 5 should be returned and I'd expect output like:

5.000000

5.000000

5

5

Instead I get output similar to:

-9671494178951383518019584.000000

-9671494178951383518019584.000000

-9.67149e+24

5

Only the final value appears to be correct. What is wrong with this code? Why doesn't it always return the floating point value I am expecting from my function? How can I fix this code?

回答1:

The primary issue is not that there is a failure returning a value in floating point register ST0, but in the way you attempt to load the 32-bit (single precision) float parameter from the stack. The issue is here:

fld dword[esi-8]      ;loads the given value into st0

This should read:

fld dword[esp+4]      ;loads the DWORD parameter from stack into st0

fld dword[esi-8] only works sometimes because of the way the calling function uses ESI internally. With different C compilers and optimizations enabled you may find the code fails to work altogether.

With 32-bit C/C++ code parameters are passed on the stack from right to left. When you do a CALL instruction in 32-bit code the 4 byte return address is placed on the stack. Memory address esp+0 would contain the return address and the first parameter would be at esp+4. If you had a second parameter it would be at esp+8. A good description of the Microsoft 32-bit CDECL calling convention can be found in this WikiBook entry. Of importance:

  • Function arguments are passed on the stack, in right-to-left order.
  • Function result is stored in EAX/AX/AL
  • Floating point return values will be returned in ST0
  • 8-bit and 16-bit integer arguments are promoted to 32-bit arguments.

When dealing with x87 FPU instructions it is very important that the only value on the stack when returning a FLOAT is the value in ST0. Failure to release(popping/freeing) anything else you put on the FPU stack can lead to your function failing when called multiple times. The x87 FPU stack only has 8 slots (not very many). If you don't clean off the FPU stack before the function returns, can lead to FPU stack overflows when future instructions need to load a new value on the FPU stack.

An example implementation of your function could have looked like:

use32
section .text
; _arsinh takes a single float (angle) as a parameter
;     angle is at memory location esp+4 on the stack
;     arcsinh(x) = ln(x + sqrt(x^2+1)) 
global _arsinh
_arsinh:
    fldln2           ; st(0) = ln2
    fld dword[esp+4] ; st(0) = angle, st(1)=ln2
    fld st0          ; st(0) = angle, st(1) = angle, st(2)=ln2
    fmul st0         ; st(0) = angle^2, st(1) = angle, st(2)=ln2
    fld1             ; st(0) = 1, st(1) = angle^2, st(2) = angle, st(3)=ln2
    faddp            ; st(0) = 1 + angle^2, st(1) = angle, st(2)=ln2
    fsqrt            ; st(0) = sqrt(1 + angle^2), st(1) = angle, st(2)=ln2
    faddp            ; st(0) = sqrt(1 + angle^2) + angle, st(1)=ln2
    fyl2x            ; st(0) = log2(sqrt(1 + angle^2) + angle)*ln2
                     ; st(0) = asinh(angle)
    ret