I am concerned about the following cases:
min(-0.0,0.0)
max(-0.0,0.0)
minmag(-x,x)
maxmag(-x,x)
According to Wikipedia, IEEE 754-2008 says the following regarding min and max:
The min and max operations are defined but leave some leeway for the case where the inputs are equal in value but differ in representation. In particular:
min(+0,−0) or min(−0,+0) must produce something with a value of zero but may always return the first argument.
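To make "equal in value but differ in representation" concrete: -0.0 == 0.0 compares true in C, so the two zeros can only be told apart by inspecting the sign bit, for example with signbit from <math.h>. A minimal sketch (the helper name is_negative_zero is made up for illustration, and it assumes IEEE 754 floating point without -ffast-math):

```c
#include <math.h>

/* Returns 1 only for negative zero, 0 for every other value.
   v == 0.0 is true for both zeros; signbit() separates them. */
static int is_negative_zero(double v)
{
    return v == 0.0 && signbit(v);
}
```

A result can then be checked with is_negative_zero(fmin(-0.0, 0.0)) instead of relying on how printf formats it.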
I ran some tests comparing fmin, fmax, the min and max macros defined below
#define max(a,b) \
    ({ __typeof__ (a) _a = (a); \
       __typeof__ (b) _b = (b); \
       _a > _b ? _a : _b; })
#define min(a,b) \
    ({ __typeof__ (a) _a = (a); \
       __typeof__ (b) _b = (b); \
       _a < _b ? _a : _b; })
and _mm_min_ps and _mm_max_ps, which map to the SSE minps and maxps instructions.
Here are the results (the code I used to test this is posted at the end of the question):
fmin(-0.0,0.0) = -0.0
fmax(-0.0,0.0) = 0.0
min(-0.0,0.0) = 0.0
max(-0.0,0.0) = 0.0
_mm_min_ps(-0.0,0.0) = 0.0
_mm_max_ps(-0.0,0.0) = -0.0
As you can see, the three approaches give different results. So my main question is: what do the C and C++ standard libraries say? Does fmin(-0.0,0.0) have to equal -0.0 and fmax(-0.0,0.0) have to equal 0.0, or are different implementations allowed to define it differently? If it's implementation-defined, does this mean that, to ensure the code is compatible with different implementations of the C standard library (e.g. from different compilers), checks must be done to determine how they implement min and max?
What about minmag(-x,x) and maxmag(-x,x)? These are both defined in IEEE 754-2008. Are they implementation-defined, at least in IEEE 754-2008? I infer from Wikipedia's comment on min and max that they are. But the C standard library does not define these functions as far as I know. In OpenCL these functions are defined as:
maxmag Returns x if |x| > |y|, or y if |y| > |x|, otherwise fmax(x, y).
minmag Returns x if |x| < |y|, or y if |y| < |x|, otherwise fmin(x, y).
The x86 instruction set has no minmag and maxmag instructions so I had to implement them. But in my case I need performance and creating a branch for the case when the magnitudes are equal is not efficient.
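To show the kind of branch-free version I mean, here is a rough SSE sketch. It is not necessarily the fastest formulation, the names maxmag_ps and minmag_ps are hypothetical, and it assumes the equal-magnitude case may simply fall back to whatever _mm_max_ps and _mm_min_ps return (so ±0 and NaN behave like maxps and minps, not like fmax and fmin):

```c
#include <xmmintrin.h>

/* Per lane: x if |x| > |y|, y if |y| > |x|, else _mm_max_ps(x, y). */
static inline __m128 maxmag_ps(__m128 x, __m128 y)
{
    const __m128 sign = _mm_set1_ps(-0.0f);       /* only the sign bit set */
    __m128 ax   = _mm_andnot_ps(sign, x);         /* |x| */
    __m128 ay   = _mm_andnot_ps(sign, y);         /* |y| */
    __m128 x_gt = _mm_cmpgt_ps(ax, ay);           /* mask: |x| > |y| */
    __m128 y_gt = _mm_cmpgt_ps(ay, ax);           /* mask: |y| > |x| */
    __m128 r    = _mm_max_ps(x, y);               /* fallback for ties */
    r = _mm_or_ps(_mm_and_ps(y_gt, y), _mm_andnot_ps(y_gt, r));
    r = _mm_or_ps(_mm_and_ps(x_gt, x), _mm_andnot_ps(x_gt, r));
    return r;
}

/* Per lane: x if |x| < |y|, y if |y| < |x|, else _mm_min_ps(x, y). */
static inline __m128 minmag_ps(__m128 x, __m128 y)
{
    const __m128 sign = _mm_set1_ps(-0.0f);
    __m128 ax   = _mm_andnot_ps(sign, x);
    __m128 ay   = _mm_andnot_ps(sign, y);
    __m128 x_lt = _mm_cmplt_ps(ax, ay);           /* mask: |x| < |y| */
    __m128 y_lt = _mm_cmplt_ps(ay, ax);           /* mask: |y| < |x| */
    __m128 r    = _mm_min_ps(x, y);               /* fallback for ties */
    r = _mm_or_ps(_mm_and_ps(y_lt, y), _mm_andnot_ps(y_lt, r));
    r = _mm_or_ps(_mm_and_ps(x_lt, x), _mm_andnot_ps(x_lt, r));
    return r;
}
```

The masks replace the branches with bitwise selects, so the tie case costs one extra minps or maxps rather than a jump. Whether the tie result matches fmin and fmax for signed zeros depends on the answer to this question.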
The Itanium instruction set has minmag and maxmag instructions (famin and famax), and as far as I can tell from reading, they return the second argument in this case. That's not what minps and maxps appear to be doing though. It's strange that _mm_min_ps(-0.0,0.0) = 0.0 and _mm_max_ps(-0.0,0.0) = -0.0. I would have expected them to return either the first argument in both cases or the second. Why are the minps and maxps instructions defined this way?
#include <stdio.h>
#include <x86intrin.h>
#include <math.h>

/* Type-generic max/min using GNU statement expressions (GCC/Clang only). */
#define max(a,b) \
    ({ __typeof__ (a) _a = (a); \
       __typeof__ (b) _b = (b); \
       _a > _b ? _a : _b; })
#define min(a,b) \
    ({ __typeof__ (a) _a = (a); \
       __typeof__ (b) _b = (b); \
       _a < _b ? _a : _b; })

int main(void) {
    float a[4] = {-0.0, -1.0, -2.0, -3.0};
    float b[4] = {0.0, 1.0, 2.0, 3.0};
    /* Unaligned loads/stores: plain float arrays are not guaranteed
       to be 16-byte aligned, which _mm_load_ps/_mm_store_ps require. */
    __m128 a4 = _mm_loadu_ps(a);
    __m128 b4 = _mm_loadu_ps(b);
    __m128 c4 = _mm_min_ps(a4,b4);
    __m128 d4 = _mm_max_ps(a4,b4);
    /* SSE minps / maxps */
    { float c[4]; _mm_storeu_ps(c,c4); printf("%f %f %f %f\n", c[0], c[1], c[2], c[3]); }
    { float c[4]; _mm_storeu_ps(c,d4); printf("%f %f %f %f\n", c[0], c[1], c[2], c[3]); }
    /* C library fmin / fmax */
    printf("%f %f %f %f\n", fmin(a[0],b[0]), fmin(a[1],b[1]), fmin(a[2],b[2]), fmin(a[3],b[3]));
    printf("%f %f %f %f\n", fmax(a[0],b[0]), fmax(a[1],b[1]), fmax(a[2],b[2]), fmax(a[3],b[3]));
    /* min / max macros */
    printf("%f %f %f %f\n", min(a[0],b[0]), min(a[1],b[1]), min(a[2],b[2]), min(a[3],b[3]));
    printf("%f %f %f %f\n", max(a[0],b[0]), max(a[1],b[1]), max(a[2],b[2]), max(a[3],b[3]));
}
//_mm_min_ps: 0.000000, -1.000000, -2.000000, -3.000000
//_mm_max_ps: -0.000000, 1.000000, 2.000000, 3.000000
//fmin: -0.000000, -1.000000, -2.000000, -3.000000
//fmax: 0.000000, 1.000000, 2.000000, 3.000000
//min: 0.000000, -1.000000, -2.000000, -3.000000
//max: 0.000000, 1.000000, 2.000000, 3.000000
Edit:
In regards to C++, I tested std::min(-0.0,0.0) and std::max(-0.0,0.0), and they both return -0.0. This shows that std::min is not the same as fmin and std::max is not the same as fmax.