I did some timing tests and also read some articles like this one (last comment), and it looks like in Release build, float and double values take the same amount of processing time.
How is this possible? When float is less precise and smaller compared to double values, how can the CLR get doubles into the same processing time?
It depends on 32-bit or 64-bit system. If you compile to 64-bit, double will be faster. Compiled to 32-bit on 64-bit (machine and OS) made float around 30% faster:
I had a small project where I used CUDA and I can remember that float was faster than double there, too. For once the traffic between Host and Device is lower (Host is the CPU and the "normal" RAM and Device is the GPU and the corresponding RAM there). But even if the data resides on the Device all the time it's slower. I think I read somewhere that this has changed recently or is supposed to change with the next generation, but I'm not sure.
So it seems that the GPU simply can't handle double precision natively in those cases, which would also explain why GLFloat is usually used rather than GLDouble.
(As I said it's only as far as I can remember, just stumbled upon this while searching for float vs. double on a CPU.)
On x86 processors, at least,
float
anddouble
will each be converted to a 10-byte real by the FPU for processing. The FPU doesn't have separate processing units for the different floating-point types it supports.The age-old advice that
float
is faster thandouble
applied 100 years ago when most CPUs didn't have built-in FPUs (and few people had separate FPU chips), so most floating-point manipulation was done in software. On these machines (which were powered by steam generated by the lava pits), it was faster to usefloat
s. Now the only real benefit tofloat
s is that they take up less space (which only matters if you have millions of them).There are still some cases where floats are preferred however - with OpenGL coding for example it's far more common to use the GLFloat datatype (generally mapped directly to 16 bit float) as it is more efficient on most GPUs than GLDouble.