As a follow-up to this question, it appears that some numbers cannot be represented exactly in floating point at all, and are instead approximated.
How are floating point numbers stored?
Is there a common standard for the different sizes?
What kind of gotchas do I need to watch out for if I use floating point?
Are they cross-language compatible (i.e., what conversions do I need to deal with to send a floating point number from a Python program to a C program over TCP/IP)?
-Adam
As mentioned, the Wikipedia article on IEEE 754 does a good job of showing how floating point numbers are stored on most systems.
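As a minimal sketch of that layout, assuming Python (since the question mentions it): an IEEE 754 binary64 value splits into 1 sign bit, 11 exponent bits, and 52 mantissa bits.

```python
import struct

def decompose(x):
    """Split an IEEE 754 binary64 value into sign, biased exponent, and mantissa."""
    bits = struct.unpack('>Q', struct.pack('>d', x))[0]  # reinterpret the double as a uint64
    sign     = bits >> 63
    exponent = (bits >> 52) & 0x7FF    # 11-bit exponent, bias 1023
    mantissa = bits & ((1 << 52) - 1)  # 52-bit fraction (implicit leading 1)
    return sign, exponent, mantissa

# -6.25 = -1.5625 * 2**2, so sign=1, biased exponent=1023+2=1025,
# and mantissa = 0.5625 * 2**52 = 0x9000000000000
print(decompose(-6.25))  # (1, 1025, 2533274790395904)
```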
Now, here are some common gotchas: never test floating point values for exact equality, remember that simple decimal fractions such as 0.1 have no exact binary representation, and expect rounding errors to accumulate across long chains of arithmetic.
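The equality gotcha in particular is easy to demonstrate; a quick Python sketch (the same behaviour appears in any language using IEEE 754 doubles):

```python
import math

print(0.1 + 0.2 == 0.3)  # False: none of these values is exactly representable in binary
print(0.1 + 0.2)         # 0.30000000000000004

# Compare with a tolerance instead of exact equality
print(math.isclose(0.1 + 0.2, 0.3))  # True
```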
As to the second part of your question: unless performance and efficiency are critical for your project, I suggest you transfer the floating point data as a string over TCP/IP. This lets you avoid issues such as byte alignment, and it eases debugging.
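A minimal sketch of the string approach on the Python side (the host, port, and framing here are hypothetical; the C side can parse the text with strtod()):

```python
import socket

value = 4.58
# 17 significant digits are enough to round-trip any IEEE 754 double exactly
text = '%.17g\n' % value

# Hypothetical endpoint; newline-terminated text keeps the framing trivial
sock = socket.create_connection(('example.com', 9000))
sock.sendall(text.encode('ascii'))
sock.close()
```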
If you're really worried about floating point rounding errors, most languages offer data types that don't have them. SQL Server has the Decimal and Money data types. .NET has the Decimal data type. They aren't arbitrary precision like Java's BigDecimal, but they are exact out to the number of decimal places they are defined for, so you don't have to worry about a dollar value you type in as $4.58 being saved as a floating point value of 4.579999999999997.
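In Python, the equivalent is the decimal module. A quick sketch of the difference:

```python
from decimal import Decimal

# Constructing from a string keeps the value exact;
# constructing from a float captures the binary approximation instead.
price = Decimal('4.58')
print(price + Decimal('0.01'))  # 4.59, exactly
print(Decimal(4.58))            # 4.580000000000000071054273576... (the underlying double)
```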
Basically, what you need to worry about with floating point numbers is that they carry a limited number of digits of precision. This can cause problems when testing for equality, or when your program needs more digits of precision than the data type gives you.
In C++, a good rule of thumb is that a float gives you about 7 significant decimal digits, while a double gives you about 15. Also, if you're interested in how to test for equality, have a look at this question thread.
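You can see the double's limit directly, since Python floats are C doubles; a tiny sketch:

```python
# A double has a 53-bit mantissa, roughly 15-17 significant decimal digits.
print(1.0 + 1e-16 == 1.0)  # True: 1e-16 is below the precision limit at this magnitude
print(1.0 + 1e-15 == 1.0)  # False: 1e-15 survives
print(1.0 + 1e-15)         # 1.000000000000001
```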
The standard is IEEE 754.
Of course, there are other means to store numbers when IEEE 754 isn't good enough. Libraries like Java's BigDecimal are available for most platforms and map well to SQL's numeric type. Symbols can be used for irrational numbers, and ratios that can't be accurately represented in binary or decimal floating point can be stored as a ratio.
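Python ships the same idea as fractions.Fraction, which stores an exact numerator/denominator pair; a small sketch:

```python
from fractions import Fraction

# 1/3 has no finite binary or decimal expansion, but a ratio holds it exactly
third = Fraction(1, 3)
print(third + third + third == 1)              # True
print(0.1 + 0.1 + 0.1 == 0.3)                  # False with binary floats
print(Fraction(1, 10) * 3 == Fraction(3, 10))  # True with ratios
```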
As the other posters have already mentioned, it's almost exclusively IEEE 754 and its successor, IEEE 754r. Googling it turns up a thousand explanations, together with the bit patterns and what they mean. If you still have trouble understanding it: two other FP formats are still in common use, IBM (hexadecimal floating point) and DEC VAX. Some esoteric machines and compilers (BlitzBasic, TurboPascal) have odd formats of their own.
Practically none; they are cross-language compatible.
Some very rarely occurring quirks:
IEEE 754 defines sNaNs (signalling NaNs) and qNaNs (quiet NaNs). The former cause a trap, forcing the processor to call a handler routine when they are loaded; the latter don't. Because language designers hated the possibility of sNaNs interrupting their workflow, and supporting them would force support for handler routines, sNaNs are almost always silently converted into qNaNs. So don't rely on a 1:1 raw conversion. But again: this is very rare, and only matters when NaNs are present.
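Here is a sketch of what that quieting looks like at the bit level, in Python; whether the sNaN actually survives the round trip depends on the CPU and the interpreter:

```python
import struct

# In IEEE 754 binary64, a NaN has an all-ones exponent and a nonzero mantissa;
# the top mantissa bit set means quiet, clear means signalling.
QNAN_BITS = 0x7FF8000000000001
SNAN_BITS = 0x7FF0000000000001

def roundtrip(bits):
    # Reinterpret the bit pattern as a double and recover its bits;
    # a silent sNaN -> qNaN conversion would show up as bit 51 flipping on.
    value = struct.unpack('>d', struct.pack('>Q', bits))[0]
    return struct.unpack('>Q', struct.pack('>d', value))[0]

print(hex(roundtrip(QNAN_BITS)))  # 0x7ff8000000000001
print(hex(roundtrip(SNAN_BITS)))  # may come back as 0x7ff8000000000001 on some platforms
```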
You can also run into endianness problems (the bytes arrive in the wrong order) when files are shared between different computers. It's usually easy to detect, because you get NaNs or nonsense values where numbers should be.
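If you do ship raw binary floats, pin the byte order down explicitly; a sketch with Python's struct module ('>' is big-endian/network order, '<' is little-endian):

```python
import struct

value = 4.58
wire = struct.pack('>d', value)  # always write an explicit byte order

print(struct.unpack('>d', wire)[0])  # 4.58
print(struct.unpack('<d', wire)[0])  # byte-swapped nonsense if read with the wrong order
```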