When you write a number of doubles to a file, in which format are they stored? Is it in byte format or string format?
For example, given 0.00083231, is it stored as 10 bytes, one per character? Or is it stored as only 8 bytes, since the size of a double is 8 bytes?
Assume that the language used is C++.
If you choose to write text, e.g. with formatted output like file << x, you get text.
If you choose to write bytes, e.g. with unformatted output like file.write(reinterpret_cast<const char*>(&x), sizeof x), you get bytes.
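To make the difference concrete, here is a minimal sketch that sends the same double through both paths. It writes into a std::ostringstream instead of a real file so the result can be inspected directly; the helper names write_as_text and write_as_bytes are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Formatted output: operator<< converts the double to decimal text
// using the stream's current precision and flags.
std::string write_as_text(double x) {
    std::ostringstream out;
    out << x;
    assert(out.good());
    return out.str();
}

// Unformatted output: write() copies the raw object representation.
// With a real std::ofstream you would also open it with std::ios::binary.
std::string write_as_bytes(double x) {
    std::ostringstream out;
    out.write(reinterpret_cast<const char*>(&x), sizeof x);
    assert(out.good());
    return out.str();
}
```

With default stream settings, write_as_text(0.00083231) should give the 10-character string 0.00083231, while write_as_bytes(0.00083231) gives sizeof(double) bytes that are not readable as text.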
It depends on how you print the value.
If you write the number as a binary value, it takes sizeof(double) bytes in the file (usually 8, but that is not guaranteed), and you can't read the value with a normal text viewer or editor. You need a binary/hex editor to inspect it.
If you write the number with a text output function, the result depends on how you format it. With the std::printf family and the %f format, the value is printed with 6 digits after the decimal point, so 0.00083231 becomes 0.000832 and takes 8 bytes; values with more digits before the decimal point take more. cout with default settings uses 6 significant digits instead and prints 0.00083231, which takes 10 bytes. If you use a different width or precision (for example printf("%9.10f\n", 0.00083231)), then of course the number of bytes printed will be different. Other conversions also produce different output. For example, %e prints the value in scientific notation, which is 8.323100e-04 in your case and takes at least 12 bytes. %a prints the value in hexadecimal floating-point form, which is usually even longer, except for values whose binary representation is short, such as 0.5.
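If you want to check the sizes for different conversions yourself, a small sketch like the following can help. The helper format_double is hypothetical (not a standard function); it only wraps std::snprintf, whose return value is the number of characters the conversion produces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats x with a printf-style format that expects exactly one double.
// Hypothetical helper for illustration; the caller must pass a valid format.
std::string format_double(const char* fmt, double x) {
    char buf[128];
    int n = std::snprintf(buf, sizeof buf, fmt, x);
    // snprintf returns the length the full output needs (excluding '\0').
    assert(n >= 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf, static_cast<std::size_t>(n));
}
```

On a typical implementation, format_double("%f", 0.00083231) yields 0.000832 (8 characters) and format_double("%e", 0.00083231) yields 8.323100e-04 (12 characters). The exact %a output is implementation-dependent, so it is best inspected on your own platform.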
It depends on which functions you use to write the numbers.
For example:
If you use fprintf or printf, the number will be written out in textual form. In your example, with the format "%lf", it will be written as 0.000832 and will take 8 bytes. You can change the format to change the number of bytes used to write out the number. The resulting output will be in human-readable form. cout << number; also writes text, but with default settings it uses 6 significant digits, so your example is written as 0.00083231 and takes 10 bytes.
If you use fwrite, the number will be written in binary form. The number of bytes necessary to store the number will always be sizeof(double), regardless of the value of the number. The resulting output will not be human readable. Same thing if you use ostream::write.
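As a rough illustration of the binary case, the sketch below writes one double with fwrite into a temporary file, reports how many bytes ended up in the file, and reads the value back with fread. The function name binary_round_trip is invented for this example, and the read-back is only meaningful on the same kind of machine that wrote the data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Writes x in binary form to a temporary file, reads it back into
// *read_back, and returns the number of bytes written (or -1 on error).
long binary_round_trip(double x, double* read_back) {
    assert(read_back != nullptr);
    std::FILE* f = std::tmpfile();  // opened in binary update mode
    if (f == nullptr) return -1;

    long size = -1;
    if (std::fwrite(&x, sizeof x, 1, f) == 1) {
        size = std::ftell(f);  // file position == bytes written so far
        std::rewind(f);
        if (std::fread(read_back, sizeof *read_back, 1, f) != 1) size = -1;
    }
    std::fclose(f);
    return size;
}
```

The returned size should equal sizeof(double) for any value of x, and the value read back should compare equal to the one written, with no rounding involved.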
It depends on how you write them. You could use std::ostream and its (overloaded) operator <<; then they are stored in textual form. You could use binary IO, e.g. std::ostream::write or fwrite; then they are stored in native machine binary form.
You should probably read more about serialization, and consider using textual formats like JSON (e.g. with jsoncpp). You might also be interested in binary serialization, e.g. libs11n or XDR.
Notice that data is often more important than code, and that disk IO or network IO is a lot slower than the CPU (e.g. many thousands of times at least). So spending CPU time to make the data easier to store is often worthwhile. Also, the same data could be written on one machine and read on a very different one.
Read also about persistence, databases, application checkpointing, and endianness.
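To illustrate the endianness point, here is one possible sketch of a byte-order-independent encoding: the double's bit pattern is copied into a 64-bit integer and emitted least significant byte first. It assumes an IEEE 754 64-bit double whose byte order matches that of 64-bit integers on the machine, which holds on mainstream platforms but is an assumption, and the names encode_le, decode_le and check_round_trip are made up for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

// Encodes x as 8 bytes, least significant byte first, on any host.
std::array<unsigned char, 8> encode_le(double x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined way to get the bits
    std::array<unsigned char, 8> out{};
    for (std::size_t i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>(bits >> (8 * i));
    return out;
}

// Rebuilds the double from 8 little-endian bytes.
double decode_le(const std::array<unsigned char, 8>& in) {
    std::uint64_t bits = 0;
    for (std::size_t i = 0; i < 8; ++i)
        bits |= static_cast<std::uint64_t>(in[i]) << (8 * i);
    double x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// Self-check for non-NaN values: decoding an encoding gives x back.
void check_round_trip(double x) {
    assert(decode_le(encode_le(x)) == x);
}
```

The eight bytes from encode_le can be passed to ostream::write or fwrite, and a reader on a machine with a different byte order can recover the same value with decode_le. Established formats such as XDR solve the same problem in a standardized way.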