I have a dictionary that I want to write to a csv file, but the floats in the dictionary are rounded off when I write them to the file. I want to keep the maximum precision.
Where does the rounding occur and how can I prevent it?
What I did
I followed the DictWriter example here and I'm running Python 2.6.1 on Mac (10.6 - Snow Leopard).
# my import statements
import sys
import csv
Here is what my dictionary (d) contains:
>>> d = runtime.__dict__
>>> d
{'time_final': 1323494016.8556759,
'time_init': 1323493818.0042379,
'time_lapsed': 198.85143804550171}
The values are indeed floats:
>>> type(runtime.time_init)
<type 'float'>
Then I set up my writer and write the header and values:
f = open(log_filename,'w')
fieldnames = ('time_init', 'time_final', 'time_lapsed')
myWriter = csv.DictWriter(f, fieldnames=fieldnames)
headers = dict( (n,n) for n in fieldnames )
myWriter.writerow(headers)
myWriter.writerow(d)
f.close()
But when I look in the output file, I get rounded numbers (i.e., floats):
time_init,time_final,time_lapsed
1323493818.0,1323494016.86,198.851438046
< EOF >
This works but it is probably not the best/most efficient way:
It's a known bug^H^H^Hfeature. According to the docs:
"""... the value None is written as the empty string. [snip] All other non-string data are stringified with str() before being written."""
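To see where the digits go, here is a minimal sketch that imitates what str() does to a float on Python 2 (12 significant digits, with ".0" appended when the result would otherwise look like an integer). The helper name py2_float_str is made up for illustration, and the "%.12g" emulation is an assumption about Python 2 behaviour rather than code taken from the csv module. It is written this way because on Python 3, str() and repr() of a float already agree, so the rounding cannot be reproduced there directly.

```python
def py2_float_str(value):
    """Imitate Python 2's str(float) (hypothetical helper, emulation only)."""
    text = "%.12g" % value  # 12 significant digits, trailing zeros dropped
    # Python 2 appends ".0" when the text would otherwise look like an integer;
    # the "n" check leaves "inf" and "nan" alone.
    if "." not in text and "e" not in text and "n" not in text:
        text += ".0"
    return text


# The three values from the question's dictionary.
values = [1323493818.0042379, 1323494016.8556759, 198.85143804550171]

for v in values:
    s = py2_float_str(v)
    print("%-15s str round-trips: %-5s repr round-trips: %s"
          % (s, float(s) == v, float(repr(v)) == v))
```

The emulated strings are the same ones that show up in the question's output file, and none of them converts back to the original float, whereas the repr() of each value does.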
Don't rely on the default conversions. Use repr() for floats. unicode objects need special handling; see the manual. Check whether the consumer of the file will accept the default format of datetime.x objects for x in (datetime, date, time, timedelta).
Update:
For float objects, "%f" % value is not a good substitute for repr(value). The criterion is whether the consumer of the file can reproduce the original float object. repr(value) guarantees this. "%f" % value doesn't.
Notice that in the above, it appears by inspection of the strings produced that none of the %f cases worked. Before 2.7, Python's repr always used 17 significant decimal digits. In 2.7, this was changed to using the minimum number of digits that still guaranteed float(repr(v)) == v. The difference is not a rounding error.
Note the improved repr() results in the first column above.
Update 2 in response to comment """And thanks for the info on Python 2.7. Unfortunately, I'm limited to 2.6.2 (running on the destination machine which can't be upgraded). But I'll keep this in mind for future scripts. """
It doesn't matter. float('0.3333333333333333') == float('0.33333333333333331') produces True on all versions of Python. This means that you could write your file on 2.7 and it would read the same on 2.6, or vice versa. There is no change in the accuracy of what repr(a_float_object) produces.
It looks like csv is using float.__str__ rather than float.__repr__:
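One way to check this on whatever interpreter you have is to write a single float through csv.writer into an in-memory buffer and compare the text with str() and repr(). This is only a sketch: csv_text_for is a made-up helper name, and the result depends on the Python version. On the 2.6 interpreter from the question the written text should match str(v); on interpreters where str() and repr() of a float agree, all three strings come out the same.

```python
import csv

try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3


def csv_text_for(value):
    """Return the text csv.writer emits for a one-field row (hypothetical helper)."""
    buf = StringIO()
    csv.writer(buf).writerow([value])
    return buf.getvalue().rstrip("\r\n")  # drop the line terminator


v = 198.85143804550171
written = csv_text_for(v)

print("csv wrote: %s" % written)
print("str(v):    %s" % str(v))
print("repr(v):   %s" % repr(v))
```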
Looking at the csv source, this appears to be a hardwired behavior. A workaround is to cast all of the float values to their repr before csv gets to it. Use something like: d = dict((k, repr(v)) for k, v in d.items()).
Here's a worked-out example:
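A minimal sketch of that workaround, using the dictionary and field names from the question. The in-memory StringIO buffer is a stand-in for the real log file, and the read-back at the end is added only to show that the floats survive the trip; neither is part of the original code.

```python
import csv

try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3

# Values copied from the question.
d = {
    'time_final': 1323494016.8556759,
    'time_init': 1323493818.0042379,
    'time_lapsed': 198.85143804550171,
}
fieldnames = ('time_init', 'time_final', 'time_lapsed')

# Convert every value to its repr() before csv gets to it.
row = dict((k, repr(v)) for k, v in d.items())

buf = StringIO()  # stand-in for open(log_filename, 'w')
writer = csv.DictWriter(buf, fieldnames=fieldnames)
writer.writerow(dict((n, n) for n in fieldnames))  # header row
writer.writerow(row)

# Read the data line back and confirm each float is recovered exactly.
header, values = buf.getvalue().splitlines()
recovered = [float(s) for s in values.split(',')]
print(header)
print(values)
print(recovered == [d[n] for n in fieldnames])
```

The exact digits printed depend on the Python version (17 significant digits before 2.7, the shortest round-tripping form from 2.7 on), but the final comparison should hold either way.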
This code produces the following output:
A more refined approach will take care to only make replacements for floats:
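A sketch of what that could look like, assuming the row is a plain dict. repr_floats is a made-up name, and the sample row with its non-float fields is invented for illustration.

```python
def repr_floats(row):
    """Return a copy of row with float values replaced by their repr();
    every other value is passed through unchanged (hypothetical helper)."""
    return dict(
        (k, repr(v) if isinstance(v, float) else v)
        for k, v in row.items()
    )


# Invented sample row mixing floats with other types.
row = {'time_init': 1323493818.0042379, 'label': 'run 1', 'count': 3, 'note': None}
converted = repr_floats(row)

print(converted)
```

Strings, ints and None are left for csv to handle in its usual way, so None is still written as the empty string and strings do not pick up an extra layer of quotes from repr().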
Note, I've just fixed this issue for Py2.7.3, so it shouldn't be a problem in the future. See http://hg.python.org/cpython/rev/bf7329190ca6