I need to get the full nanosecond-precision modified timestamp for each file in a Python 2 program that walks the filesystem tree. I want to do this in Python itself, because spawning a new subprocess for every file will be slow.
From the C library on Linux, you can get nanosecond-precision timestamps by looking at the st_mtim.tv_nsec field of a stat result. For example:
#include <sys/stat.h>
#include <stdio.h>

int main(void) {
    struct stat stat_result;
    if (!lstat("/", &stat_result)) {
        /* Zero-pad the nanoseconds to nine digits so the fraction is correct. */
        printf("mtime = %ld.%09ld\n", (long)stat_result.st_mtim.tv_sec, stat_result.st_mtim.tv_nsec);
    } else {
        printf("error\n");
        return 1;
    }
    return 0;
}
prints mtime = 1380667414.213703287 (/ is on an ext4 filesystem, which supports nanosecond timestamps, and the clock is UTC).
Similarly, date --rfc-3339=ns --reference=/ prints 2013-10-01 22:43:34.213703287+00:00.
Python (2.7.3)'s os.path.getmtime(filename) and os.lstat(filename).st_mtime give the mtime as a float. However, the fractional part is not accurate to the nanosecond:
In [1]: import os
In [2]: os.path.getmtime('/') % 1
Out[2]: 0.21370339393615723
In [3]: os.lstat('/').st_mtime % 1
Out[3]: 0.21370339393615723
Only the first 6 digits after the decimal point are correct, presumably due to floating-point rounding.
Alternatively, you could use the cffi library, which works with Python 2, with the following code (tested on Linux):
from __future__ import print_function
from cffi import FFI
ffi = FFI()
ffi.cdef("""
typedef long long time_t;
typedef struct timespec {
time_t tv_sec;
long tv_nsec;
...;
};
typedef struct stat {
struct timespec st_mtim;
...;
};
int lstat(const char *path, struct stat *buf);
""")
C = ffi.verify()
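result = ffi.new("struct stat *")
# lstat() returns 0 on success and -1 on failure (with errno set).
if C.lstat("foo.txt", result) != 0:
    raise OSError(ffi.errno, "lstat failed")
print("mtime = {0:d}.{1:09d}".format(result.st_mtim.tv_sec, result.st_mtim.tv_nsec))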
This behaves the same as the C program in your question and produces the output:
$ ./test.py
mtime = 1381711568.315075616
This has the same precision as your C program:
$ gcc test.c -o test
$ ./test
mtime = 1381711568.315075616
os.stat('/').st_mtime is a float object, and the precision of a float is too low for a timestamp with nanosecond resolution:
Python’s underlying type for floats is an IEEE 754 double, which is
only good for about 16 decimal digits. With ten digits before the
decimal point, that leaves six for sub-second resolutions, which is
three short of the range required to preserve POSIX
nanosecond-resolution timestamps.
(via: This Week in Python Stupidity: os.stat, os.utime and Sub-Second Timestamps)
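To make the arithmetic in that quote concrete, here is a small sketch of my own (it is not from the linked post). It takes the timestamp from the question as an exact (seconds, nanoseconds) pair, converts it to a float the way st_mtime has to, and measures how far apart neighbouring doubles are at that magnitude. The variable names are only illustrative.

```python
from __future__ import print_function
import math
from decimal import Decimal

# Timestamp from the question, kept exact as (seconds, nanoseconds).
SEC, NSEC = 1380667414, 213703287

exact = Decimal(SEC) + Decimal(NSEC) / Decimal(10 ** 9)
as_float = float(exact)  # the nearest double, which is what st_mtime can hold

# A double has a 53-bit significand, so near this value adjacent doubles
# are 2**(exponent - 53) apart, where exponent comes from frexp().
_, exponent = math.frexp(as_float)
spacing = 2.0 ** (exponent - 53)

# Rounding error introduced by the conversion, in seconds.
error = abs(Decimal(as_float) - exact)

print("exact   =", exact)
print("float   = {0:.9f}".format(as_float))
print("spacing = {0:.3e} s".format(spacing))
print("error   = {0:.3e} s".format(float(error)))
```

For timestamps of this size the spacing works out to 2**-22 seconds, roughly 240 nanoseconds, so anything finer than about the sixth decimal place cannot survive the conversion.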
If you can use Python 3.3 or later, there's a new attribute called st_mtime_ns, which is st_mtime as an integer number of nanoseconds. Try it:
>>> os.stat('.').st_mtime
1381571932.044594
>>> os.stat('.').st_mtime_ns
1381571932044593972
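If Python 3.3 or later is an option, the tree walk from the question could be sketched roughly like this. The helper names format_ns and walk_mtimes are made up for this example, and skipping entries that fail to stat is just one possible policy.

```python
import os

NS_PER_SEC = 10 ** 9

def format_ns(ns):
    """Format an integer nanosecond timestamp as 'seconds.nanoseconds'."""
    sec, nsec = divmod(ns, NS_PER_SEC)
    return "{0:d}.{1:09d}".format(sec, nsec)

def walk_mtimes(root):
    """Yield (path, mtime_ns) for every directory and file below root."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat() does not follow symlinks, matching the C example.
                st = os.lstat(path)
            except OSError:
                # The entry vanished or is unreadable; skip it.
                continue
            yield path, st.st_mtime_ns
```

Something like `for path, ns in walk_mtimes('.'): print(path, format_ns(ns))` then prints one line per entry. Because st_mtime_ns is a plain int, no precision is lost and no subprocess is needed.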
References:
PEP 410 -- Use decimal.Decimal type for timestamps
os.stat(): add new fields to get timestamps as Decimal objects with nanosecond resolution
add st_*time_ns fields to os.stat(), add ns keyword to os.utime(), os.utimens() expects a number of nanoseconds
I was going to say the same as glasslion: Python converts st_mtime to floating point, which loses significant digits.
One alternative is to use the ctypes module, or Cython, to access the C library directly, which should give you the nanosecond field as a plain C long. (I can't give you an example because you didn't give any information about your operating system.)