I'm writing a Python backup script and I need to find the oldest file in a directory (and its sub-directories). I also need to filter it down to *.avi files only.
The script will always be running on a Linux machine. Is there some way to do it in Python or would running some shell commands be better?
At the moment I'm running `df` to get the free space on a particular partition, and if there is less than 5 gigabytes free, I want to start deleting the oldest *.avi files until that condition is met.
The os module provides the functions that you need to get directory listings and file info in Python. I've found os.walk to be especially useful for walking directories recursively, and os.stat will give you detailed info (including modification time) on each entry.
You may be able to do this more easily with a simple shell command. Whether that works better for you depends on what you want to do with the results.
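A minimal sketch of the `os.walk` plus `os.stat` approach (the function name is my own, not from the answer):

```python
import os

def list_avi_files(root):
    """Walk root recursively and yield (mtime, path) for every .avi file."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".avi"):
                path = os.path.join(dirpath, name)
                # os.stat gives detailed info; st_mtime is the modification time
                yield os.stat(path).st_mtime, path
```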
Hm. Nadia's answer is closer to what you meant to ask; however, for finding the (single) oldest file in a tree, try this:
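A sketch of that idea, using `min()` over an `os.walk` generator (note it raises `ValueError` if no file matches):

```python
import os

def oldest_file_in_tree(rootfolder, extension=".avi"):
    """Return the path of the least recently modified matching file."""
    return min(
        (os.path.join(dirname, filename)
         for dirname, _, filenames in os.walk(rootfolder)
         for filename in filenames
         if filename.endswith(extension)),
        key=lambda fn: os.stat(fn).st_mtime)
```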
With a little modification, you can get the n oldest files (similar to Nadia's answer). Note that using the `.endswith` method allows passing a tuple of extensions, to select more than one extension.
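A sketch of that modification, using `heapq.nsmallest` to pick the n oldest entries:

```python
import heapq
import os

def oldest_files_in_tree(rootfolder, count=1, extensions=(".avi",)):
    """Return the count least recently modified matching files, oldest first."""
    return heapq.nsmallest(
        count,
        (os.path.join(dirname, filename)
         for dirname, _, filenames in os.walk(rootfolder)
         for filename in filenames
         # str.endswith accepts a tuple, so several extensions can match
         if filename.endswith(extensions)),
        key=lambda fn: os.stat(fn).st_mtime)
```

For example, a call like `oldest_files_in_tree("/videos", 20, (".avi", ".mpeg"))` (the path is a placeholder) selects more than one extension.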
Finally, should you want the complete list of files, ordered by modification time, in order to delete as many as required to free space, here's some code:
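A sketch of such a list builder, sorted with `reverse=True` so the oldest file ends up last:

```python
import os

def files_to_delete(rootfolder, extension=".avi"):
    """All matching files sorted newest first, so .pop() yields the oldest."""
    return sorted(
        (os.path.join(dirname, filename)
         for dirname, _, filenames in os.walk(rootfolder)
         for filename in filenames
         if filename.endswith(extension)),
        key=lambda fn: os.stat(fn).st_mtime,
        reverse=True)
```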
Note that `reverse=True` puts the oldest files at the end of the list, so for the next file to delete, you just do a `file_list.pop()`.

By the way, for a complete solution to your issue, since you are running on Linux, where `os.statvfs` is available, you can use it: `statvfs.f_bfree` is the number of free blocks on the device and `statvfs.f_bsize` is the block size. We take the statvfs of the `rootfolder`, so mind any symbolic links pointing to other devices, where we could delete many files without actually freeing up space on this device.

UPDATE (copying a comment by Juan):
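A self-contained sketch tying these pieces together (the function name and arguments are illustrative; per Juan's comment below, you may prefer `f_frsize` over `f_bsize` on some systems):

```python
import os

def free_space_up_to(free_bytes_required, rootfolder, extension=".avi"):
    """Delete the oldest matching files under rootfolder until at least
    free_bytes_required bytes are free on rootfolder's filesystem."""
    # Newest first, so the oldest file is popped off the end.
    file_list = sorted(
        (os.path.join(dirname, filename)
         for dirname, _, filenames in os.walk(rootfolder)
         for filename in filenames
         if filename.endswith(extension)),
        key=lambda fn: os.stat(fn).st_mtime,
        reverse=True)
    while file_list:
        statv = os.statvfs(rootfolder)
        if statv.f_bfree * statv.f_bsize >= free_bytes_required:
            break
        os.remove(file_list.pop())
```

A call such as `free_space_up_to(5 * 2**30, "/videos")` (the path is a placeholder) would then delete the oldest .avi files until at least 5 GiB are free.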
Depending on the OS and filesystem implementation, you may want to multiply `f_bfree` by `f_frsize` rather than `f_bsize`. In some implementations, the latter is the preferred I/O request size. For example, on a FreeBSD 9 system I just tested, `f_frsize` was 4096 and `f_bsize` was 16384. POSIX says the block count fields are "in units of f_frsize" (see http://pubs.opengroup.org/onlinepubs/9699919799//basedefs/sys_statvfs.h.html).
I think the easiest way to do this would be to use `find` along with `ls -t` (sort files by time).

Something along these lines should do the trick (it deletes the oldest avi file under the specified directory):
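A sketch of the pipeline, wrapped in a function so the directory isn't hard-coded (the function name is mine; the steps below walk through the pieces):

```shell
delete_oldest_avi() {
    # $1: directory to search; removes the single oldest .avi beneath it.
    # Caveats: breaks on filenames containing whitespace, and `ls -t`
    # only sorts within each batch that xargs hands it.
    find "$1" -name "*.avi" | xargs ls -t | tail -n 1 | xargs rm
}
```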
Step by step:

- `find / -name "*.avi"`: find all avi files recursively, starting at the root directory
- `xargs ls -t`: sort all files found by modification time, from newest to oldest
- `tail -n 1`: grab the last file in the list (the oldest)
- `xargs rm`: and remove it
Here's another Python formulation, which is a bit old-school compared to some of the others, but is easy to modify and handles the case of no matching files without raising an exception.
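A sketch of what such a formulation might look like (the function name is illustrative); it returns `None` instead of raising when nothing matches:

```python
import os

def find_oldest_avi(root):
    """Walk root with an explicit loop, tracking the oldest .avi seen so far."""
    oldest_path, oldest_time = None, None
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".avi"):
                path = os.path.join(dirpath, name)
                mtime = os.stat(path).st_mtime
                if oldest_time is None or mtime < oldest_time:
                    oldest_path, oldest_time = path, mtime
    return oldest_path  # None if no .avi files were found
```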
Check out the Linux command `find`.

Alternatively, this post pipes together `ls` and `tail` to delete the oldest file in a directory. That could be done in a loop while there isn't enough free space.
For reference, here's the shell code that does it (follow the link for more alternatives and a discussion):
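A sketch along the lines of that post, wrapped in a function (the name is mine); it is fragile with unusual filenames and is not recursive:

```shell
delete_oldest_in_dir() {
    # $1: directory; removes its least recently modified entry.
    # `ls -t` lists newest first, so `tail -n 1` picks the oldest.
    rm -- "$1/$(ls -t "$1" | tail -n 1)"
}
```

You could call this in a loop while `df` reports less than 5 GB free, per the question.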
You can use the `stat` and `fnmatch` modules together to find the files. `ST_MTIME` refers to the last modification time; you can choose another value if you want.

Then you can order the list by time and delete according to it.
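A sketch combining the two modules (the function name is mine):

```python
import fnmatch
import os
from stat import ST_MTIME

def matching_files_by_age(root, pattern="*.avi"):
    """Collect (mtime, path) for files matching pattern, sorted oldest first."""
    entries = []
    for dirpath, _, filenames in os.walk(root):
        # fnmatch.filter applies the shell-style pattern to the names
        for name in fnmatch.filter(filenames, pattern):
            path = os.path.join(dirpath, name)
            entries.append((os.stat(path)[ST_MTIME], path))
    entries.sort()  # ascending mtime: oldest first
    return entries
```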