I have a poller that runs on a certain directory every 35 seconds. Files are placed in this directory through an SFTP server. The problem occurs whenever a poll coincides with a file being copied: the poller also picks up the incomplete file, which has not yet been copied completely.
Is there a way to know whether a file is still being copied or has been copied completely?
Have the poller note file sizes. If the size did not change from one round to the next, the file is done downloading.
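A minimal sketch of this idea in Python, assuming the poller calls a function once per round and keeps a dictionary of the sizes it saw last time. The names `poll_once` and `last_sizes` are made up for illustration and are not part of any existing poller API.

```python
import os

def poll_once(directory, last_sizes):
    """Return the names of files whose size is unchanged since the previous round.

    last_sizes maps file name -> size seen in the previous round.
    It is updated in place so it can be passed to the next call.
    """
    ready = []
    current = {}
    for entry in os.scandir(directory):
        if not entry.is_file():
            continue
        size = entry.stat().st_size
        current[entry.name] = size
        # Same size as in the previous round: assume the copy has finished.
        if last_sizes.get(entry.name) == size:
            ready.append(entry.name)
    # Remember this round's sizes for the next call.
    last_sizes.clear()
    last_sizes.update(current)
    return sorted(ready)
```

A file is reported one round after it stops growing, so with a 35-second poll it is picked up at most about 35 seconds late. Files that stay in the directory are reported again on every later round, so the caller would need to move or delete them after processing. A stalled transfer also looks "done" to this check.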
Can you influence the SFTP server? Can it create a marker file once the download is complete (e.g. '.thisIsAFile.doc.done')?
There are several common strategies for file watchers to "know" a file is completely transferred:
1. Poll at a fixed interval and treat the file as completely transferred once its size stops changing. E.g. check for the file's existence every minute. Once the file exists, check its size every 5 seconds. If the size stays constant for 30 seconds, treat the file as completely transferred.
2. Have the transfer process create a tag file after the transfer. E.g. after it has finished transferring FOO.txt, it creates an empty FOO.txt.tag. Your file watcher checks for the existence of FOO.txt.tag, and once it exists, you know FOO.txt has been completely transferred.
3. In some special cases where the file has a special format (e.g. a special footer line), your file watcher can poll the file, read its last lines, and check whether they match the expected pattern.
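The checks described above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the function names, the `.tag` suffix default, the `EOF` footer used in the usage note, and the 4096-byte tail window are all hypothetical, and the 5-second and 30-second values are taken from the example above.

```python
import os
import time

def wait_until_stable(path, check_interval=5.0, stable_for=30.0, sleep=time.sleep):
    """Block until the size of path has not changed for stable_for seconds.

    Stability is counted in whole check intervals. There is no timeout,
    so a file that never stops growing keeps this loop running.
    """
    last_size = os.path.getsize(path)
    stable = 0.0
    while stable < stable_for:
        sleep(check_interval)
        size = os.path.getsize(path)
        if size == last_size:
            stable += check_interval
        else:
            # Still growing: remember the new size and restart the count.
            last_size = size
            stable = 0.0
    return last_size

def has_tag_file(path, suffix=".tag"):
    """True if the sender has created the tag file next to path."""
    return os.path.exists(path + suffix)

def ends_with_footer(path, footer, tail_bytes=4096):
    """True if the last line of the file equals footer (a bytes value)."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        # Read only the end of the file; the footer is assumed to fit in it.
        f.seek(max(0, size - tail_bytes))
        tail = f.read()
    lines = tail.splitlines()
    return bool(lines) and lines[-1].strip() == footer
```

For example, a watcher could process FOO.txt only once `has_tag_file("FOO.txt")` returns true, or once `ends_with_footer("FOO.txt", b"EOF")` returns true if the sender is known to finish every file with such a line.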
Each method has its pros and cons:
Choose the one that suits your needs.