Writing files in the background process

Posted 2019-07-19 06:21

Question:

I am trying to create a script that will process modified/new files in a directory (which is being mirrored from a remote directory by lftp but that's another story).

To keep track of the files that are modified I use fswatch. I then convert the files detected by fswatch from xml to json and store them in a separate directory. To make sure that I can stop this conversion once there are no more files to process (when the mirroring job is over), I watch for a marker file that the mirroring process creates upon completion.

My script works, BUT for a strange reason I do not see the json files until the mirroring job is completed. It's as if the converted files are stored somewhere in memory and as soon as the 'stopping' condition is true those files magically appear in the directory.

Is this normal behaviour? How can I make the files appear as soon as they are processed? In what ways can I optimize what I am trying to achieve? (I'm a newbie in bash... and programming in general.)

Here's the script that I use:

my_convert_xml_to_json_function () {
    if [ -f "$1" ]; then
        temporary_file_name_for_json=$(echo "${1/$path_to_xml_files\/}" | base64)
        xml2json < "$1" | jq -rc '.amf' > "${path_to_json_files}/${temporary_file_name_for_json}.txt"
    fi
}
export -f my_convert_xml_to_json_function
export path_to_xml_files
export path_to_json_files

# repeat watching for files until the mirroring is over
fswatch -0 --event Updated --event Created "${path_to_xml_files}" | grep -ai 'xml$' | xargs -0 -n 1 -I {} bash -c 'my_convert_xml_to_json_function "{}"' & 

temporary_pid_of_fswatch=$(jobs -p)
echo "This is PID of the last bit in the pipeline: $!; this is PID of the fswatch: ${temporary_pid_of_fswatch}"


# poll until the mirror creates its completion marker file;
# then stop fswatch and move the marker file into the trashcan
while [[ $(shopt -s nullglob; set -- "${my_temporary_files}"/xml-mirrorring-started-on-*-is-completed.txt; echo $#) -eq 0 ]]; do
    sleep 1 && temp_continue_check="running $(date)"
    echo "Stop condition not met yet (${temp_continue_check})."
done && kill -15 "${temporary_pid_of_fswatch}" && mv -v "${my_temporary_files}"/xml-mirrorring-started-on-*-is-completed.txt "$my_trashcan"
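
The stop-condition test in the loop above is dense, so here is a minimal sketch of the same idea pulled out into a function. The name mirror_is_complete, the temporary demo directory, and the date in the demo file name are assumptions made for illustration; only the marker file pattern comes from the script. The function body runs in a subshell so that enabling nullglob does not leak into the rest of the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if directory $1 contains at least one
# completion marker file. The ( ... ) body is a subshell, so the
# nullglob setting stays local to this check.
mirror_is_complete () (
    shopt -s nullglob
    markers=("$1"/xml-mirrorring-started-on-*-is-completed.txt)
    (( ${#markers[@]} > 0 ))
)

# Demo in a throwaway directory (the date in the file name is made up).
demo_dir=$(mktemp -d)
if mirror_is_complete "$demo_dir"; then echo "complete"; else echo "still running"; fi
touch "$demo_dir/xml-mirrorring-started-on-2019-07-19-is-completed.txt"
if mirror_is_complete "$demo_dir"; then echo "complete"; else echo "still running"; fi
rm -r "$demo_dir"
```

With such a helper, the polling loop could read `until mirror_is_complete "${my_temporary_files}"; do sleep 1; done`.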

EDIT: Following a comment from @snorp, I added sync to the script, and with it the files are updated in 'real time'. Without it, the files are somewhere in the air. If the script is running in the background and I type sync myself, I get a new process that seems to 'freeze': based on top output I can see it's doing something, but I don't see the processed files written into the folder like they (eventually) should be. Is there any way to force OSX to actually write these files to disk without including sync in the script?
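
One possible explanation, which nothing in this post confirms, is that the delay comes from the pipeline and not from the disk. fswatch -0 separates paths with NUL bytes, while grep without a NUL-aware option works on newline-terminated lines, so it may hold its output until fswatch exits. The sketch below replaces the grep stage with a bash read loop that handles each NUL-terminated path as soon as it arrives. The names filter_xml_paths and print_path and the sample paths are assumptions for illustration; the demo feeds simulated input through printf, so it runs without fswatch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read NUL-delimited paths from stdin and call the
# handler named in $1 for every path ending in "xml" (case-insensitive),
# one path at a time, without waiting for end of input.
filter_xml_paths () {
    local handler=$1 path lowered
    while IFS= read -r -d '' path; do
        # tr keeps this compatible with older bash versions such as 3.2
        lowered=$(printf '%s' "$path" | tr '[:upper:]' '[:lower:]')
        case $lowered in
            *xml) "$handler" "$path" ;;
        esac
    done
}

# Stand-in handler for the demo; the real script would pass
# my_convert_xml_to_json_function here.
print_path () { printf 'got: %s\n' "$1"; }

# Simulated "fswatch -0" output: three NUL-terminated paths.
printf '%s\0' '/in/a.xml' '/in/notes.txt' '/in/B.XML' | filter_xml_paths print_path
```

In the script above, the watcher line would then become `fswatch -0 --event Updated --event Created "${path_to_xml_files}" | filter_xml_paths my_convert_xml_to_json_function &`. Because the loop runs in the same shell, the exported function and xargs are no longer needed in this variant.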