Ensuring that my program is not doing concurrent writes

Published 2019-01-20 02:33

I am writing a script that must perform safe writes to a given file, i.e. append to the file only if no other process is known to be writing into it. My understanding was that the file system prevents concurrent writes by means of write locks, but in practice that seems not to be the case.

Here's how I set up my test case. I redirect the output of a ping command into a file:

ping 127.0.0.1 > fileForSafeWrites.txt

On the other end, I have the following Python code attempting to write to the same file:

handle = open('fileForSafeWrites.txt', 'w')
handle.write("Probing for opportunity to write")
handle.close()

Run concurrently, both processes complete without error, but fileForSafeWrites.txt ends up with garbled, partly binary content. There is no write lock held by the first process that protects the file from being written into by the Python code.

How do I force either or both of my concurrent processes not to interfere with each other? I have read advice that successfully obtaining a writable file handle is evidence that the file is safe to write to, such as in https://stackoverflow.com/a/3070749/1309045

Is this behavior specific to my operating system and Python version? I use Python 2.7 in an Ubuntu 12.04 environment.

2 Answers
Answer 1 · 2019-01-20 03:14

Inspired by a solution described for concurrency checks, I came up with the following snippet. It works if one can predict, reasonably accurately, the frequency at which the file in question is written. The approach relies on file-modification times.

import os
import time

def isFileBeingWrittenInto(filename,
                           writeFrequency=180, overheadTimePercentage=20):
    '''Find whether the file was modified during the last writeFrequency seconds.'''
    overhead = 1 + float(overheadTimePercentage) / 100  # add some buffer time
    maxWriteFrequency = writeFrequency * overhead
    modifiedTimeStart = os.stat(filename).st_mtime  # time file was last modified
    time.sleep(writeFrequency)                      # wait writeFrequency seconds
    modifiedTimeEnd = os.stat(filename).st_mtime    # modification time again
    return 0 < (modifiedTimeEnd - modifiedTimeStart) <= maxWriteFrequency

if not isFileBeingWrittenInto('fileForSafeWrites.txt'):
    with open('fileForSafeWrites.txt', 'a') as handle:
        handle.write("Text written safely when no one else is writing to the file")

This does not do true concurrency checking, but for practical purposes it can be combined with a variety of other methods to write safely into a file without having to worry about garbled text. Hope it helps the next person searching for a way to do this.

EDIT UPDATE:

Upon further testing, I encountered a high-frequency write process that required the conditional logic to be modified from

if 0 < (modifiedTimeEnd - modifiedTimeStart) < maxWriteFrequency 

to

if 0 < (modifiedTimeEnd - modifiedTimeStart) <= maxWriteFrequency 

That makes for a better answer, in theory and in practice.

Answer 2 · 2019-01-20 03:16

Use the lockfile module, as shown in Locking a file in Python.
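lockfile is a third-party package; a minimal sketch of the same idea using only the standard library would be advisory locking via fcntl.flock (POSIX only; the filename is taken from the question, and on Python 2.7 the failed non-blocking lock raises IOError):

```python
import fcntl

# Sketch: advisory locking with the standard-library fcntl module (POSIX only).
# Locks are cooperative -- every writer must take the lock for this to help.
with open('fileForSafeWrites.txt', 'a') as handle:
    try:
        # LOCK_NB makes the call fail immediately instead of blocking
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        handle.write("Text written while holding an exclusive lock\n")
        fcntl.flock(handle, fcntl.LOCK_UN)
    except IOError:
        pass  # another cooperating process holds the lock
```

Note that a shell redirection like `ping 127.0.0.1 > file` takes no such lock, so this only protects against other processes that also call flock on the same file.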
