Question:

I'm looking at using PipelineDB for analytics. For data warehousing I want to append all new data to a file, and tail -F it into psql like the examples on the website.

I have multiple data sources, so to get deterministic results, I'd like to append them all to the same input file, where they'll stay in the same order.

Is there a simple, idiomatic way of avoiding race conditions? Something like a single-file server I can pipe data to?

Edit:

Actually, a race condition is exactly what I want. But each line must be atomic, so no single line is ever corrupted. Lines may be interleaved, though.

Answer 1:

You could wrap each write in a mutex using sem, which ships with GNU Parallel:

sem --id atomicwrite echo hi >> file

To test it, run this loop in several separate terminals at the same time:

for i in {0..999}; do sem --id atomicwrite echo hi >> file ; done
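
Once the loops have finished, it helps to check that no line was damaged. Here is a minimal sketch of such a check; count_bad_lines is a hypothetical helper name, not part of GNU Parallel, and it assumes every writer appended exactly the same text:

```shell
# Hypothetical helper: count lines in a file that are not exactly the expected text.
# A result of 0 means every line arrived intact.
count_bad_lines() {
    # $1 = file to inspect, $2 = the exact line every writer appended
    # -v inverts the match, -x requires a whole-line match, -F treats the
    # pattern as a fixed string, -c prints the count instead of the lines.
    # grep exits non-zero when the count is 0, so "|| true" keeps the status clean.
    grep -vxFc -- "$2" "$1" || true
}
```

For the test above you would call it as count_bad_lines file hi, and compare wc -l < file against the number of writes you expect.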

Answer 2:

You can simulate a mutex with mkdir, which is an atomic check-and-create operation (the kernel guarantees this):

# locking example -- CORRECT
# Bourne
lockdir=/tmp/myscript.lock
if mkdir "$lockdir"
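then    # directory did not exist, but was created successfully
    echo >&2 "successfully acquired lock: $lockdir"
    trap 'rmdir "$lockdir"' 0    # release the lock when the script exits
    # continue script
else
    echo >&2 "cannot acquire lock, giving up on $lockdir"
    exit 1    # non-zero status so callers can tell the lock was not acquired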
fi
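
The example above gives up when the lock is taken. For the case in the question, where several writers append to one file, each writer would rather wait its turn. A rough sketch of that, assuming a lock directory named after the target file; append_line and the ".lock" suffix are made-up names for illustration, not something from the FAQ:

```shell
#!/bin/sh
# Hypothetical helper: append one line to a shared file under a mkdir-based lock.
append_line() {
    # $1 = target file, $2 = line to append
    lock="$1.lock"                     # assumed naming scheme for the lock directory
    # Retry until mkdir succeeds; only one process can create the directory.
    until mkdir "$lock" 2>/dev/null; do
        sleep 0.01                     # fractional sleep needs GNU or BSD sleep
    done
    printf '%s\n' "$2" >> "$1"         # the write happens while the lock is held
    rmdir "$lock"                      # release the lock
}

# Demo: four background writers append 25 lines each to a temporary file.
out=$(mktemp)
for w in 1 2 3 4; do
    (
        i=1
        while [ "$i" -le 25 ]; do
            append_line "$out" "writer $w line $i"
            i=$((i + 1))
        done
    ) &
done
wait                                   # let all writers finish
wc -l < "$out"                         # print how many lines were written
```

Lines from different writers may interleave, but each one is written whole. Note that a writer killed between mkdir and rmdir leaves a stale lock directory behind, which has to be removed by hand.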

For more information (and other solutions), take a look at the FAQ:

http://mywiki.wooledge.org/BashFAQ/045