I'm looking at using PipelineDB for analytics. For data warehousing I want to append all new data to a file and `tail -F` it into psql, like the examples on the website.
I have multiple data sources, so to get deterministic results, I'd like to append them all to the same input file, where they'll stay in the same order.
Is there a simple, idiomatic way of avoiding race conditions? Something like a single-file server I can pipe data to?
Edit:
Actually, a race condition is exactly what I want. But each line must be atomic, so no single line is ever corrupted. Lines may be interleaved, though.
You could prepend/wrap all your writes with a mutex by using GNU Parallel like this:
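A minimal sketch of that wrapping, assuming GNU Parallel is installed; the lock name `applog` and the path `/tmp/input.log` are placeholders, not anything mandated:

```shell
# Serialize one append through a named mutex using GNU Parallel's `sem`.
# `--id applog` names the lock (any name agreed on by all writers works);
# `--fg` runs the command in the foreground so the shell waits for it.
command -v sem >/dev/null 2>&1 || { echo "GNU Parallel (sem) not installed"; exit 0; }
sem --id applog --fg "echo 'one whole line' >> /tmp/input.log"
```

Every writer that wraps its appends in `sem --id applog` takes turns on the same lock, so whole lines never interleave mid-line.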
So, to test it, run each of these in separate terminals:
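Something along these lines, with two background writers standing in for the separate terminals (again, `applog` and `/tmp/input.log` are assumed names):

```shell
#!/bin/sh
# Two concurrent writers appending to the same file, each append
# serialized by the `sem` mutex.
command -v sem >/dev/null 2>&1 || { echo "GNU Parallel (sem) not installed"; exit 0; }
: > /tmp/input.log            # start with an empty file

writer() {
  for i in $(seq 1 25); do
    sem --id applog --fg "echo '$1 line $i' >> /tmp/input.log"
  done
}

writer source-A &             # terminal 1
writer source-B &             # terminal 2
wait
wc -l < /tmp/input.log        # 50 appends total: interleaved, but no line is ever torn
```

The order of lines from the two sources is nondeterministic, but every individual line lands intact, which matches the requirement in the edit above.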
You can simulate a mutex by using mkdir, which is an atomic create-and-check operation (this is ensured at the kernel level):
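A sketch of that pattern; the lock directory `/tmp/input.log.lock` and target file `/tmp/input.log` are illustrative paths:

```shell
#!/bin/sh
# Acquire the lock: mkdir atomically creates the directory and fails if it
# already exists, so only one process can hold the lock at a time.
LOCKDIR=/tmp/input.log.lock
while ! mkdir "$LOCKDIR" 2>/dev/null; do
  sleep 0.1   # another writer holds the lock; retry (fractional sleep needs GNU coreutils)
done
echo "one whole line" >> /tmp/input.log   # critical section: a single append
rmdir "$LOCKDIR"                          # release the lock
```

One caveat with this approach: if a writer dies between mkdir and rmdir, the lock is left held and other writers spin forever, so the FAQ below also discusses stale-lock handling.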
For more information (and other solutions), take a look at the FAQ:
http://mywiki.wooledge.org/BashFAQ/045