Not sure this is the best title for this question but here goes.
Using Python/Qt, I start multiple processes of an executable. Each process writes a large file (~20 GB) to disk in chunks. I am finding that the first process to start is always the last to finish, and it runs much, much longer than the other processes (despite having the same amount of data to write).
Performance monitors show that the process is still using the expected amount of RAM (~1GB), but the disk activity from the process has slowed to a trickle.
Why would this happen? It is as though the first process somehow gets its disk access 'blocked' by the other processes and then doesn't recover after the other processes have finished...
Would the OS (Windows) be causing this? What can I do to alleviate this?
Parallelism (of any kind) only results in a speedup if you actually have the resources to solve the problem faster.
Before thinking of optimizing your program, you should carefully analyze what's causing it to run (subjectively) slow - the bottleneck.
While I know nothing about what sort of bottleneck your program has, the fact that it writes a large quantity of data to disk is a good hint that it may be I/O bound.
When a program is I/O bound, the conventional single-machine parallelization techniques (threading, multiple processes) are worse than useless - they actually hurt performance, especially if you're dealing with a spinning disk. This happens because once you have more than one process accessing the disk at different places, the hard drive head has to seek between those places, and time spent seeking is time not spent writing.
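To check whether this applies to your disk, you can time the two access patterns yourself. The following is only a rough sketch, not a rigorous benchmark: it writes the same amount of data either one file at a time or round-robin across several files, which loosely imitates several writers competing for the disk. The function names, the 1 MiB chunk size and the file counts are all made up for illustration, and the numbers will vary a lot with the drive type, the file system and OS caching.

```python
import contextlib
import os
import tempfile
import time

CHUNK = b"\0" * (1 << 20)  # 1 MiB per write; an arbitrary size chosen for illustration


def write_sequential(paths, chunks_per_file):
    """Write each file from start to finish before touching the next one."""
    start = time.perf_counter()
    for path in paths:
        with open(path, "wb") as f:
            for _ in range(chunks_per_file):
                f.write(CHUNK)
                f.flush()
                os.fsync(f.fileno())  # push the chunk to the device, not just the cache
    return time.perf_counter() - start


def write_interleaved(paths, chunks_per_file):
    """Write one chunk to each file in turn, imitating competing writers."""
    start = time.perf_counter()
    with contextlib.ExitStack() as stack:
        files = [stack.enter_context(open(p, "wb")) for p in paths]
        for _ in range(chunks_per_file):
            for f in files:
                f.write(CHUNK)
                f.flush()
                os.fsync(f.fileno())
    return time.perf_counter() - start


if __name__ == "__main__":
    # Small sizes so the demo finishes quickly; scale up for a meaningful comparison.
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"out{i}.bin") for i in range(3)]
        print("sequential :", write_sequential(paths, 4), "s")
        print("interleaved:", write_interleaved(paths, 4), "s")
```

On an SSD the two timings may be close. On a spinning disk with large files, the interleaved pattern is where you would expect the seek cost to show up.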
The I/O scheduler of your OS can have a great impact on how much performance degrades once multiple processes are doing I/O, and on how disk accesses are allotted among those processes. You may consider switching your OS, but only if those multiple processes are needed in the first place.
With that being said, what can you do to get better (I/O) performance?
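One option that follows from the seek argument is to stop the writers from competing at all: run the executables one after another, or at most a couple at a time. Below is a minimal sketch of that idea. It uses Python's standard `subprocess` and `concurrent.futures` modules rather than Qt's process classes, `run_limited` is a name invented here, and the commands in the usage part are placeholders standing in for your real executable.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_limited(commands, max_parallel=1):
    """Run every command to completion with at most max_parallel running at once.

    Returns the exit codes in the same order as the commands.
    """
    def run(cmd):
        # subprocess.run blocks until the child exits, so each worker thread
        # handles exactly one child process at a time.
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run, commands))


if __name__ == "__main__":
    # Placeholder commands; substitute the real executable and its arguments.
    jobs = [[sys.executable, "-c", f"print('job {i} done')"] for i in range(4)]
    print(run_limited(jobs, max_parallel=1))
```

With `max_parallel=1` the jobs run strictly one after another, so each one gets the disk to itself. Whether 1, 2 or more is the best setting depends on your hardware, so it is worth measuring instead of guessing.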
There are no guarantees as to fairness of I/O scheduling. What you're describing seems rather simple: the I/O scheduler, whether intentionally or not, gives a boost to new processes. Since your disk is tapped out, the order in which the processes finish is not under your control. You're most likely wasting a lot of disk bandwidth on seeks, due to parallel access from multiple processes.
TL;DR: Your expectation is unfounded. When I/O, and specifically the virtual memory system, is saturated, anything can happen. And so it does.