One of my applications runs about 100 workers. It started out as a threading
application, but I ran into performance (latency) problems, so I converted the workers to
multiprocessing.Process instances. The benchmark below shows that the load went down, but at the cost of roughly six times the memory usage.
So where exactly does the extra memory usage come from, given that Linux uses copy-on-write (COW) for forked processes and the workers do not share any data?
How can I reduce the memory footprint? (Alternative question: how can I reduce the load with threading?)
Benchmarks on Linux 2.6.26, 4 CPUs, 2 GB RAM. (Note that CPU usage is given in % of one CPU, so full load is 400%. The numbers are derived from looking at Munin graphs.)
                  | threading | multiprocessing
------------------+-----------+----------------
memory usage      | ~0.25 GB  | ~1.5 GB
context switches  | ~1.5e4/s  | ~5e2/s
system CPU usage  | ~30%      | ~3%
total CPU usage   | ~100%     | ~50%
load avg          | ~1.5      | ~0.7
Background: The application processes events from the network and stores some of them in a MySQL database.
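The memory figure above is an aggregate from Munin, so it does not say how much of each worker's memory is really private and how much is still shared with the parent through copy-on-write. A minimal sketch of how that split could be measured per process, assuming /proc/<pid>/smaps is available (Linux only). The helper names summarize_smaps and process_memory are made up for illustration and are not part of the application.

```python
from collections import defaultdict

# Per-mapping counters reported by /proc/<pid>/smaps, all in kB.
FIELDS = ("Rss", "Shared_Clean", "Shared_Dirty", "Private_Clean", "Private_Dirty")


def summarize_smaps(text):
    """Sum the selected counters (in kB) over all mappings in smaps-formatted text."""
    totals = defaultdict(int)
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        # Counter lines look like "Private_Dirty:        12 kB";
        # mapping header lines never match a name in FIELDS.
        if sep and key in FIELDS:
            totals[key] += int(rest.split()[0])
    return dict(totals)


def process_memory(pid="self"):
    """Read and summarize /proc/<pid>/smaps (assumption: Linux with smaps enabled)."""
    with open("/proc/%s/smaps" % pid) as f:
        return summarize_smaps(f.read())
```

Calling process_memory(pid) for each worker PID and comparing the Private_* totals with the Shared_* totals would show whether the workers really hold six times more private data or whether shared pages are simply being counted once per process.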