Understanding Python fork and memory allocation errors

Posted 2019-02-17 04:33

I have a memory-intensive Python application (from hundreds of MB to several GB).
I have a couple of VERY SMALL Linux executables that the main application needs to run, e.g.

from subprocess import Popen, PIPE

child = Popen("make html", cwd=r'../../docs', stdout=PIPE, shell=True)
child.wait()

When I run these external utilities (once, at the end of the long main process run) using subprocess.Popen I sometimes get OSError: [Errno 12] Cannot allocate memory.
I don't understand why... The requested process is tiny!
The system has enough memory for many more shells.

I'm using Linux (Ubuntu 12.10, 64-bit), so I guess subprocess calls fork().
And fork() duplicates my existing process, thus doubling the amount of memory consumed, and fails?
What happened to "copy on write"?

Can I spawn a new process without fork (or at least without copying memory - starting fresh)?

Related:

The difference between fork(), vfork(), exec() and clone()

fork () & memory allocation behavior

Python subprocess.Popen erroring with OSError: [Errno 12] Cannot allocate memory after period of time

Python memory allocation error using subprocess.Popen

1 answer
唯我独甜
#2 · 2019-02-17 05:09

It doesn't appear that a real solution will be forthcoming (i.e. an alternate implementation of subprocess that uses vfork). So how about a cute hack? At the beginning of your process, while its memory footprint is still small, spawn a helper process that hangs around, ready to spawn your subprocesses, and keep a communication channel to it open throughout the life of the main process.

Here's an example using rfoo (http://code.google.com/p/rfoo/) with a named Unix socket called rfoosocket (you could use any other connection type rfoo supports, or another RPC library):

Server:

import rfoo
import subprocess

class MyHandler(rfoo.BaseHandler):
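    def RPopen(self, cmd):
        # Run the command in this small server process and return its stdout.
        c = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
        # communicate() drains the pipe while waiting; calling wait() before
        # reading can deadlock once the child fills the pipe buffer.
        out, _ = c.communicate()
        return out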

rfoo.UnixServer(MyHandler).start('rfoosocket')

Client:

import rfoo

# Waste a bunch of memory before spawning the child. Swap out the RPC below
# for a straight popen to show it otherwise fails. Tweak to suit your
# available system memory.
mem = [x for x in range(100000000)]

c = rfoo.UnixConnection().connect('rfoosocket')

print(rfoo.Proxy(c).RPopen('ls -l'))

If you need real-time back and forth coprocess interaction with your spawned subprocesses this model probably won't work, but you might be able to hack it in. You'll presumably want to clean up the available args that can be passed to Popen based on your specific needs, but that should all be relatively straightforward.
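The same idea can be sketched with only the standard library, in case rfoo is not an option. This is a minimal sketch, not a drop-in replacement: it assumes Python 3 on a Unix system, and the names start_helper, run_via_helper and _helper_loop are made up for illustration. The helper is forked early, while the parent is still small, and later commands are sent to it over a pipe.

```python
import multiprocessing as mp
import subprocess

# Use the "fork" start method explicitly (Unix only): the helper is forked
# once, early, while the parent process is still small.
_ctx = mp.get_context("fork")


def _helper_loop(conn, parent_end):
    """Runs inside the helper: execute each command received on the pipe."""
    parent_end.close()  # the helper does not need the parent's end
    while True:
        try:
            cmd = conn.recv()
        except EOFError:  # parent went away
            break
        if cmd is None:  # sentinel: parent asked the helper to exit
            break
        proc = subprocess.run(cmd, shell=True, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT)
        conn.send((proc.returncode, proc.stdout))
    conn.close()


def start_helper():
    """Start the helper; call this before the parent allocates much memory."""
    parent_end, child_end = _ctx.Pipe()
    helper = _ctx.Process(target=_helper_loop, args=(child_end, parent_end),
                          daemon=True)
    helper.start()
    child_end.close()  # the parent keeps only its own end
    return helper, parent_end


def run_via_helper(conn, cmd):
    """Ask the helper to run cmd; returns (returncode, combined output bytes)."""
    conn.send(cmd)
    return conn.recv()


if __name__ == "__main__":
    helper, conn = start_helper()
    # ... the memory-hungry work of the main process would happen here ...
    returncode, output = run_via_helper(conn, "ls -l")
    print(returncode, output.decode())
    conn.send(None)  # tell the helper to shut down
    helper.join()
```

Like the rfoo version, this returns output only after the command finishes, so it has the same limitation for interactive use.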

You should also find it straightforward to launch the server at the start of the client, and to manage the socket file (or port) to be cleaned up on exit.
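One way that startup and cleanup could look is sketched below, assuming Python 3. The function launch_helper, its parameters, and the server.py file name in the usage comment are hypothetical; the only assumption about the server is that it creates the socket file once it is ready.

```python
import atexit
import os
import subprocess
import time


def launch_helper(server_cmd, socket_path, timeout=5.0):
    """Start the helper server early and register cleanup for interpreter exit.

    server_cmd and socket_path are placeholders for however the server is
    actually started and wherever its socket lives.
    """
    # Remove a stale socket file left behind by a previous run.
    if os.path.exists(socket_path):
        os.unlink(socket_path)

    # Forking here is cheap as long as this runs before the big allocations.
    proc = subprocess.Popen(server_cmd)

    def cleanup():
        # Stop the server if it is still running, then remove the socket file.
        if proc.poll() is None:
            proc.terminate()
            try:
                proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()
        if os.path.exists(socket_path):
            os.unlink(socket_path)

    atexit.register(cleanup)

    # Wait until the server has created its socket before returning.
    deadline = time.monotonic() + timeout
    while not os.path.exists(socket_path):
        if proc.poll() is not None:
            raise RuntimeError("helper exited before creating its socket")
        if time.monotonic() > deadline:
            cleanup()
            raise RuntimeError("helper did not create its socket in time")
        time.sleep(0.05)
    return proc, cleanup


# Hypothetical usage at the very top of the client, before any heavy work:
#     proc, cleanup = launch_helper([sys.executable, "server.py"], "rfoosocket")
```

Note that atexit handlers do not run if the process is killed by an unhandled signal, so a stale socket file can still be left behind; the unlink at startup covers that case.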
