subprocess.Popen taking too long on WSL Linux

Posted 2020-02-06 16:54

I have this subprocess.Popen() context manager:

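from subprocess import PIPE, CalledProcessError, Popen
from time import sleep
from timeit import default_timer as timer

from tqdm import tqdm

# `command` is the argument list for the child process (not shown here).
with Popen(
    args=command, shell=False, stdout=PIPE, bufsize=1, universal_newlines=True
) as process: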

    # TIMING
    start = timer()
    lines = list(process.stdout)
    end = timer()
    print('Time taken:', end - start) # 53.662078000000065 seconds -> Linux

    for _ in tqdm(iterable=lines, total=len(lines)):
        sleep(0.1)

if process.returncode != 0:
    raise CalledProcessError(returncode=process.returncode, cmd=process.args)

It takes about 53 seconds to evaluate list(process.stdout) when running in a WSL Linux environment. When I run the same code in a Windows environment, it takes only 0.6 seconds. I don't understand why the timings are so different.

I've tried using subprocess.run() and subprocess.check_output() instead, but both show the same long delay before the tqdm() loop starts.

Am I missing something here? I've looked through the docs for differences between subprocess.Popen() on Windows and on WSL Linux, but I'm still unsure what the issue is. Perhaps list(process.stdout) is unnecessary here, and there is a better way to store the lines from stdout.

Any sort of guidance would be very helpful here.

2 answers
放我归山
Reply #2 · 2020-02-06 17:38

You will need to reevaluate that performance issue in Q3 2019, with WSL2.

See "Announcing WSL 2" from Craig Loewen

Changes in this new architecture will allow for: dramatic file system performance increases, and full system call compatibility, meaning you can run more Linux apps in WSL 2 such as Docker.

File intensive operations like git clone, npm install, apt update, apt upgrade, and more will all be noticeably faster.
The actual speed increase will depend on which app you’re running and how it is interacting with the file system.
Initial tests that we’ve run have WSL 2 running up to 20x faster compared to WSL 1 when unpacking a zipped tarball, and around 2-5x faster when using git clone, npm install and cmake on various projects.

Linux binaries use system calls to perform many functions such as accessing files, requesting memory, creating processes, and more.
In WSL 1 we created a translation layer that interprets many of these system calls and allows them to work on the Windows NT kernel. However, it’s challenging to implement all of these system calls, resulting in some apps being unable to run in WSL 1.
Now that WSL 2 includes its own Linux kernel it has full system call compatibility.
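Before re-running the timing, it can help to confirm which WSL generation the Python process is running under. The sketch below is a heuristic, not an official API: it assumes that WSL 1 kernel release strings typically look like "4.4.0-19041-Microsoft" and WSL 2 ones like "5.10.16.3-microsoft-standard-WSL2". The function name guess_wsl_version is hypothetical, and a custom kernel may not follow this naming.

```python
import platform


def guess_wsl_version(release):
    """Guess the WSL generation from a kernel release string.

    Returns 2 for a likely WSL 2 kernel, 1 for a likely WSL 1 kernel,
    and None when the string does not look like WSL at all.
    This is a naming heuristic (an assumption), not a guaranteed check.
    """
    lowered = release.lower()
    if "microsoft" not in lowered:
        return None  # probably native Linux, macOS, or Windows
    if "wsl2" in lowered:
        return 2
    return 1


if __name__ == "__main__":
    # platform.uname().release is the kernel release of the current system.
    release = platform.uname().release
    print("Kernel release:", release)
    print("Guessed WSL version:", guess_wsl_version(release))
```

If this reports WSL 1, the slow timing in the question is worth measuring again after moving the distribution to WSL 2.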

Anthone
Reply #3 · 2020-02-06 17:39

Windows Subsystem for Linux is a bit rubbish. It's got many, many bugs and it's significantly slower than it needs to be. This is just another bug manifesting itself. Here are some possible bottlenecks:

  • Slow context switching in WSL.
  • WSL not noticing that an entire process waiting for a pipe means that the other end of the pipe should be run now.
  • The child process being executed lazily.
  • Windows taking a while to figure out that it needs to use wsl.exe to launch the program (thanks RoadRunner!)
  • The usual overhead of Windows, plus the usual (comparatively small) overhead of Linux.
  • A poor choice of Ubuntu distro causing many unnecessary services to be running in systemd(?)
  • Windows deciding to run other stuff before the child process for some unknown reason.
  • Deliberate malice on the part of the Windows Subsystem for Linux developers, conspiring to "prove" that Windows is the superior operating system by setting up a strawman. Too silly.

There's nothing wrong with your Python code that would make this slow.
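To narrow down which of the bottlenecks above applies, one option is to record when the first line of output arrives separately from the total run time. This is only a minimal sketch: time_child_output is a hypothetical helper, and the demo command is a placeholder standing in for the real `command` from the question. A long delay before the first line points at process startup or scheduling, while a long gap between first line and total points at the child itself or the pipe.

```python
import subprocess
import sys
import time


def time_child_output(command):
    """Run command, returning (seconds_to_first_line, seconds_total, lines)."""
    start = time.perf_counter()
    first_line_at = None
    lines = []
    with subprocess.Popen(
        command, stdout=subprocess.PIPE, bufsize=1, universal_newlines=True
    ) as process:
        # Iterating over stdout yields lines as the child produces them.
        for line in process.stdout:
            if first_line_at is None:
                first_line_at = time.perf_counter() - start
            lines.append(line)
    # Leaving the with block waits for the child to exit.
    total = time.perf_counter() - start
    return first_line_at, total, lines


if __name__ == "__main__":
    # Placeholder child process; substitute the real command here.
    demo = [sys.executable, "-c", "print('a'); print('b'); print('c')"]
    first, total, lines = time_child_output(demo)
    print("Time to first line:", first)
    print("Total time:", total)
    print("Lines read:", len(lines))
```

Running the same script on Windows and inside WSL gives two pairs of numbers that can be compared directly.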
