Find (bash command) doesn't work with subprocess

Posted 2019-07-17 20:32

Question:

I have renamed a CSS class name in a number of (python-django) templates. The CSS files, however, are spread across multiple directories. I have a Python snippet that starts at the root dir and recursively renames the class in all the CSS files.

from os import walk, curdir
import subprocess

COMMAND = "find %s -iname *.css | xargs sed -i s/[Ff][Oo][Oo]/bar/g"
test_command = 'echo "This is just a test. DIR: %s"'

def renamer(command):
  print command  # Please ignore the print commands.
  proccess = subprocess.Popen(command.split(), stdout = subprocess.PIPE)
  op = proccess.communicate()[0]
  print op

for root, dirs, files in walk(curdir):
  if root:
    command = COMMAND % root
    renamer(command)

It doesn't work; it gives:

find ./cms/djangoapps/contentstore/management/commands/tests -iname *.css | xargs sed -i s/[Ee][Dd][Xx]/gurukul/g
find: paths must precede expression: |
Usage: find [-H] [-L] [-P] [-Olevel] [-D help|tree|search|stat|rates|opt|exec] [path...] [expression]

find ./cms/djangoapps/contentstore/views -iname *.css | xargs sed -i s/[Ee][Dd][Xx]/gurukul/g
find: paths must precede expression: |
Usage: find [-H] [-L] [-P] [-Olevel] [-D help|tree|search|stat|rates|opt|exec] [path...] [expression]

When I copy and run the same command (printed above), find doesn't error out and sed either gets no input files or it works.

What is wrong with the python snippet?

Answer 1:

You're not trying to run a single command, but a shell pipeline of multiple commands, and you're trying to do it without invoking the shell. That can't possibly work. The way you're doing this, | is just one of the arguments to find, which is why find is telling you that it doesn't understand that argument with that "paths must precede expression: |" error.
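To see this concretely, here is a small sketch of what command.split() actually produces for the command in the question (I'm assuming "." as the directory and using Python 3 print syntax):

```python
COMMAND = "find %s -iname *.css | xargs sed -i s/[Ff][Oo][Oo]/bar/g"

# str.split() only splits on whitespace; it knows nothing about shell syntax.
argv = (COMMAND % ".").split()
print(argv)

# argv[0] is the program to run; everything after it, including "|",
# "xargs" and "sed", is handed to find as a plain argument.
program, arguments = argv[0], argv[1:]
```

Without a shell in between, nothing interprets the "|", so find receives it as just another word on its command line.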

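You can fix that by adding shell=True to your Popen and passing the command as a single string instead of command.split(). With shell=True and a list, only the first item is run as the command on POSIX; the rest become arguments to the shell itself.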
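A minimal sketch of the shell=True route, assuming Python 3 (shlex.quote lives in pipes.quote on Python 2). The helper names build_command and run_in_shell are mine, not from the question, and I've added quotes around the glob and the sed expression so the shell doesn't expand them:

```python
import shlex
import subprocess

def build_command(root):
    # shlex.quote protects directory names containing spaces or quotes.
    # The single quotes keep the shell from expanding *.css or the brackets.
    return ("find %s -iname '*.css' | xargs sed -i 's/[Ff][Oo][Oo]/bar/g'"
            % shlex.quote(root))

def run_in_shell(command):
    # With shell=True the command is passed as ONE string, not a split list.
    process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)
    return process.communicate()[0]
```

You would then call run_in_shell(build_command(root)). Note that file names containing whitespace would still trip up xargs here, which is one more reason to prefer the approaches below.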

But a better solution is to do the pipeline in Python and keep the shell out of it. See Replacing Older Functions with the subprocess Module in the docs for an explanation, but I'll show an example.

Meanwhile, you should never use split to split a command line. The best solution is to write the list of separate arguments instead of joining them up into a string just to split them out again. If you must split a string, use the shlex module; that's what it's for. But in your case, even that won't help you, because you're inserting arbitrary strings verbatim, which could easily have spaces or quotes in them. Once they've been pasted into the command unquoted, nothing (shlex or otherwise) can reconstruct the original data.
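A short sketch of the difference, using a made-up directory name "my dir" and a hypothetical helper find_args:

```python
import shlex

command = 'find "my dir" -iname "*.css"'

# str.split() breaks the quoted directory in two and keeps the quote characters.
naive = command.split()

# shlex.split() follows shell quoting rules, so "my dir" stays one argument.
parsed = shlex.split(command)

def find_args(root):
    # Building the list directly avoids the quoting problem entirely:
    # root is passed through untouched, whatever characters it contains.
    return ['find', root, '-iname', '*.css']
```

Here shlex.split only works because the string was quoted properly to begin with; building the list yourself, as in find_args, never needs the quoting at all.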

So:

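from subprocess import Popen, PIPE

# root is the directory to search, as in the loop in the question.
pfind = Popen(['find', root, '-iname', '*.css'], stdout=PIPE)
pxargs = Popen(['xargs', 'sed', '-i', 's/[Ff][Oo][Oo]/bar/g'],
               stdin=pfind.stdout, stdout=PIPE)
pfind.stdout.close()  # lets find receive SIGPIPE if xargs exits first
output = pxargs.communicate()[0]  # stdout of the xargs/sed stage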
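The snippet above ignores exit statuses. If you want to know whether either stage failed, one possible way to wrap it is sketched below; run_pipeline is a hypothetical helper name, and the code assumes Python 3:

```python
from subprocess import Popen, PIPE

def run_pipeline(first_argv, second_argv):
    """Run first_argv | second_argv without a shell.

    Returns (output of the second command, exit code of the first,
    exit code of the second).
    """
    first = Popen(first_argv, stdout=PIPE)
    second = Popen(second_argv, stdin=first.stdout, stdout=PIPE)
    first.stdout.close()  # the second process now holds the only read end
    output = second.communicate()[0]
    first.wait()  # reap the first process so its returncode is set
    return output, first.returncode, second.returncode
```

For the case in the question this would be called as run_pipeline(['find', root, '-iname', '*.css'], ['xargs', 'sed', '-i', 's/[Ff][Oo][Oo]/bar/g']), and a nonzero code from either stage tells you something went wrong.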

But there's an even better solution here.

Python has os.walk to do the same job as find, and its own re module to use in place of sed. You could simulate xargs easily too, but there's really no need. So, why not use them?
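Here is a rough sketch of that pure-Python approach, not a drop-in replacement. It assumes Python 3 and UTF-8 encoded CSS files, and rename_in_css is a name I made up:

```python
import os
import re

# Same match as sed's [Ff][Oo][Oo]: "foo" in any mix of upper and lower case.
PATTERN = re.compile(r"foo", re.IGNORECASE)

def rename_in_css(root, replacement="bar"):
    """Rewrite every *.css file under root in place; return the changed paths."""
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # Case-insensitive suffix test, like find's -iname '*.css'.
            if not name.lower().endswith(".css"):
                continue
            path = os.path.join(dirpath, name)
            # newline="" keeps the file's original line endings intact.
            with open(path, encoding="utf-8", newline="") as f:
                text = f.read()
            new_text = PATTERN.sub(replacement, text)
            if new_text != text:
                with open(path, "w", encoding="utf-8", newline="") as f:
                    f.write(new_text)
                changed.append(path)
    return changed
```

Because os.walk already recurses, a single call such as rename_in_css(os.curdir) covers the whole tree; there is no need for an outer loop over every subdirectory.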

Or, conversely, bash is much better at driving and connecting up simple commands than Python, so if you'd rather use find and sed instead of os.walk and re.sub, why write the driving script in Python in the first place?



Answer 2:

The problem is the pipe. To use a shell pipe (|) inside the command string with the subprocess module, you have to pass shell=True.