How to implement common bash idioms in Python? [cl

2019-01-04 15:15发布

I currently do my textfile manipulation through a bunch of badly remembered AWK, sed, Bash and a tiny bit of Perl.

I've seen mentioned a few places that python is good for this kind of thing. How can I use Python to replace shell scripting, AWK, sed and friends?

17条回答
Juvenile、少年°
2楼-- · 2019-01-04 15:40

pythonpy is a tool that provides easy access to many of the features from awk and sed, but using python syntax:

$ echo me2 | py -x 're.sub("me", "you", x)'
you2
查看更多
我只想做你的唯一
3楼-- · 2019-01-04 15:40

While researching this topic, I found this proof-of-concept code (via a comment at http://jlebar.com/2010/2/1/Replacing_Bash.html) that lets you "write shell-like pipelines in Python using a terse syntax, and leveraging existing system tools where they make sense":

for line in sh("cat /tmp/junk2") | cut(d=',',f=1) | 'sort' | uniq:
    sys.stdout.write(line)
查看更多
疯言疯语
4楼-- · 2019-01-04 15:41

I just discovered how to combine the best parts of bash and ipython. Up to now this seems more comfortable to me than using subprocess and so on. You can easily copy big parts of existing bash scripts and e.g. add error handling in the python way :) And here is my result:

#!/usr/bin/env ipython3

# *** How to have the most comfort scripting experience of your life ***
# ######################################################################
#
# … by using ipython for scripting combined with subcommands from bash!
#
# 1. echo "#!/usr/bin/env ipython3" > scriptname.ipy    # creates new ipy-file
#
# 2. chmod +x scriptname.ipy                            # make in executable
#
# 3. starting with line 2, write normal python or do some of
#    the ! magic of ipython, so that you can use unix commands
#    within python and even assign their output to a variable via
#    var = !cmd1 | cmd2 | cmd3                          # enjoy ;)
#
# 4. run via ./scriptname.ipy - if it fails with recognizing % and !
#    but parses raw python fine, please check again for the .ipy suffix

# ugly example, please go and find more in the wild
files = !ls *.* | grep "y"
for file in files:
  !echo $file | grep "p"
# sorry for this nonsense example ;)

See IPython docs on system shell commands and using it as a system shell.

查看更多
家丑人穷心不美
5楼-- · 2019-01-04 15:41

Adding to previous answers: check the pexpect module for dealing with interactive commands (adduser, passwd etc.)

查看更多
欢心
6楼-- · 2019-01-04 15:46

You can use python instead of bash with the ShellPy library.

Here is an example that downloads avatar of Python user from Github:

import json
import os
import tempfile

# get the api answer with curl
answer = `curl https://api.github.com/users/python
# syntactic sugar for checking returncode of executed process for zero
if answer:
    answer_json = json.loads(answer.stdout)
    avatar_url = answer_json['avatar_url']

    destination = os.path.join(tempfile.gettempdir(), 'python.png')

    # execute curl once again, this time to get the image
    result = `curl {avatar_url} > {destination}
    if result:
        # if there were no problems show the file
        p`ls -l {destination}
    else:
        print('Failed to download avatar')

    print('Avatar downloaded')
else:
    print('Failed to access github api')

As you can see, all expressions inside of grave accent ( ` ) symbol are executed in shell. And in Python code, you can capture results of this execution and perform actions on it. For example:

log = `git log --pretty=oneline --grep='Create'

This line will first execute git log --pretty=oneline --grep='Create' in shell and then assign the result to the log variable. The result has the following properties:

stdout the whole text from stdout of the executed process

stderr the whole text from stderr of the executed process

returncode returncode of the execution

This is general overview of the library, more detailed description with examples can be found here.

查看更多
Deceive 欺骗
7楼-- · 2019-01-04 15:47

Any shell has several sets of features.

  • The Essential Linux/Unix commands. All of these are available through the subprocess library. This isn't always the best first choice for doing all external commands. Look also at shutil for some commands that are separate Linux commands, but you could probably implement directly in your Python scripts. Another huge batch of Linux commands are in the os library; you can do these more simply in Python.

    And -- bonus! -- more quickly. Each separate Linux command in the shell (with a few exceptions) forks a subprocess. By using Python shutil and os modules, you don't fork a subprocess.

  • The shell environment features. This includes stuff that sets a command's environment (current directory and environment variables and what-not). You can easily manage this from Python directly.

  • The shell programming features. This is all the process status code checking, the various logic commands (if, while, for, etc.) the test command and all of it's relatives. The function definition stuff. This is all much, much easier in Python. This is one of the huge victories in getting rid of bash and doing it in Python.

  • Interaction features. This includes command history and what-not. You don't need this for writing shell scripts. This is only for human interaction, and not for script-writing.

  • The shell file management features. This includes redirection and pipelines. This is trickier. Much of this can be done with subprocess. But some things that are easy in the shell are unpleasant in Python. Specifically stuff like (a | b; c ) | something >result. This runs two processes in parallel (with output of a as input to b), followed by a third process. The output from that sequence is run in parallel with something and the output is collected into a file named result. That's just complex to express in any other language.

Specific programs (awk, sed, grep, etc.) can often be rewritten as Python modules. Don't go overboard. Replace what you need and evolve your "grep" module. Don't start out writing a Python module that replaces "grep".

The best thing is that you can do this in steps.

  1. Replace AWK and PERL with Python. Leave everything else alone.
  2. Look at replacing GREP with Python. This can be a bit more complex, but your version of GREP can be tailored to your processing needs.
  3. Look at replacing FIND with Python loops that use os.walk. This is a big win because you don't spawn as many processes.
  4. Look at replacing common shell logic (loops, decisions, etc.) with Python scripts.
查看更多
登录 后发表回答