I am trying to format the following awk command
awk -v OFS="\t" '{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}' file1.txt > file2.txt
for use with Python's subprocess.Popen. However, I am having a hard time formatting it. I have tried the solutions suggested in answers to similar questions, but none of them worked. I have also tried using raw string literals. I would also prefer not to use shell=True, as it is not recommended.
Edit according to comment: The command I tried was
awk_command = """awk -v OFS="\t" '{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}' file1.txt > file2.txt"""
command_execute = Popen(shlex.split(awk_command))
However, I get the following error upon executing this:
KeyError: 'printf "chr%s\t%s\t%s\n", $1, $2-1, $2'
Googling the error suggests this happens when a value is requested for an undefined key, but I do not understand its context here.
The simplest method, especially if you wish to keep the output redirection stuff, is to use subprocess with shell=True. Then you only need to escape Python special characters. The line, as a whole, will be interpreted by the default shell.

Alternatively, you can replace the command line with an argv-type sequence and feed that to subprocess instead. Then you need to provide the arguments as the program would see them.

Regarding the specific problems:
- \t and \n became a literal tab and newline (try to print awk_command).
- Using shlex.split is no different from shell=True, with added unreliability: it cannot guarantee that it will parse the string the same way your shell would in every case (not to mention the lack of the transformations the shell performs). Specifically, it doesn't know or care about the special meaning of the redirection part, so > and file2.txt are handed to awk as ordinary arguments.
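As a quick check of the two points above, here is a minimal sketch (Python 3; the variable names plain, raw and args are only illustrative) that shows what Python and shlex.split actually do with the command string. Nothing is executed, it only prints the pieces.

```python
import shlex

# Non-raw string: Python itself turns \t and \n into a real tab and newline.
plain = """awk -v OFS="\t" '{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}' file1.txt > file2.txt"""
# Raw string: the backslash sequences are kept for awk to interpret.
raw = r"""awk -v OFS="\t" '{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}' file1.txt > file2.txt"""

print("\t" in plain, "\t" in raw)  # True False

args = shlex.split(raw)
for arg in args:
    print(repr(arg))
# The last two items are '>' and 'file2.txt': no redirection takes place,
# so awk would receive them as ordinary arguments.
```

Even with the raw string, the result is therefore not a usable argument list as long as the redirection is part of the string.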
So, if you wish to use shell=False, do construct the argument list yourself.

The > character is the shell redirection operator. To implement it in Python, use the stdout parameter.

To avoid starting a separate process, you could implement this particular awk command in pure Python.
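The last two suggestions could look roughly like the sketch below. It is written for Python 3, and the names AWK_PROGRAM, run_awk and convert_in_python are made up for illustration. The pure-Python variant assumes whitespace-separated input whose second column is an integer, and it skips lines with fewer than two fields, which awk itself would not do.

```python
import subprocess

# The awk program exactly as awk should see it; the raw string keeps \t and \n
# as backslash sequences for awk's printf to interpret.
AWK_PROGRAM = r'{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}'


def run_awk(src, dst):
    """Run awk without a shell; stdout=out takes the place of '> dst'."""
    # Hand-built argument list: one list item per argument, no shell quoting.
    argv = ["awk", "-v", r"OFS=\t", AWK_PROGRAM, src]
    with open(dst, "w") as out:
        subprocess.check_call(argv, stdout=out)


def convert_in_python(src, dst):
    """Pure-Python stand-in for the awk program (no child process)."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fields = line.split()  # like awk's default whitespace splitting
            if len(fields) < 2:
                continue  # assumption: skip blank or short lines
            chrom, pos = fields[0], fields[1]
            # Assumption: the second column is an integer position.
            fout.write("chr%s\t%d\t%s\n" % (chrom, int(pos) - 1, pos))
```

Either function would be called as run_awk("file1.txt", "file2.txt") or convert_in_python("file1.txt", "file2.txt"). The first still needs awk on the PATH; the second does not.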