How do I iterate through each line of a text file with Bash?
With this script:
echo "Start!"
for p in (peptides.txt)
do
echo "${p}"
done
I get this output on the screen:
Start!
./runPep.sh: line 3: syntax error near unexpected token `('
./runPep.sh: line 3: `for p in (peptides.txt)'
(Later I want to do something more complicated with $p
than just output to the screen.)
The environment variable SHELL is (from env):
SHELL=/bin/bash
/bin/bash --version
output:
GNU bash, version 3.1.17(1)-release (x86_64-suse-linux-gnu)
Copyright (C) 2005 Free Software Foundation, Inc.
cat /proc/version
output:
Linux version 2.6.18.2-34-default (geeko@buildhost) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #1 SMP Mon Nov 27 11:46:27 UTC 2006
The file peptides.txt contains:
RKEKNVQ
IPKKLLQK
QYFHQLEKMNVK
IPKKLLQK
GDLSTALEVAIDCYEK
QYFHQLEKMNVKIPENIYR
RKEKNVQ
VLAKHGKLQDAIN
ILGFMK
LEDVALQILL
A few more things not covered by other answers:
Reading from a delimited file
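A minimal sketch of reading a delimited file, assuming a hypothetical colon-separated sample with three fields per line (the file contents and the variable names name, age and city are made up for illustration). Setting IFS only for the read command splits each line on the delimiter without changing IFS for the rest of the script.

```bash
#!/usr/bin/env bash
# Hypothetical colon-delimited sample data, created only for this demo.
datafile=$(mktemp)
printf '%s\n' 'alice:30:Berlin' 'bob:25:New York' > "$datafile"

summary=""
# IFS=: applies only to this read, so each line is split on colons.
# The last variable (city) receives the rest of the line, spaces included.
while IFS=: read -r name age city; do
    summary+="${name} is ${age} and lives in ${city};"
done < "$datafile"

printf '%s\n' "$summary"
rm -f "$datafile"
```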
Reading from the output of another command, using process substitution
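As a sketch of this, assuming printf as a stand-in for whatever command produces the lines: the loop reads from a process substitution, and the counters it updates are still set after the loop ends.

```bash
#!/usr/bin/env bash
count=0
longest=""
# printf stands in for any command that writes lines to stdout.
while IFS= read -r line; do
    count=$((count + 1))
    if [ "${#line}" -gt "${#longest}" ]; then longest=$line; fi
done < <(printf '%s\n' RKEKNVQ IPKKLLQK ILGFMK)

# Both variables keep the values assigned inside the loop.
echo "read $count lines; longest is $longest"
```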
This approach is better than
command ... | while read -r line; do ...
because the while loop here runs in the current shell rather than in a subshell, as it does in the latter case. See the related post "A variable modified inside a while loop is not remembered".
Reading from a null delimited input, for example
find ... -print0
Related read: BashFAQ/020 - How can I find and safely handle file names containing newlines, spaces or both?
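A hedged sketch of the null-delimited case, assuming a scratch directory with deliberately awkward file names (all created just for the demo). read -d '' stops at NUL bytes, so names containing spaces or newlines arrive intact.

```bash
#!/usr/bin/env bash
# Scratch directory with awkward names, created only for this demo.
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with space.txt" "$dir/"$'new\nline.txt'

found=0
# -d '' makes read stop at NUL bytes instead of newlines.
while IFS= read -r -d '' path; do
    found=$((found + 1))
    printf 'found: %q\n' "$path"
done < <(find "$dir" -type f -print0)

echo "total: $found"
rm -rf "$dir"
```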
Reading from more than one file at a time
Based on @chepner's answer here:
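A minimal sketch of the idea (not necessarily the referenced answer's exact code), assuming two hypothetical files of equal length: each file is opened on its own file descriptor, and every iteration reads one line from each.

```bash
#!/usr/bin/env bash
# Two hypothetical input files of equal length, created only for this demo.
names=$(mktemp); scores=$(mktemp)
printf '%s\n' ann bob cy > "$names"
printf '%s\n' 90 85 70 > "$scores"

paired=""
# Descriptors 3 and 4 each point at one file; the loop stops when either runs out.
while IFS= read -r -u 3 name && IFS= read -r -u 4 score; do
    paired+="${name}=${score} "
done 3< "$names" 4< "$scores"

echo "$paired"
rm -f "$names" "$scores"
```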
-u is a bash extension. For POSIX compatibility, each call would look something like read -r X <&3.
Reading a whole file into an array (Bash versions earlier than 4)
If the file ends with an incomplete line (newline missing at the end), then:
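One way to sketch this for Bash versions before 4, assuming a demo file whose last line deliberately lacks a newline: the loop appends each line to an array, and the extra test keeps the unterminated final line.

```bash
#!/usr/bin/env bash
# Demo file whose last line has no trailing newline.
file=$(mktemp)
printf 'first\nsecond\nlast without newline' > "$file"

lines=()
# read returns non-zero on the unterminated last line but still fills $line,
# so the extra [[ -n $line ]] test keeps that line.
while IFS= read -r line || [[ -n $line ]]; do
    lines+=("$line")
done < "$file"

echo "${#lines[@]} lines; last: ${lines[2]}"
rm -f "$file"
```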
Reading a whole file into an array (Bash versions 4.x and later)
or
And then
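A sketch of the Bash 4 route, assuming a scratch file as a stand-in for the real input: mapfile (readarray is a synonym) loads the lines into an array, and a loop over the indices then processes them.

```bash
#!/usr/bin/env bash
# Requires Bash 4 or later for mapfile/readarray.
file=$(mktemp)
printf '%s\n' RKEKNVQ IPKKLLQK ILGFMK > "$file"

# -t strips the trailing newline from each element.
mapfile -t peptides < "$file"

# Iterate over the array indices and print each element with its position.
for i in "${!peptides[@]}"; do
    echo "$i: ${peptides[i]}"
done
rm -f "$file"
```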
More about the shell builtins read and readarray commands - GNU
More about IFS - Wikipedia
Related posts:
Use a while loop, like this:
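A minimal sketch of such a loop, assuming a scratch file that stands in for peptides.txt and holds the first three lines from the question:

```bash
#!/usr/bin/env bash
# Stand-in for peptides.txt with the first three lines of the file in the question.
file=$(mktemp)
printf '%s\n' RKEKNVQ IPKKLLQK QYFHQLEKMNVK > "$file"

n=0
# IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
while IFS= read -r p; do
    n=$((n + 1))
    echo "$n: $p"
done < "$file"
rm -f "$file"
```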
Notes:
If you don't set IFS properly, you will lose indentation.
You should almost always use the -r option with read.
Don't read lines with for.
Suppose you have this file:
There are four elements that will alter the meaning of the file output read by many Bash solutions:
If you want the text file line by line, including blank lines and a final line without a trailing newline, you must use a while loop and you must have an alternate test for the final line.
Here are the methods that may change the file (in comparison to what cat returns):
1) Lose the last line and leading and trailing spaces:
(If you do while IFS= read -r p; do printf "%s\n" "'$p'"; done </tmp/test.txt instead, you preserve the leading and trailing spaces but still lose the last line if it is not terminated with a newline.)
2) Using command substitution with cat reads the entire file in one gulp and loses the meaning of individual lines:
(If you remove the " from $(cat /tmp/test.txt), you read the file word by word rather than in one gulp. Also probably not what is intended...)
The most robust and simplest way to read a file line-by-line and preserve all spacing is:
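A sketch of that loop, assuming a made-up test file with leading spaces, a blank line, trailing spaces, a backslash, and no newline at the end. Every line should come through unchanged.

```bash
#!/usr/bin/env bash
# Made-up test file covering the awkward cases; the last line is unterminated.
file=$(mktemp)
printf '  indented\n\ntrailing  \nback\\slash\nno newline at end' > "$file"

out=()
# IFS= preserves spacing, -r preserves backslashes,
# and || [[ -n $line ]] catches a final line without a newline.
while IFS= read -r line || [[ -n $line ]]; do
    out+=("$line")
    printf "'%s'\n" "$line"
done < "$file"
rm -f "$file"
```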
If you want to strip leading and trailing spaces, remove the IFS= part:
(A text file without a terminating \n, while fairly common, is considered broken under POSIX. If you can count on the trailing \n, you do not need || [[ -n $line ]] in the while loop.)
More at the BASH FAQ
This is no better than other answers, but is one more way to get the job done in a file without spaces (see comments). I find that I often need one-liners to dig through lists in text files without the extra step of using separate script files.
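A sketch of the kind of one-liner meant here, assuming a scratch file with one word per line. The unquoted command substitution is split on whitespace, so each word becomes one loop iteration (words containing glob characters such as * would also be expanded, which is one reason this only suits simple lists).

```bash
#!/usr/bin/env bash
# Stand-in for a one-word-per-line list file.
file=$(mktemp)
printf '%s\n' RKEKNVQ IPKKLLQK ILGFMK > "$file"

words=0
# The whole loop fits on a single command line.
for word in $(cat "$file"); do echo "$word"; words=$((words + 1)); done
rm -f "$file"
```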
This format allows me to put it all in one command-line. Change the "echo $word" portion to whatever you want and you can issue multiple commands separated by semicolons. The following example uses the file's contents as arguments into two other scripts you may have written.
Or if you intend to use this like a stream editor (learn sed) you can dump the output to another file as follows.
I've used these as written above because my text files were created with one word per line. (See comments) If you have spaces that you don't want splitting your words/lines, it gets a little uglier, but the same command still works as follows:
This just tells the shell to split on newlines only, not spaces, then returns the environment back to what it was previously. At this point, you may want to consider putting it all into a shell script rather than squeezing it all into a single line, though.
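The IFS switch described above can be sketched as follows, assuming a scratch file whose lines contain spaces. OLDIFS is just an illustrative variable name. Note that blank lines are still dropped, because newline counts as IFS whitespace and runs of it collapse.

```bash
#!/usr/bin/env bash
# Demo file with spaces inside each line.
file=$(mktemp)
printf '%s\n' 'two words' 'three more words' > "$file"

count=0
# Save IFS, split on newlines only for the loop, then restore it.
OLDIFS=$IFS; IFS=$'\n'; for line in $(cat "$file"); do echo "$line"; count=$((count + 1)); done; IFS=$OLDIFS
rm -f "$file"
```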
Best of luck!
One way to do it is:
As pointed out in the comments, this has the side effects of trimming leading whitespace, interpreting backslash sequences, and skipping the trailing line if it's missing a terminating linefeed. If these are concerns, you can do:
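To make the difference concrete, here is a sketch that runs both forms of the loop on the same made-up file (leading spaces, a backslash, and no final linefeed) and collects what each one sees:

```bash
#!/usr/bin/env bash
# Made-up file: first line is "  a\b", second line "last" has no linefeed.
file=$(mktemp)
printf '  a\\b\nlast' > "$file"

# Plain read: trims leading whitespace, eats the backslash, skips the last line.
plain=()
while read p; do plain+=("$p"); done < "$file"

# IFS= and -r keep the line intact; the -n test rescues the unterminated line.
robust=()
while IFS= read -r p || [ -n "$p" ]; do robust+=("$p"); done < "$file"

printf 'plain:  %d line(s), first is [%s]\n' "${#plain[@]}" "${plain[0]}"
printf 'robust: %d line(s), first is [%s]\n' "${#robust[@]}" "${robust[0]}"
rm -f "$file"
```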
Exceptionally, if the loop body may read from standard input, you can open the file using a different file descriptor:
Here, 10 is just an arbitrary number (different from 0, 1, 2).
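The file-descriptor variant can be sketched like this, assuming a scratch file for the loop and a here-string standing in for whatever arrives on standard input. The loop reads the file through descriptor 10, so the inner read is free to consume standard input.

```bash
#!/usr/bin/env bash
# Scratch file that the loop iterates over.
file=$(mktemp)
printf '%s\n' one two three > "$file"

seen=""
while IFS= read -r -u 10 p; do
    # This inner read takes its input from standard input (descriptor 0),
    # not from the file, so it does not consume lines meant for the loop.
    IFS= read -r reply
    seen+="${p}:${reply} "
done 10< "$file" <<< $'x\ny\nz'

echo "$seen"
rm -f "$file"
```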
If you don't want your read to be broken by the newline character, use:
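A minimal sketch of such a script, assuming the hypothetical name readlines.sh and a scratch copy of two lines from peptides.txt. The demo writes the script to a temporary directory and then calls it with the file name as its first argument.

```bash
#!/usr/bin/env bash
# Write a small reader script (hypothetical name readlines.sh) to a scratch directory.
dir=$(mktemp -d)
cat > "$dir/readlines.sh" <<'EOF'
#!/usr/bin/env bash
# Print every line of the file named by the first argument,
# including a final line that has no trailing newline.
while IFS= read -r line || [[ -n "$line" ]]; do
    printf '%s\n' "$line"
done < "$1"
EOF

printf 'RKEKNVQ\nIPKKLLQK' > "$dir/peptides.txt"   # no trailing newline on purpose
output=$(bash "$dir/readlines.sh" "$dir/peptides.txt")
echo "$output"
rm -rf "$dir"
```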
Then run the script with the file name as a parameter.