Bash: replace part of filename

Posted 2020-04-15 06:53

I have a command I want to run on all of the files of a folder, and the command's syntax looks like this:

tophat -o <output_file> <input_file>

What I would like is a script that loops over all the files in an arbitrary folder and uses the input file names to create similar, but different, output file names. The file names look like this:

input name                desired output name
path/to/sample1.fastq     path/to/sample1.bam
path/to/sample2.fastq     path/to/sample2.bam

Getting the input to work seems simple enough:

for f in *.fastq
do
     tophat -o <output_file> $f
done

I tried using output=${f,.fastq,.bam} and using that as the output parameter, but that doesn't work. All I get is an error: line 3: ${f,.fastq,.bam}: bad substitution. Is this the way to do what I want, or should I do something else? If it's the correct way, what am I doing wrong?

[EDIT]:

Thanks for all the answers! A bonus question, though... What if I have files named like this, instead:

path/to/sample1_1.fastq
path/to/sample1_2.fastq
path/to/sample2_1.fastq
path/to/sample2_2.fastq
...

... where I can have an arbitrary number of samples (sampleX), but all of them have two files associated with them (_1 and _2). The command now looks like this:

tophat -o <output_file> <input_1> <input_2>

So, there's still just the one output, for which I could do something like "${f/_[1-2].fastq/.bam}", but I'm unsure how to get a loop that only iterates once over every sampleX at the same time as taking both the associated files... Ideas?

[EDIT #2]:

So, this is the final script that did the trick!

for f in *_1.fastq
do
        tophat -o "${f/_1.fastq/.bam}" "$f" "${f/_1.fastq/_2.fastq}"
done
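
A minimal sketch of how the three names in that loop relate, assuming every sample really does have both a _1 and a _2 file. The helper name pair_names is hypothetical and only prints the names; it does not run tophat.

```shell
# pair_names is a hypothetical helper, not part of the original script.
# Given a *_1.fastq path, print: <output .bam> <input_1> <input_2>
pair_names() {
    base=${1%_1.fastq}    # strip the trailing "_1.fastq" only
    printf '%s %s %s\n' "$base.bam" "$1" "${base}_2.fastq"
}

pair_names 'path/to/sample1_1.fastq'
```

Inside the loop, checking [ -e "${f%_1.fastq}_2.fastq" ] before calling tophat would catch a sample whose second file is missing.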

3 answers
ら.Afraid
#2 · 2020-04-15 07:03

You can use:

tophat -o "${f/.fastq/.bam}" "$f"

Testing:

f='path/to/sample1.fastq'
echo "${f/.fastq/.bam}"
path/to/sample1.bam
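
One caveat, sketched below: "${f/.fastq/.bam}" replaces the first match anywhere in the string, so a directory name containing ".fastq" would be rewritten instead of the extension. Stripping the suffix with "${f%.fastq}" only touches the end of the name. The helper name to_bam and the second sample path are made up for illustration.

```shell
# to_bam is a hypothetical helper: remove a trailing ".fastq", then append ".bam".
to_bam() {
    printf '%s\n' "${1%.fastq}.bam"
}

to_bam 'path/to/sample1.fastq'
# A path where ".fastq" also appears in a directory name (made-up example):
to_bam 'my.fastq.dir/sample2.fastq'
```

In the loop this would read: tophat -o "${f%.fastq}.bam" "$f"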
SAY GOODBYE
#3 · 2020-04-15 07:16

An alternative to anubhava's concise solution:

d=$(dirname path/to/sample1.fastq)
b=$(basename path/to/sample1.fastq .fastq)
echo "$d/$b.bam"
path/to/sample1.bam

tophat -o "$d/$b.bam" "$f"
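
The same dirname/basename idea can be wrapped in a small function so that it works on the loop variable rather than a hard-coded path. This is only a sketch; the function name bam_path is hypothetical.

```shell
# bam_path is a hypothetical helper: split a path into directory and base name
# (with the .fastq suffix removed), then reassemble it with a .bam extension.
bam_path() {
    d=$(dirname "$1")
    b=$(basename "$1" .fastq)
    printf '%s/%s.bam\n' "$d" "$b"
}

bam_path 'path/to/sample1.fastq'
```

In the loop this would read: tophat -o "$(bam_path "$f")" "$f". Note that dirname returns "." for a bare file name, so sample1.fastq becomes ./sample1.bam.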
We Are One
#4 · 2020-04-15 07:26

Not an answer but a suggestion: as a bioinformatician, you should use GNU make and its -j option (number of parallel jobs). The Makefile would be:

.PHONY:all
FASTQS=$(shell ls *.fastq)

%.bam: %.fastq
    tophat -o $@ $<

all:  $(FASTQS:.fastq=.bam)