-->

How do I use parallel programming/multithreading?

Posted 2019-06-16 08:06

Question:

This is my script:

#!/bin/bash
#script to loop through directories to merge fastq files
sourcedir=/path/to/source
destdir=/path/to/dest

for f in "$sourcedir"/*
do
    fbase=$(basename "$f")
    echo "Inside $fbase"
    zcat "$f"/*R1*.fastq.gz | gzip > "$destdir/${fbase}_R1.fastq.gz"
    zcat "$f"/*R2*.fastq.gz | gzip > "$destdir/${fbase}_R2.fastq.gz"
done

The directory 'source' contains about 30 sub-directories. Each sub-directory holds several R1.fastq.gz and R2.fastq.gz files that I want to merge into a single R1.fastq.gz file and a single R2.fastq.gz file, saving the merged files to the destination directory. My code works fine, but I need to speed it up because of the amount of data. Is there any way to implement multithreading in my script? How can I run it so that multiple jobs run in parallel? I am new to bash scripting, so any help would be appreciated.

Answer 1:

The simplest way is to execute the commands in the background, by adding & to the end of the command:

#!/bin/bash
#script to loop through directories to merge fastq files
sourcedir=/path/to/source
destdir=/path/to/dest

for f in "$sourcedir"/*
do
    fbase=$(basename "$f")
    echo "Inside $fbase"
    zcat "$f"/*R1*.fastq.gz | gzip > "$destdir/${fbase}_R1.fastq.gz" &
    zcat "$f"/*R2*.fastq.gz | gzip > "$destdir/${fbase}_R2.fastq.gz" &
done

# Block until every background job has finished before the script exits
wait

From the bash manual:

If a command is terminated by the control operator ‘&’, the shell executes the command asynchronously in a subshell. This is known as executing the command in the background. The shell does not wait for the command to finish, and the return status is 0 (true). When job control is not active (see Job Control), the standard input for asynchronous commands, in the absence of any explicit redirections, is redirected from /dev/null.
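Because the shell does not wait and reports status 0 as soon as a job is launched, a failed merge can go unnoticed. Here is a minimal sketch of recording each job's PID and collecting its real exit status with `wait`. The two subshells are hypothetical stand-ins for the `zcat | gzip` pipelines, and the names `pids`, `launch_status` and `failed` are illustrative, not part of the original script.

```shell
#!/bin/bash
# Sketch: start jobs with &, record their PIDs, then collect each exit status.

pids=()

# Hypothetical stand-ins for the two zcat | gzip pipelines.
( sleep 1; exit 0 ) &
launch_status=$?        # 0: only says the job was started, not that it succeeded
pids+=("$!")            # $! is the PID of the most recent background job

( sleep 1; exit 3 ) &   # simulates a pipeline that fails
pids+=("$!")

failed=0
for pid in "${pids[@]}"; do
    wait "$pid"         # blocks until that job exits and returns its exit status
    status=$?
    if (( status != 0 )); then
        echo "job $pid exited with status $status" >&2
        failed=$((failed + 1))
    fi
done
echo "launch status: $launch_status, failed jobs: $failed"
```

For a real pipeline such as `zcat ... | gzip > ...`, the status that `wait` returns is that of the last command in the pipeline, unless `set -o pipefail` is enabled.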



Answer 2:

I am not sure, but you can try adding & to the end of each command, like this:

zcat "$f"/*R1*.fastq.gz | gzip > "$destdir/${fbase}_R1.fastq.gz" &
zcat "$f"/*R2*.fastq.gz | gzip > "$destdir/${fbase}_R2.fastq.gz" &
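With about 30 sub-directories, putting & on both commands starts roughly 60 pipelines at once, which may be more than the CPU and disk can usefully handle. One possible way to cap the number of concurrent jobs is sketched below, assuming bash 4.3 or newer for `wait -n`. The names `max_jobs`, `merge_dir` and `running`, and the small demo data created in a temporary directory, are illustrative assumptions and not part of the original script.

```shell
#!/bin/bash
# Sketch: run at most max_jobs merges at a time (needs bash 4.3+ for `wait -n`).

# Demo data in a temporary directory; replace with the real sourcedir/destdir.
workdir=$(mktemp -d)
sourcedir=$workdir/source
destdir=$workdir/dest
mkdir -p "$destdir"
for s in sampleA sampleB sampleC; do
    mkdir -p "$sourcedir/$s"
    for lane in L001 L002; do
        echo "$s $lane R1" | gzip > "$sourcedir/$s/${s}_${lane}_R1.fastq.gz"
        echo "$s $lane R2" | gzip > "$sourcedir/$s/${s}_${lane}_R2.fastq.gz"
    done
done

max_jobs=2   # assumption: set this to roughly the number of CPU cores

# Merge the R1 and R2 files of one sub-directory (hypothetical helper).
merge_dir() {
    local dir=$1 fbase
    fbase=$(basename "$dir")
    zcat "$dir"/*R1*.fastq.gz | gzip > "$destdir/${fbase}_R1.fastq.gz"
    zcat "$dir"/*R2*.fastq.gz | gzip > "$destdir/${fbase}_R2.fastq.gz"
}

running=0
for f in "$sourcedir"/*; do
    if (( running >= max_jobs )); then
        wait -n                      # block until any one background job exits
        running=$((running - 1))
    fi
    merge_dir "$f" &                 # the whole function runs in the background
    running=$((running + 1))
done
wait                                 # wait for the remaining jobs
echo "merged files: $(ls "$destdir" | wc -l)"
```

Here each background job handles one sub-directory, so the R1 and R2 merges for the same sample run one after the other while different samples run in parallel.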