Doing parallel processing in bash?

Posted 2020-02-07 19:20

I have thousands of PNG files that I'd like to make smaller with pngcrush. I have a simple find .. -exec job, but it runs sequentially. My machine has plenty of resources, so I'd like to run this in parallel.

The operation to be performed on every png is:

pngcrush input output && mv output input

Ideally, I could specify the maximum number of parallel operations.

Is there a way to do this with bash and/or other shell helpers? I'm on Ubuntu or Debian.

3 answers
Summer. ? 凉城
#2 · 2020-02-07 19:58

You can use custom find/xargs solutions (see Bart Sas' answer), but when things become more complex you have at least two powerful options:

  1. parallel (from package moreutils)
  2. GNU parallel
做自己的国王
#3 · 2020-02-07 20:03

With GNU Parallel http://www.gnu.org/software/parallel/ it can be done like:

find /path -type f -name '*.png' -print0 | parallel -0 pngcrush {} {.}.temp '&&' mv {.}.temp {}
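
The question also asks for a cap on the number of simultaneous jobs; GNU Parallel takes that as -j. What follows is only a sketch, assuming GNU Parallel (not the moreutils tool of the same name) is the parallel on your PATH. The function name crush_parallel, the default of 4 jobs, and the CRUSH override variable are made up for illustration.

```shell
# Sketch: crush every PNG under directory $1, at most $2 jobs at a time.
# CRUSH is a hypothetical override so another tool can stand in for pngcrush.
crush_parallel() {
    local dir="$1"
    local jobs="${2:-4}"   # assumed default, pick what suits your machine
    # {} is the input file, {.} is the same name with its extension removed.
    find "$dir" -type f -name '*.png' -print0 |
        parallel -0 -j "$jobs" "${CRUSH:-pngcrush}" {} {.}.temp '&&' mv {.}.temp {}
}
```

For example, crush_parallel /path 8 would keep at most eight pngcrush processes running.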

狗以群分
#4 · 2020-02-07 20:04

You can use xargs to run multiple processes in parallel:

find /path -type f -name '*.png' -print0 | xargs -0 -n 1 -P <nr_procs> sh -c 'pngcrush "$1" "temp.$$" && mv "temp.$$" "$1"' sh

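xargs reads the list of files produced by find, separated by NUL characters (-0), and runs the given command (sh -c '...' sh) with one file at a time (-n 1), keeping up to <nr_procs> processes running in parallel (-P <nr_procs>). The trailing sh becomes $0 of the inline script and the file name becomes $1. Inside the script, $$ expands to the process ID of that particular sh, so each job writes to its own temporary file.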
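
The same pipeline can be wrapped in a small function so that the directory and the job limit become arguments. This is a minimal sketch, not a definitive implementation: crush_all and the CRUSH override variable are hypothetical names, -r is a GNU xargs extension, and nproc (from coreutils) is assumed to be available, as it is on Ubuntu and Debian.

```shell
# Sketch: crush every PNG under directory $1 with at most $2 parallel jobs
# (defaults to the number of CPUs reported by nproc).
# CRUSH is a hypothetical override so another tool can stand in for pngcrush.
crush_all() {
    local dir="$1"
    local jobs="${2:-$(nproc)}"
    # The temporary file sits next to the original, so mv stays on one
    # filesystem; $$ is the PID of each sh, which keeps the names unique.
    # -r skips the command entirely when find produces no files.
    find "$dir" -type f -name '*.png' -print0 |
        CRUSH="${CRUSH:-pngcrush}" xargs -0 -r -n 1 -P "$jobs" \
            sh -c '"$CRUSH" "$1" "$1.tmp.$$" && mv "$1.tmp.$$" "$1"' sh
}
```

For example, crush_all /path 8 would process the tree with up to eight jobs. Quoting "$1" throughout keeps file names with spaces intact.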
