I'm trying to use xargs in a shell script to run parallel instances of a function I've defined in the same script. The function times the fetching of a page, and so it's important that the pages are actually fetched concurrently in parallel processes, and not in background processes (if my understanding of this is wrong and there's negligible difference between the two, just let me know).
The function is:
function time_a_url ()
{
    oneurltime=$($time_command -p wget -p $1 -O /dev/null 2>&1 1>/dev/null | grep real | cut -d" " -f2)
    echo "Fetching $1 took $oneurltime seconds."
}
How does one do this with an xargs pipe, in a form that takes the number of parallel time_a_url runs as an argument? And yes, I know about GNU parallel; I just don't have the privileges to install software where I'm writing this.
Here's a demo of how you might be able to get your function to work:
$ f() { echo "[$@]"; }
$ export -f f
$ echo -e "b 1\nc 2\nd 3 4" | xargs -P 0 -n 1 -I{} bash -c f\ \{\}
[b 1]
[d 3 4]
[c 2]
The keys to making this work are to export the function so the bash that xargs spawns will see it and to escape the space between the function name and the escaped braces. You should be able to adapt this to work in your situation. You'll need to adjust the arguments for -P and -n (or remove them) to suit your needs.
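As a rough sketch of how that might be adapted to take the degree of parallelism as an argument: the wrapper name run_parallel is hypothetical, and time_a_url here is only a stand-in that echoes its argument instead of fetching anything, so swap in the real function body. The URL is passed as a positional parameter rather than spliced into the command string, so no escaping of spaces is needed.

```shell
#!/usr/bin/env bash

# Stand-in for the real time_a_url (assumption: replace with the timing code).
time_a_url() {
    echo "Fetched $1"
}
export -f time_a_url   # lets the bash started by xargs see the function

# run_parallel JOBS URL...   (hypothetical wrapper, not part of xargs)
# Runs time_a_url once per URL, with at most JOBS processes at a time.
run_parallel() {
    local jobs=$1
    shift
    # One URL per input line; -I{} hands each line to bash as "$1".
    printf '%s\n' "$@" | xargs -P "$jobs" -I{} bash -c 'time_a_url "$1"' _ {}
}

run_parallel 3 http://example.com/a http://example.com/b http://example.com/c
```

The order of the output lines is not guaranteed, since the processes finish independently.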
You can probably get rid of the grep and cut. If you're using the Bash builtin time, you can specify an output format using the TIMEFORMAT variable. If you're using GNU /usr/bin/time, you can use the --format argument. Either of these will allow you to drop the -p also.
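A minimal sketch of the TIMEFORMAT approach, assuming the Bash builtin time is used directly rather than through a $time_command variable. The function name time_a_command is made up for illustration; it times any command, not just wget.

```shell
#!/usr/bin/env bash

# time_a_command CMD [ARGS...]   (hypothetical helper)
# Prints the elapsed wall-clock time of CMD without needing grep or cut.
time_a_command() {
    local TIMEFORMAT='%R'   # %R = elapsed real time in seconds
    local elapsed
    # The builtin time reports on the shell's stderr, so wrap it in a group
    # and redirect the group's stderr into the command substitution.
    elapsed=$( { time "$@" >/dev/null 2>&1; } 2>&1 )
    echo "Running $1 took $elapsed seconds."
}

time_a_command sleep 1
```

The command's own output is discarded inside the group, so only the timing figure is captured.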
You can replace this part of your wget command: 2>&1 1>/dev/null with -q. In any case, the order of those redirections matters: as written, stderr goes to the pipe and only stdout is discarded. To discard both streams, the order would be >/dev/null 2>&1.
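To see what the order of the two redirections changes, here is a small sketch. The function emit is a hypothetical helper that writes one line to each stream.

```shell
#!/usr/bin/env bash

# Hypothetical helper: one line on stdout, one on stderr.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# stderr is first pointed at the capture, then stdout is discarded.
a=$(emit 2>&1 >/dev/null)

# stdout is discarded first, then stderr follows it to /dev/null.
b=$(emit >/dev/null 2>&1)

echo "a=[$a] b=[$b]"   # prints: a=[to stderr] b=[]
```

Redirections are processed left to right, which is why the first form keeps stderr and the second keeps nothing.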
On Mac OS X, xargs rejects -P 0:
xargs: max. processes must be >0 (for: xargs -P [>0])
Use a positive value for -P instead, for example:
f() { echo "[$@]"; }
export -f f
echo -e "b 1\nc 2\nd 3 4" | sed 's/ /\\ /g' | xargs -P 10 -n 1 -I{} bash -c f\ \{\}
echo -e "b 1\nc 2\nd 3 4" | xargs -P 10 -I '{}' bash -c 'f "$@"' arg0 '{}'
If you install GNU Parallel on another system, you will see the functionality is in a single file (called parallel).
You should be able to simply copy that file to your own ~/bin.