Pipe multiple files (gz) into C program

Posted 2019-08-16 12:41

Question:

I've written a C program that works when I pipe data into my program using stdin like:

gunzip -c IN.gz|./a.out

If I want to run my program on a list of files I can do something like:

for i in `cat list.txt`
do
  gunzip -c $i |./a.out
done

But this will start my program 'number of files' times. I'm interested in piping all the files into the same process run.

Like doing

for i in `cat list.txt`
do
  gunzip -c $i >>tmp
done
cat tmp |./a.out

How can I do this?

Answer 1:

There is no need for a shell loop:

gzip -cd $(<list.txt) | ./a.out

With the '-cd' option, gzip will uncompress a list of files to standard output (or you can use 'gunzip -c'). The $(<file) notation, available in bash, ksh and zsh, expands the contents of the named file as a list of arguments without launching a sub-process. It is otherwise equivalent to $(cat list.txt). In both forms the result is split on whitespace, so file names containing spaces will not survive.
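As a rough illustration of the single-process approach, the sketch below builds two throwaway gzip files and a list naming them, then decompresses both with one gzip invocation. The file names and the temporary directory are made up for the example, and it uses the portable $(cat list.txt) form; in bash, $(<list.txt) should give the same result.

```shell
# Build two small gzip files and a list naming them (all names are made up).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'alpha\n' | gzip -c > a.gz
printf 'beta\n'  | gzip -c > b.gz
printf 'a.gz\nb.gz\n' > list.txt

# One gzip process decompresses every listed file, in list order.
# In real use this would be piped straight into ./a.out instead of captured.
combined=$(gzip -cd $(cat list.txt))
printf '%s\n' "$combined"
```

The decompressed contents arrive as one continuous stream, which is what a program reading stdin would see.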

However, if you feel you must use a loop, then simply pipe the output from the loop into a single instance of your program:

for i in `cat list.txt`
do
    gunzip -c $i
done |
./a.out

If the contents of the loop are more complex (than simply gunzipping a single file), this might be necessary. You can also use '{ ... }' I/O redirection:

{
cat /etc/passwd /etc/group
for i in `cat list.txt`
do
    gunzip -c $i
done
} |
./a.out

Or:

{
cat /etc/passwd /etc/group
for i in `cat list.txt`
do
    gunzip -c $i
done; } |
./a.out

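Note the semi-colon; it is necessary with braces when the closing brace is on the same line as the last command, because '}' is only recognized at the start of a command. In this example, it is essentially the same as using a formal sub-shell with parentheses: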

(
cat /etc/passwd /etc/group
for i in `cat list.txt`
do
    gunzip -c $i
done
) |
./a.out

Or:

( cat /etc/passwd /etc/group
  for i in `cat list.txt`
  do
      gunzip -c $i
  done) |
./a.out

Note the absence of a semi-colon here; it is not needed. The shell is wonderfully devious on occasion. The braces I/O redirection can be useful when you need to group commands after the pipe symbol:

some_command arg1 arg2 |
{
first sub-command
second command
for i in $some_list
do
    ...something with $i...
done
} >$outfile 2>$errfile
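A small runnable version of the same pattern may make it easier to see. This is only a sketch: the input data, the "header" and "note" strings, and the temporary files are invented for illustration.

```shell
outfile=$(mktemp)
errfile=$(mktemp)

# Everything inside the braces shares the pipe as stdin and the two
# redirections as stdout/stderr.
printf 'b\na\n' |
{
    echo "header"        # written to $outfile, does not read stdin
    sort                 # reads the piped data, sorted output goes to $outfile
    echo "note" >&2      # written to $errfile
} >"$outfile" 2>"$errfile"

cat "$outfile"
```

Only one redirection pair is needed for the whole group, rather than one per command.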


Answer 2:

You should be able to get one gunzip process to unzip multiple files.

zcat $(cat list.txt) | ./a.out

(zcat is another way of calling gunzip -c on many systems and shows the parallel with cat; but check for gzcat if your system's zcat is actually uncompress.)
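If it is unclear which of these commands a given system provides, one option is to probe for a working one. The following is a sketch under the assumption that trying each candidate on a tiny gzip sample is acceptable; the variable names and the probe string are arbitrary.

```shell
sample=$(mktemp)
printf 'probe\n' | gzip -c > "$sample"

# Pick the first command that can actually decode a gzip stream from stdin.
# "$candidate" is left unquoted on purpose so "gzip -cd" splits into two words.
decompress=""
for candidate in zcat gzcat "gzip -cd"
do
    if [ "$($candidate < "$sample" 2>/dev/null)" = "probe" ]; then
        decompress=$candidate
        break
    fi
done
rm -f "$sample"
echo "using: $decompress"
```

Since 'gzip -cd' is the last candidate, the loop should find something on any system where gzip itself is installed.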

Alternatively, you can use a sub-shell.

(
  for i in $(cat list.txt)
  do
    gunzip -c "$i"
  done
) | ./a.out


Answer 3:

This is more of a shell question than a C question, but AFAIK you can do:

gunzip -c file* | your_program

or

for i in file*; do gunzip -c "$i"; done | your_program


Answer 4:

xargs is your friend

% cat list.txt | xargs gunzip -c | ./a.out

If the file names in list.txt contain spaces, you need to jump through some extra hoops.
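One possible way around the problem with spaces is to skip xargs and read the list line by line, which keeps each line intact as a single file name. This is a sketch rather than the only approach: the file names are invented, and the consumer function is a stand-in for ./a.out that simply counts the lines it receives.

```shell
workdir=$(mktemp -d)
printf 'one\n' | gzip -c > "$workdir/first file.gz"
printf 'two\n' | gzip -c > "$workdir/second file.gz"
printf '%s\n' "$workdir/first file.gz" "$workdir/second file.gz" > "$workdir/list.txt"

# Stand-in for ./a.out: counts the lines it receives on stdin.
consumer() { wc -l | tr -d ' '; }

# IFS= and -r keep leading spaces and backslashes in each name untouched;
# quoting "$file" stops the shell from splitting the name on spaces.
line_count=$(
    while IFS= read -r file
    do
        gunzip -c "$file"
    done < "$workdir/list.txt" | consumer
)
echo "$line_count"
```

This still starts one gunzip per file, but the consumer runs only once. It assumes one file name per line, so names containing newlines are not handled.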



Answer 5:

If your program doesn't need to know when a particular input ends and another one begins, you can do this:

for i in `cat list.txt`
do
  gunzip -c $i
done |./a.out
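If the program does need to know where one input ends and the next begins, one option is to have the loop write a marker line before each file. The sketch below assumes the program can be taught to recognize such a line; the "@@FILE" prefix and the file names are arbitrary choices for illustration, and the output is captured in a variable here only so it can be inspected.

```shell
workdir=$(mktemp -d)
printf 'one\n' | gzip -c > "$workdir/a.gz"
printf 'two\n' | gzip -c > "$workdir/b.gz"
printf '%s\n' "$workdir/a.gz" "$workdir/b.gz" > "$workdir/list.txt"

# Emit a marker line before each file so the reader can detect boundaries.
# Pick a prefix that cannot occur in the real data.
# In real use, replace the capture with: done | ./a.out
stream=$(
    for i in $(cat "$workdir/list.txt")
    do
        printf '@@FILE %s\n' "$(basename "$i")"
        gunzip -c "$i"
    done
)
printf '%s\n' "$stream"
```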

I hope this helps.