Merging a large number of files into one

Question:

I have around 30,000 files that I want to merge into one. I used cat, but I am getting this error:

cat *.n3 > merged.n3

-bash: /usr/bin/xargs: Argument list too long

How can I raise the argument-list limit for the cat command? Alternatively, is there an iterative method to merge a large number of files?

Answer 1:

Here's a safe way to do it, without the need for find:

 printf '%s\0' *.n3 | xargs -0 cat > merged.txt

(I've also chosen merged.txt as the output file, as @MichaelDautermann soundly advises; rename to merged.n3 afterward).

Note: The reason this works is:

  • printf is a bash shell builtin, whose command line is not subject to the length limitation of command lines passed to external executables.
  • xargs is smart about partitioning the input arguments (passed via a pipe, and thus also not subject to the command-line length limit) into multiple invocations so as to avoid the limit; in other words, xargs makes as few cat calls as possible without running into it (see the demonstration after this list).
  • Using \0 as the delimiter paired with xargs' -0 option ensures that all filenames - even those with, e.g., embedded spaces or even newlines - are passed through as-is.
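
To see the batching in action, here is a small sketch (for illustration only) that replaces cat with a shell one-liner reporting how many filenames each xargs-spawned invocation received:

 # Replace cat with a command that merely counts its arguments; with
 # enough files, xargs prints one "batch" line per invocation.
 printf '%s\0' *.n3 | xargs -0 sh -c 'echo "batch of $# files"' sh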


Answer 2:

The traditional way, with one fix: write the output to a name the *.n3 glob cannot match (such as merged.txt), otherwise the loop will eventually try to append merged.n3 to itself:

> merged.txt
for file in *.n3
do
  cat "$file" >> merged.txt
done
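
A minor variant (a sketch; same behavior in any POSIX shell) redirects once for the entire loop instead of reopening the output file on every iteration:

# Redirecting the whole loop opens merged.txt exactly once; the .txt
# extension keeps the output file out of the *.n3 glob.
for file in *.n3
do
  cat "$file"
done > merged.txt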


Answer 3:

Try using "find":

 find . -name \*.n3 -exec cat {} > merged.txt \;

This "finds" all the files with the "n3" extension in your directory and then passes each result to the "cat" command.

And I set the output file name to "merged.txt", which you can rename to "merged.n3" after you're done appending, since you likely do not want find to match your new "merged.n3" file and append it into itself.
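
If your find supports them, a faster sketch batches many filenames into each cat invocation (the POSIX "+" terminator) and restricts the search to the current directory to match the original glob (-maxdepth is a GNU/BSD extension, assumed available here):

 # '+' packs as many filenames per cat call as fit, much like xargs;
 # -maxdepth 1 avoids descending into subdirectories.
 find . -maxdepth 1 -name '*.n3' -exec cat {} + > merged.txt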