I have an input file as follows:
MB1 00134141
MB1 12415085
MB1 13253590
MB1 10598105
MB1 01141484
...
...
MB1 10598105
I want to combine every 5 lines into one line. I want my bash script to produce output as follows:
MB1 00134141 MB1 12415085 MB1 13253590 MB1 10598105 MB1 01141484
...
...
...
I have written the following script and it works, but it is slow for a file of 23051 lines. Can I write better code to make it faster?
#!/bin/bash
file=timing.csv
x=1
while [ $x -le $(cat $file | wc -l) ]
do
line=`head -n $x $file | tail -n 1`
echo -n $line " "
let "remainder = $x % 5"
if [ "$remainder" -eq 0 ]
then
echo ""
fi
let x=x+1
done
exit 0
I tried to execute the following command but it messes up some numbers.
cat timing_deleted.csv | pr -at5
Using sed, but this one will not process the last few lines if they do not add up to a multiple of 5:
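A minimal sketch of such a sed command, assuming groups of exactly 5 lines. The function name merge5_sed is hypothetical and only wraps the command for the demo:

```shell
#!/bin/bash
# Hypothetical helper: join each complete group of 5 lines with spaces.
# N appends the next input line to the pattern space (done 4 times),
# then s/\n/ /g turns the 4 embedded newlines into spaces.
merge5_sed() {
    sed 'N;N;N;N;s/\n/ /g' "$@"
}

# Demo on 10 sample lines (two complete groups of 5).
printf '%s\n' a1 a2 a3 a4 a5 b1 b2 b3 b4 b5 | merge5_sed
```

On a real file this would be called as merge5_sed timing.csv. What happens to a trailing group of fewer than 5 lines depends on the sed implementation, so the sketch should not be relied on for that case.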
The N command reads the next line and appends it to the current line, preserving the newline. This script reads four additional lines for each line it reads, accumulating chunks of 5 lines in the buffer. For each such chunk, it replaces each newline with a space.
An awk script would do that. A sed replacement would too, I guess. I don't know sed well, so here you go.
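A minimal sketch of what such an awk script could look like, written to a file from the shell. The contents are an assumption, and the group size of 5 is hard-coded:

```shell
#!/bin/bash
# Write an assumed awk program to merge.awk in the current directory.
cat > merge.awk <<'EOF'
# Print each line, preceded by a space unless it starts a new group of 5.
{ printf "%s%s", (NR % 5 == 1 ? "" : " "), $0 }
# End the output line after every 5th input line.
NR % 5 == 0 { printf "\n" }
# End the output line for a final group of fewer than 5 lines.
END { if (NR % 5 != 0) printf "\n" }
EOF

# Demo on 6 sample lines: one full group and one partial group.
printf '%s\n' a b c d e f | awk -f merge.awk
```

Because the END rule closes a partial group, this sketch also handles files whose line count is not a multiple of 5.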
Call that, say, merge.awk. Here is how you invoke it:
awk -f merge.awk filetomerge.txt
or
cat filetomerge.txt | awk -f merge.awk
Should be rather fast too.
Use the paste command:
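A minimal sketch of the paste approach: each "-" operand reads one line from standard input, so five of them consume five lines per output line. The function name merge5_paste is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: join every 5 lines of stdin, separated by a space.
# -d ' ' replaces the default tab delimiter with a single space.
merge5_paste() {
    paste -d ' ' - - - - -
}

# Demo on 10 sample lines (two complete groups of 5).
printf '%s\n' a1 a2 a3 a4 a5 b1 b2 b3 b4 b5 | merge5_paste
```

For a file, redirect it into the function, for example merge5_paste < timing.csv. A final group of fewer than 5 lines is still printed, but with trailing delimiters where the missing lines would have been.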
paste is far better, but I couldn't bring myself to delete my previous mapfile-based solution.
[UPDATE: mapfile reads too many lines prior to version 4.2.35 when used with -n.]
We can't do while mapfile ...; do because mapfile exits with status 0 even when it doesn't read any input.
In pure bash, with no external processes (for speed):
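A minimal sketch of a loop that uses only bash builtins, assuming the input arrives on standard input. The function name merge5 and its variable names are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: join every 5 lines of stdin using only builtins.
merge5() {
    local line out="" count=0
    # "|| [[ -n $line ]]" also accepts a last line with no trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        if (( count == 0 )); then
            out=$line            # first line of a new group
        else
            out+=" $line"        # later lines are appended after a space
        fi
        if (( ++count == 5 )); then
            printf '%s\n' "$out" # group complete: emit it and start over
            count=0
        fi
    done
    # Emit a final group of fewer than 5 lines, if any.
    if (( count > 0 )); then
        printf '%s\n' "$out"
    fi
}

# Demo on 6 sample lines: one full group and one partial group.
printf '%s\n' a b c d e f | merge5
```

For a file, redirect it into the function, for example merge5 < timing.csv.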
This correctly handles files where the number of lines is not a multiple of 5.
Using tr:
You can use xargs, if your input always contains a consistent number of spaces per line:
cat timing_deleted.csv | xargs -n 10
This will take the input from cat timing_deleted.csv and combine it in groups of 10 (-n 10) whitespace-separated items. A space inside a line, as in MB1 00134141, separates items just as the newline at the end of each line does, so each line contributes two items. So, for 5 lines, you'll need to use 10.
EDIT
As commented by Charles, you can skip cat and push the data directly into xargs. I didn't notice any performance gain with a really large file, but it doesn't require multiple commands.
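A minimal sketch of the cat-free form, assuming plain input redirection is what was meant. The function name merge5_xargs and the temporary sample file are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: regroup stdin into lines of 10 whitespace-separated
# items (2 items per input line, so 5 input lines per output line).
# With no command given, xargs runs echo on each group.
merge5_xargs() {
    xargs -n 10
}

# Build a small sample file: 6 lines with two items each.
sample=$(mktemp)
printf '%s\n' 'MB1 00134141' 'MB1 12415085' 'MB1 13253590' \
              'MB1 10598105' 'MB1 01141484' 'MB1 10598105' > "$sample"

# Redirect the file into xargs instead of piping it through cat.
merge5_xargs < "$sample"

rm -f "$sample"
```

Note that xargs gives quotes and backslashes in its input a special meaning, so this only suits simple data like the sample above.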