bash: combine five lines of input to each line of output

Posted 2019-07-03 21:57

I have an input file as follows:

MB1 00134141 
MB1 12415085 
MB1 13253590
MB1 10598105
MB1 01141484
...
...
MB1 10598105

I want to combine every 5 lines into one line. I want my bash script to process the file and produce output as follows:

MB1 00134141 MB1 12415085 MB1 13253590 MB1 10598105 MB1 01141484
...
...
...                                                 

I have written the following script and it works, but it is slow for a file of 23051 lines. Can I write better code to make it faster?

#!/bin/bash
file=timing.csv
x=0
while [ $x -lt $(cat $file | wc -l) ]
do
   line=`head -n $x $file | tail -n 1`
   echo -n $line " "
   let "remainder = $x % 5"
   if [ "$remainder" -eq 0 ] 
   then
        echo ""
   fi
   let x=x+1
done
exit 0

I tried to execute the following command but it messes up some numbers.

cat timing_deleted.csv | pr -at5

Tags: bash shell unix
6 answers
该账号已被封号
#2 · 2019-07-03 22:32

Using sed, but note that this one will not join the last few lines if the total line count is not a multiple of 5:

 sed 'N;N;N;N;s/\n/ /g;' input_file

The N command reads the next line and appends it to the current line, preserving the newline. This script reads four additional lines for each line it reads, accumulating chunks of 5 lines in the buffer. For each such chunk, it replaces all of the newlines with a space.
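
To see the chunking in action, here is a small sketch on generated input. The sample data and the function name join5_sed are made up for illustration and are not part of the original file.

```shell
# Hypothetical helper: join every 5 lines of stdin into one line with sed.
join5_sed() {
    sed 'N;N;N;N;s/\n/ /g'
}

# Sample data: 10 short lines standing in for the real file.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | join5_sed
# 1 2 3 4 5
# 6 7 8 9 10
```

With exactly 10 input lines, both chunks are complete, so the caveat about leftover lines does not come into play here.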

够拽才男人
#3 · 2019-07-03 22:36

An awk script would do that. A sed substitution would too, I guess, but I don't know sed well, so here is the awk version.

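NF {
    # Append the current non-blank line; lines within a group are
    # separated by a single space.
    line = (i ? line " " : "") $0
    # After the fifth line, print the group and start a new one.
    if (++i == 5) {
        print line
        line = ""
        i = 0
    }
}

END {
    # Print leftover lines when the count is not a multiple of 5.
    if (i) print line
}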

Call that, say, merge.awk. Here is how you invoke it:

    awk -f merge.awk filetomerge.txt

or cat filetomerge.txt | awk -f merge.awk

Should be rather fast too.
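
For a quick check without a separate script file, the same grouping can be sketched as a one-liner. This is an alternative formulation based on NR modulo 5, not the script above: unlike the NF pattern, it does not skip blank lines. The name join5_awk is hypothetical.

```shell
# Hypothetical helper: join every 5 lines of stdin using awk.
# Each line is preceded by nothing (very first line), a newline
# (first line of a later group), or a space (any other line).
join5_awk() {
    awk '{ sep = (NR % 5 == 1) ? (NR == 1 ? "" : "\n") : " "
           printf "%s%s", sep, $0 }
         END { if (NR) print "" }'
}

# 7 lines: one full group plus a partial group of 2.
printf '%s\n' a b c d e f g | join5_awk
# a b c d e
# f g
```

Writing the separator before each line rather than after it avoids a trailing space on the last, possibly incomplete, group.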

Evening l夕情丶
#4 · 2019-07-03 22:40

Use the paste command:

 paste -d ' ' - - - - - < tmp.txt
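
Each - stands for standard input, so paste takes one line for each of the five columns in turn. A small sketch on made-up input (the function name join5_paste is hypothetical) shows what happens when the line count is not a multiple of 5:

```shell
# Hypothetical helper: join every 5 lines of stdin with paste.
join5_paste() {
    paste -d ' ' - - - - -
}

# 7 lines: one full group plus a partial group of 2.
# The partial group is padded with empty fields, so the last
# output line carries trailing delimiter spaces; strip them here.
printf '%s\n' a b c d e f g | join5_paste | sed 's/ *$//'
# a b c d e
# f g
```

The trailing sed is only needed if the padding spaces on the final line matter for later processing.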

paste is far better, but I couldn't bring myself to delete my previous mapfile-based solution.

[UPDATE: mapfile reads too many lines prior to version 4.2.35 when used with -n]

#!/bin/bash
file=timing.csv
while true; do
    mapfile -t -n 5 arr
    (( ${#arr[@]} > 0 )) || break
    echo "${arr[*]}"
done < "$file"
exit 0

We can't do while mapfile ...; do because mapfile exits with status 0 even when it doesn't read any input.

放我归山
#5 · 2019-07-03 22:48

In pure bash, with no external processes (for speed):

while true; do
  out=()
  for (( i=0; i<5; i++ )); do
    read -r && out+=( "$REPLY" )
  done
  if (( ${#out[@]} > 0 )); then
    printf '%s ' "${out[@]}"
    echo
  fi
  if (( ${#out[@]} < 5 )); then break; fi
done <input-file >output-file

This correctly handles files where the number of lines is not a multiple of 5.

Juvenile、少年°
#6 · 2019-07-03 22:52

Using tr (note that this joins the whole file into a single line rather than into groups of five):

cat input_file | tr "\n" " "
▲ chillily
#7 · 2019-07-03 22:53

You can use xargs, if your input always contains the same number of whitespace-separated fields per line:

cat timing_deleted.csv | xargs -n 10

This takes the input from cat timing_deleted.csv, splits it on whitespace (spaces as well as the newline at the end of each line) into arguments, and passes at most 10 of them (-n 10) to each invocation of the default command, echo. Each line, such as MB1 00134141, contributes two arguments, so for 5 lines you need to use 10.

EDIT
As commented by Charles, you can skip the usage of cat and directly push the data into xargs with:

xargs -n 10 < timing_deleted.csv

I didn't notice any performance gains using a really large file, but it doesn't require multiple commands.
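
As a small sketch of the argument counting, here is the same command on six made-up lines in the question's format (the function name join5_xargs is hypothetical):

```shell
# Hypothetical helper: group stdin into 10 whitespace-separated
# arguments per output line (2 fields per line x 5 lines).
join5_xargs() {
    xargs -n 10
}

# printf reuses its format for each argument, producing 6 input lines.
printf 'MB1 %s\n' 00134141 12415085 13253590 10598105 01141484 10598105 | join5_xargs
# MB1 00134141 MB1 12415085 MB1 13253590 MB1 10598105 MB1 01141484
# MB1 10598105
```

Keep in mind that xargs also interprets quotes and backslashes in its input, so this approach is best suited to plain data like the sample above.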
