Average of multiple files without considering missing values

Posted 2019-03-02 16:19

I want to calculate the average of 15 files: ifile1.txt, ifile2.txt, ..., ifile15.txt. All files have the same number of rows and columns, but some of them contain missing values, marked with ?. Part of the data looks like this:

 ifile1.txt      ifile2.txt       ifile3.txt
 3  ?  ?  ? .    1  2  1  3 .    4  ?  ?  ? .
 1  ?  ?  ? .    1  ?  ?  ? .    5  ?  ?  ? .
 4  6  5  2 .    2  5  5  1 .    3  4  3  1 .
 5  5  7  1 .    0  0  1  1 .    4  3  4  0 .
 .  .  .  . .    .  .  .  . .    .  .  .  . .  

I would like to produce a new file that shows the average of these 15 files, computed without considering the missing values.

 ofile.txt
 2.66   2     1    3      . (i.e. average of 3 1 4, average of ? 2 ? and so on)
 2.33   ?     ?    ?      .
 3      5     4.33 1.33   .
 3      2.67  4    0.66   .
 .      .     .    .      .

This question is similar to my earlier question, "Average of multiple files in shell", where the script was:

awk 'FNR == 1 { nfiles++; ncols = NF }
      { for (i = 1; i < NF; i++) sum[FNR,i] += $i
       if (FNR > maxnr) maxnr = FNR
      }
      END {
      for (line = 1; line <= maxnr; line++)
      {
         for (col = 1; col < ncols; col++)
              printf "  %f", sum[line,col]/nfiles;
         printf "\n"
      }
    }' ifile*.txt

But I am not able to modify it to handle the missing values.

3 answers
ら.Afraid
Reply #2 · 2019-03-02 16:37
awk 'FNR == 1 { nfiles++; ncols = NF }
  { for (i = 1; i <= NF; i++)   # <= so that the last column is included
        # sum and count only the fields that are not missing
        if ( $i != "?" ) { sum[FNR,i] += $i ; count[FNR,i]++ ;}
   if (FNR > maxnr) maxnr = FNR
  }
  END {
  for (line = 1; line <= maxnr; line++)
  {
     for (col = 1; col <= ncols; col++)
          # divide by the number of non-missing values for this cell
          if ( count[line,col] > 0 ) printf "  %f", sum[line,col]/count[line,col];
          else printf " ? " ;
     printf "\n" ;
  }
}' ifile*.txt

I just check for the '?' and skip it, then divide each sum by the number of values that were actually present.
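To try the counting approach on the sample rows from the question, here is a minimal sketch. It assumes only the four visible columns and three files; the temporary directory, the `result` variable and the two-decimal output format are choices made for this illustration, not part of the original answer.

```shell
#!/bin/sh
# Build three small input files from the sample rows in the question
# (only the four visible columns; the directory is a throwaway one).
dir=$(mktemp -d)
printf '%s\n' '3 ? ? ?' '1 ? ? ?' '4 6 5 2' '5 5 7 1' > "$dir/ifile1.txt"
printf '%s\n' '1 2 1 3' '1 ? ? ?' '2 5 5 1' '0 0 1 1' > "$dir/ifile2.txt"
printf '%s\n' '4 ? ? ?' '5 ? ? ?' '3 4 3 1' '4 3 4 0' > "$dir/ifile3.txt"

# Sum and count only the fields that are not "?", then divide per cell.
result=$(awk '
  {
    for (i = 1; i <= NF; i++)
      if ($i != "?") { sum[FNR, i] += $i; count[FNR, i]++ }
    if (FNR > maxnr) maxnr = FNR     # number of rows
    if (NF > ncols) ncols = NF       # number of columns
  }
  END {
    for (line = 1; line <= maxnr; line++) {
      for (col = 1; col <= ncols; col++) {
        sep = (col < ncols) ? " " : "\n"
        if (count[line, col] > 0)
          printf "%.2f%s", sum[line, col] / count[line, col], sep
        else
          printf "?%s", sep          # every file had "?" in this cell
      }
    }
  }' "$dir"/ifile*.txt)

printf '%s\n' "$result"
rm -rf "$dir"
```

On this sample the first output row is `2.67 2.00 1.00 3.00` and the second is `2.33 ? ? ?`: a cell that is missing in every file stays `?`, and a value of 0 still counts as data.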

Summer. ? 凉城
Reply #3 · 2019-03-02 16:38
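awk '
   {
   for (i = 1;i <= NF;i++) {
      # "?" counts as 0 in the sum and is left out of the count
      Sum[FNR,i]+=$i
      Count[FNR,i]+=($i!="?")
      }
   }
END {
   # FNR and NF still hold the values of the last line read
   for( i = 1; i <= FNR; i++){
      for( j = 1; j <= NF; j++) printf "%s ", Count[i,j] != 0 ? Sum[i,j]/Count[i,j] : "?"
      print ""
      }
   }
' ifile*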

This assumes the files are fed in well-formed (no trailing empty line, and so on).
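The well-formedness assumption matters because the END block relies on the FNR and NF of the last line read, so a trailing empty line in the last file would change the row and column counts. A possible variant that tolerates such lines is sketched below; the names `rows` and `cols`, the two tiny sample files and the `result` variable are made up for this illustration.

```shell
#!/bin/sh
# Two tiny sample files; the second one ends with an empty line.
dir=$(mktemp -d)
printf '3 ?\n1 4\n'   > "$dir/ifile1.txt"
printf '1 2\n? 6\n\n' > "$dir/ifile2.txt"

result=$(awk '
  NF == 0 { next }                   # ignore empty lines entirely
  {
    for (i = 1; i <= NF; i++)
      if ($i != "?") { Sum[FNR, i] += $i; Count[FNR, i]++ }
    if (FNR > rows) rows = FNR       # last non-empty line number seen
    if (NF > cols)  cols = NF        # widest non-empty line seen
  }
  END {
    for (i = 1; i <= rows; i++)
      for (j = 1; j <= cols; j++)
        printf "%s%s", (Count[i, j] ? Sum[i, j] / Count[i, j] : "?"), (j < cols ? " " : "\n")
  }' "$dir"/ifile*.txt)

printf '%s\n' "$result"
rm -rf "$dir"
```

Tracking the row and column counts explicitly costs two extra lines but removes the dependence on whichever line happened to be read last. It only helps with empty lines at the end of a file; an empty line in the middle of one file would still misalign its rows with the other files.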

看我几分像从前
Reply #4 · 2019-03-02 16:43

Use this:

paste ifile*.txt | awk '{n=f=0; for(i=1;i<=NF;i++){if($i!="?"){f++;n+=$i}}; print (f ? n/f : "?")}'
  • paste puts the corresponding lines of all files side by side
  • awk calculates one average per line, over every field of every file on that line (not one average per column):
    • n=f=0; sets the sum and the count to 0.
    • for(i=1;i<=NF;i++) loops through all the fields.
    • if($i!="?") tests that the field is not missing; fields whose value is 0 are kept.
    • f++;n+=$i increments f (the number of non-missing fields) and adds the field to the sum n.
    • print (f ? n/f : "?") prints n/f, or ? if every field on the line is missing.
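The paste one-liner above prints a single average per line. If per-column averages are wanted instead, one possible adaptation is to map each pasted field back to its column with a modulo, as sketched here. It assumes every line of every file has exactly `ncols` fields; `ncols` is a variable introduced for this sketch and passed in with `-v`, and the sample files are made up.

```shell
#!/bin/sh
# Three tiny two-column sample files, including a 0 and some "?".
dir=$(mktemp -d)
printf '%s\n' '3 ?' '0 4' > "$dir/ifile1.txt"
printf '%s\n' '1 2' '? 6' > "$dir/ifile2.txt"
printf '%s\n' '5 ?' '3 5' > "$dir/ifile3.txt"

result=$(paste "$dir"/ifile*.txt | awk -v ncols=2 '
  {
    # reset the per-column sums and counts for this line
    for (c = 1; c <= ncols; c++) sum[c] = cnt[c] = 0
    for (i = 1; i <= NF; i++) {
      c = (i - 1) % ncols + 1        # column this pasted field belongs to
      if ($i != "?") { sum[c] += $i; cnt[c]++ }
    }
    for (c = 1; c <= ncols; c++)
      printf "%s%s", (cnt[c] ? sum[c] / cnt[c] : "?"), (c < ncols ? " " : "\n")
  }')

printf '%s\n' "$result"
rm -rf "$dir"
```

Because each pasted line is handled on its own, nothing has to be kept in memory across lines, which may matter for long files. The price is the fixed-width assumption: a line with a different number of fields would shift every later field into the wrong column.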