6

I have a list with 8 columns, where the first 6 columns are identical in every row.

6   99999715    99999771    NM_001013399    0   -   23  0.0714286
6   99999715    99999771    NM_001013399    0   -   24  0.0178571
6   99999715    99999771    NM_001013399    0   -   25  0.1250000

I need to compute the average of column 7, the average of column 8, and the average of $7*$8, and produce output in the following format:

6   99999715    99999771    NM_001013399    0   -   ave($7) ave($8) ave($7*$8)

How can I do this? Thanks.


2 Answers

12

Untested, but it should just be:

{sum7+=$7; sum8+=$8; mul+=$7*$8} END {print sum7/NR,sum8/NR,mul/NR}

By popular demand, I'll add printf.

{sum7+=$7; sum8+=$8; mul+=$7*$8}
# substr() positions start at 1; 49 is the width of the shared prefix in the sample rows
END {printf "%s %4i %10.7f %10.7f\n", substr($0,1,49),sum7/NR,sum8/NR,mul/NR}
Answered 2012-05-05T17:12:47.790
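
For anyone who wants to try this end to end, here is a minimal sketch of the same running-sum idea wrapped in a complete shell invocation. It makes a few assumptions that are not in the answer above: the shared prefix is rebuilt from fields 1 to 6 of the first row instead of using a hard-coded `substr` width, the output is tab-separated, and the names `tmp`, `result` and `prefix` are purely illustrative.

```shell
# Sample rows from the question, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
6   99999715    99999771    NM_001013399    0   -   23  0.0714286
6   99999715    99999771    NM_001013399    0   -   24  0.0178571
6   99999715    99999771    NM_001013399    0   -   25  0.1250000
EOF

result=$(awk '
{
  # Remember the shared prefix (fields 1-6) from the first row, tab-joined.
  if (NR == 1) prefix = $1 "\t" $2 "\t" $3 "\t" $4 "\t" $5 "\t" $6
  # Running sums of column 7, column 8 and the per-row product.
  sum7 += $7; sum8 += $8; mul += $7 * $8
}
END {
  # Guard against empty input to avoid dividing by zero.
  if (NR > 0)
    printf "%s\t%.0f\t%.7f\t%.7f\n", prefix, sum7/NR, sum8/NR, mul/NR
}' "$tmp")
rm -f "$tmp"
printf '%s\n' "$result"
```

On the three sample rows the last three fields come out as 24, 0.0714286 and 1.7321427. Rebuilding the prefix from fields does not preserve the original column spacing; if exact spacing matters, a width-based approach is the alternative.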
7
awk '
{
  if( common == "" ) { 
    fn=1   # field number
    cn=1   # column number
    tmp=$0
    f=0 
    while( match( tmp, /  *|$/ ) && f<=NF ) 
    {  f+=1
       cnA[fn]=cn           # column number of start of field fn
       cnZ[fn]=cn+RSTART-1  # column number just past the end of field fn (start of the delimiter)
       ++fn
       cn+=RSTART+RLENGTH-1
       tmp=substr( tmp, RSTART+RLENGTH )
    }
    common = substr($0,1,cnA[7]-1)
    dlim78 = substr($0,cnZ[7], cnA[8]-cnZ[7])  # delimiter between fields 7 and 8
  }  
  print $0
  f7+=$7      # running sum of column 7
  f8+=$8      # running sum of column 8
  fP+=$7*$8   # running sum of the per-row product $7*$8
}
END {
  p7=".0" # decimal places ($7)
  p8=".7" # decimal places ($8)
  pP=".7" # decimal places ($7*$8) 
  printf( "%s%"p7"f%s%"p8"f%s%"pP"f\n" ,
           common, f7/NR, dlim78, f8/NR, dlim78, fP/NR )
}
' <<'EOF'
6   99999715    99999771    NM_001013399    0   -   23  0.0714286
6   99999715    99999771    NM_001013399    0   -   25  0.1250000
EOF

Output:

6   99999715    99999771    NM_001013399    0   -   23  0.0714286
6   99999715    99999771    NM_001013399    0   -   25  0.1250000
6   99999715    99999771    NM_001013399    0   -   24  0.0982143  2.3839289
Answered 2012-05-06T01:04:10.090