
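I want to find the average rainfall for a particular month (January through December) for any three states, say CA, TX and AX. The input file is delimited by tabs and has the format: city name, the state, and then average rainfall amounts from January through December, and then an annual average for all months. For example, it may look like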

AVOCA   PA  30  2.10    2.15    2.55    2.97    3.65    3.98    3.79    3.32     3.31   2.79    3.06    2.51    36.18
BAKERSFIELD CA  30  0.86    1.06    1.04    0.57    0.20    0.10    0.01    0.09    0.17    0.29    0.70    0.63    5.72

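What I want to do is get the sum of the average rainfall for a particular month, say Feb, over say n years, and then find the average for the states CA, TX and AX.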

I wrote the script below in awk to do this, but it does not give me the expected output.

/^CA$/ {CA++; CA_SUM+= $5} # ^CA$ - Regular Expression to match the word CA only 
/^TX$/ {TX++; TX_SUM+= $5} # ^TX$ - Regular Expression to match the word TX only  
/^AX$/ {AX++; AX_SUM+= $5} # ^AX$ - Regular Expression to match the word AX only 
END {
     CA_avg = CA_SUM/CA;
     TX_avg = TX_SUM/TX;
     AX_avg = AX_SUM/AX; 
     printf("CA Rainfall: %5.2f",CA_avg);
     printf("CA Rainfall: %5.2f",TX_avg);
     printf("CA Rainfall: %5.2f",AX_avg);
    }

I invoke the program with the command awk 'FS="\t"'-f awk1.awk rainfall.txt and do not see any output.

Question: where am I slipping up? Any suggestions and corrected code would be appreciated.

2 Answers

The pattern /^CA$/ means that the characters "C" and "A" are the only characters on the line. You want:

$2 == "CA" {CA++; CA_SUM+= $5}
# etc.
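
To make the difference concrete, here is a minimal sketch that runs both patterns against two rows from the question. The rows are shortened to five tab-separated fields (city, state, years, Jan, Feb) purely for brevity, and the variable names are made up for this illustration.

```shell
# Two sample rows, tab-separated and truncated after the February column.
data=$(printf 'AVOCA\tPA\t30\t2.10\t2.15\nBAKERSFIELD\tCA\t30\t0.86\t1.06\n')

# Whole-line regex: only a line consisting of exactly "CA" would match.
regex_hits=$(printf '%s\n' "$data" | awk -F '\t' '/^CA$/ { n++ } END { print n + 0 }')

# Field comparison: matches every row whose second field is "CA".
field_hits=$(printf '%s\n' "$data" | awk -F '\t' '$2 == "CA" { n++ } END { print n + 0 }')

echo "regex: $regex_hits, field: $field_hits"
```

With these two rows the regex counts no matches and the field comparison counts one, which is why the sums in the original script stay at zero.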

However, this is DRYer:

{ count[$2]++; sum[$2] += $5 }
END {
    for (state in count) {
        printf("%s Rainfall: %5.2f\n", state, sum[state]/count[state])
    }
}

Also, this looks wrong: awk 'FS="\t"'-f awk1.awk rainfall.txt
Try: awk -F '\t' -f awk1.awk rainfall.txt
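
A likely reason the original command prints nothing: because there is no space before -f, the shell joins 'FS="\t"' and -f into the single word FS="\t"-f, and awk takes that word as the program text. awk1.awk is then read as an input file, not as a script. The sketch below reproduces this in a scratch directory; the file contents are stand-ins written for this example, not the asker's real files.

```shell
# Scratch directory with a stand-in script and a one-row data file.
tmp=$(mktemp -d)
cat > "$tmp/awk1.awk" <<'EOF'
$2 == "CA" { n++; sum += $5 }
END { if (n) printf("CA Rainfall: %5.2f\n", sum / n) }
EOF
printf 'BAKERSFIELD\tCA\t30\t0.86\t1.06\n' > "$tmp/rainfall.txt"

# Broken form: the program awk sees is FS="\t"-f, an assignment whose value
# is 0 (false), so no record is printed and awk1.awk is treated as data.
broken=$(cd "$tmp" && awk 'FS="\t"'-f awk1.awk rainfall.txt)

# Corrected form: -F sets the field separator, -f names the script file.
fixed=$(cd "$tmp" && awk -F '\t' -f awk1.awk rainfall.txt)

echo "broken: [$broken]"
echo "fixed:  [$fixed]"
rm -rf "$tmp"
```

Keeping the separator in -F (or in a BEGIN block inside the script) avoids relying on shell quoting to pass a program fragment.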


In response to comments:

awk -F '\t' -v month=2 -v states="CA,AZ,TX" '
    BEGIN {
        month_col = month + 3  # assume January is month 1 (column 4)
        split(states, wanted_states, /,/)
    }
    { count[$2]++; sum[$2] += $month_col }
    END {
        # split() stores the state names as array values, so loop over the
        # indices and look each name up
        for (i in wanted_states) {
            state = wanted_states[i]
            if (state in count) {
                printf("%s Rainfall: %5.2f\n", state, sum[state]/count[state])
            } else {
                print state " Rainfall: no data"
            }
        }
    }
' rainfall.txt
answered 2010-10-16T23:55:10.207

Your regular expressions should be

/ CA / {CA++; CA_SUM+= $5} # / CA / - matches CA as a word surrounded by spaces
/ TX / {TX++; TX_SUM+= $5} # / TX / - matches TX as a word surrounded by spaces
/ AX / {AX++; AX_SUM+= $5} # / AX / - matches AX as a word surrounded by spaces

/^AX$/ matches only if it is the only word on the line.

Edit

/ CA / {CA++; CA_SUM+= $5} # / CA / - matches CA as a word surrounded by spaces
/ TX / {TX++; TX_SUM+= $5} # / TX / - matches TX as a word surrounded by spaces
/ AX / {AX++; AX_SUM+= $5} # / AX / - matches AX as a word surrounded by spaces
END {
 # only divide when at least one row matched, to avoid division by zero
 if(CA!=0){CA_avg = CA_SUM/CA;     printf("CA Rainfall: %5.2f\n",CA_avg);}
 if(TX!=0){TX_avg = TX_SUM/TX;     printf("TX Rainfall: %5.2f\n",TX_avg);}
 if(AX!=0){AX_avg = AX_SUM/AX;     printf("AX Rainfall: %5.2f\n",AX_avg);}
}
answered 2010-10-16T21:13:06.407