regex - Adding and sorting numbers in files

Question

I have directories like this:

./2012/NY/F/ 
./2012/NJ/M/ 
....

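Under these directories there are files with names such as Zoe, etc.

Each file contains a number.

I want to add up the numbers from files that share the same file name across the different directories, and then find the maximum of those sums. How should I write this?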

score 1 · Accepted Answer

If you know the unique file names, and the file names contain no spaces, the following might work.

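cd 2012/
for i in "Zoe" "file2" "file3"
do
  # Concatenate every file with this name (one number per file)
  k=$(cat $(find . -type f -name "$i"))
  # Unquoted $k puts all the numbers on one line; awk adds up the fields
  echo $k | awk '{for(i=t=0;i<NF;) t+=$++i; $0=t}1'
done | sort -rn

This sums the files with the same name in the subdirectories under 2012, and sort -rn returns the totals in order from largest to smallest. (A plain sort -r compares the totals as strings, so 9 would sort ahead of 10.)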

score 1 · Accepted Answer

Assuming your ./2012/NY/F, /2012/sfs/XXS and so on are all under one directory, say /home/yourusername/data/,

if you are on *nix, or have Cygwin installed on Windows, you can try this:

    cd /home/yourusername/data ; find ./ -name yourfile_name_to_lookup.txt | xargs awk 'BEGIN {sum=0} ; {sum+=$1} ; END {print sum} '

I am assuming the number is in the first column of the file ($1).
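If file names may contain spaces, or there are very many matching files, a variant that avoids xargs may be more robust. The following is only a minimal sketch under the same first-column assumption; the function name sum_named_files and its two arguments are hypothetical, chosen for illustration. It relies on the POSIX `-exec ... {} +` form of find.

```shell
# sum_named_files DIR NAME
# Prints the sum of the first column of every file called NAME under DIR.
# Hypothetical helper; assumes each matching file holds a number in column 1.
sum_named_files() {
    # find hands the matching paths to cat directly, so spaces are safe
    find "$1" -type f -name "$2" -exec cat {} + |
        awk '{ sum += $1 } END { print sum + 0 }'   # "+ 0" prints 0 when nothing matched
}
```

For example, `sum_named_files /home/yourusername/data Zoe` would print the total for all files named Zoe.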

score 1 · Accepted Answer

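To find the files, use a glob as specified in this question.

To do the actual summation there are many possibilities, depending on the number of files and the range of the numbers, but a reasonable general-purpose approach is awk: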

awk '{sum += $1} END { print sum }' file1 file2 ...
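The same awk idea can be stretched to group by file name and pick the largest total in one pipeline. This is a rough sketch rather than a definitive solution: the function name max_sum_by_name is hypothetical, and it assumes each file has its number in the first column and that file names contain no tabs or newlines.

```shell
# max_sum_by_name FILE...
# Sums the first column per base file name, then prints "SUM<TAB>NAME"
# for the name with the largest total. Hypothetical helper for illustration.
max_sum_by_name() {
    awk '
        {
            name = FILENAME
            sub(/.*\//, "", name)   # strip the directory part of the path
            sum[name] += $1         # accumulate per base name
        }
        END {
            for (name in sum) printf "%s\t%s\n", sum[name], name
        }
    ' "$@" | sort -k1,1nr | head -n 1   # numeric, descending; keep the top line
}
```

It could be called with the glob from the question, for instance `max_sum_by_name ./2012/*/*/*`. Dropping the final `head -n 1` would list every total, largest first.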

score 1 · Accepted Answer

I assume the entire content of each file is a single number, and that the number is an integer. Associative arrays require bash 4.

declare -A sum_for_file
for path in ./2012/*/*/*; do
    (( sum_for_file["$(basename "$path")"] += $(< "$path") ))
done

max=0
for file in "${!sum_for_file[@]}"; do
    if (( ${sum_for_file["$file"]} > max )); then
        max=${sum_for_file["$file"]}
        maxfile=$file
    fi
    # you didn't say you needed to print it, but if you do
    printf "%d\t%s\n" ${sum_for_file["$file"]} "$file"
done

echo "the maximum sum is $max found in files named $maxfile"
