bash - BASH while loop that checks lines in a file prints too many times

Question

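I'm writing a script that takes each line from one file and checks for a match in another file. If a match is found, I want to report that it was found; if not, I want to report that no match was found.

Both files contain md5 hashes. The old file is the original, and the new file is used to check whether anything has changed since the original was created.

Original file: chksum. New file: chksum1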

#!/bin/bash

while read e; do
     while read f; do
     if [[ $e = $f ]]
     then 
     echo $e "is the same"
     else
          if [[ $e != $f ]]
          then
          echo $e "has been changed"
          fi
     fi
     done < chksum1
done < chksum

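My problem is that for a file that has changed I get an echo on every pass of the inner loop. I want each file to be shown only once, with a message saying it was not found.

I hope this is clear.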

score 0 · Accepted Answer

You can use the same script, but add a flag that remembers whether a match was found.

#!/bin/bash

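while read -r e; do
    rem=0                      # reset the flag for every line of chksum
    while read -r f; do
        if [[ $e = "$f" ]]
        then
            rem=1              # remember that a match was found
        fi
    done < chksum1
    if [[ $rem = 1 ]]          # test the variable's value, not the literal word "rem"
    then
        echo "$e is the same"
    else
        echo "$e has been changed"
    fi
done < chksum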

This should work.

score 0 · Accepted Answer

You were really close. This will work:

while read -r e; do
     found=0                   # reset before scanning chksum1, so an empty chksum1 also works
     while read -r f; do
     if [[ $e = "$f" ]]
     then
         # echo $e "is the same"
         found=1
         break
     fi
     done < chksum1
     if [ "$found" -ne 0 ]
     then
        echo "$e is the same"
     else
        echo "$e has been changed"
     fi
done < chksum

score 0 · Accepted Answer

A simple solution:

diff -q chksum1 chksum

Answered 2013-06-14T08:49:41.970
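A minimal sketch of how the exit status of diff -q could be used in a script. The function name report_diff and the messages are made up for illustration. Keep in mind that diff -q only says whether the two files differ at all, not which lines changed, and that it is sensitive to line order.

```shell
#!/bin/bash
# Hypothetical helper: report whether two checksum lists differ at all.
# diff exits with 0 when the files are identical, 1 when they differ,
# and 2 when something went wrong (for example a missing file).
report_diff() {
  local old=$1 new=$2 status=0
  diff -q "$old" "$new" >/dev/null || status=$?
  case $status in
    0) echo "no changes" ;;
    1) echo "checksums differ" ;;
    *) echo "diff failed" >&2; return 2 ;;
  esac
}

# Usage: report_diff chksum chksum1
```

If you need to know which entries changed, one of the line-by-line approaches in the other answers is a better fit.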

score 0 · Accepted Answer

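I'd like to suggest an alternative solution: instead of reading line by line, use sort and uniq -c to see whether there are differences. There is no need for a loop when a simple pipeline can do the job.

In this case, you want all the changed lines in the file chksum1. Because chksum1 is passed to sort twice, a line that exists only in chksum1 ends up with a count of exactly 2 (assuming neither file contains duplicate lines), so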

sort chksum chksum1 chksum1 | uniq -c | egrep '^\s+2\s' | sed 's%\s\+2\s%%'

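Compared with the loop-based examples, which read chksum1 once for every line of chksum, this reads chksum1 only twice.

Reusing the input files from one of the other answers: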

samveen@precise:~/so$ cat chksum
eed0fc0313f790cec0695914f1847bca  ./a.txt
9ee9e1fffbb3c16357bf80c6f7a27574  ./b.txt
a91a408e113adce865cba3c580add827  ./c.txt

samveen@precise:~/so$ cat chksum1
eed0fc0313f790cec0695914f1847bca  ./a.txt
8ee9e1fffbb3c16357bf80c6f7a27574  ./b.txt
a91a408e113adce865cba3c580add827  ./d.txt

samveen@precise:~/so$ sort chksum chksum1 chksum1 |uniq -c | egrep '^\s+2\s' |sed 's%\s\+2\s%%'
8ee9e1fffbb3c16357bf80c6f7a27574  ./b.txt
a91a408e113adce865cba3c580add827  ./d.txt
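The \s escape used in the pipeline above is a GNU extension to grep and sed. As a sketch of the same idea with POSIX character classes, here it is wrapped in a function. The name only_in_new is hypothetical, and it assumes that neither file contains duplicate lines.

```shell
#!/bin/bash
# Hypothetical helper: print the lines that occur in the new file
# but not in the old one.
# The new file is listed twice, so after "uniq -c":
#   count 1 -> line exists only in the old file
#   count 2 -> line exists only in the new file
#   count 3 -> line exists in both files
only_in_new() {
  local old=$1 new=$2
  sort "$old" "$new" "$new" | uniq -c |
    # keep only the lines with count 2 and strip the count prefix
    sed -n 's/^[[:space:]]*2[[:space:]]//p'
}

# Usage: only_in_new chksum chksum1
```

Doing the filtering and the stripping in a single sed call also saves one process compared with egrep followed by sed.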

Another possible solution (as suggested in a comment on the question) is to combine diff with sort:

diff <(sort chksum) <(sort chksum1) |grep '^>'

Output:

samveen@precise:~/so$ diff <(sort chksum) <(sort chksum1) |grep '^>'
> 8ee9e1fffbb3c16357bf80c6f7a27574  ./b.txt
> a91a408e113adce865cba3c580add827  ./d.txt

score 0 · Accepted Answer

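A slightly simplified version that avoids reading the same file multiple times (bash 4.0 and later). I assume that the files contain unique file names and that the file format is the output of the md5sum command.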

#!/bin/bash

declare -A hash
while read -r md5 file; do hash[$file]=$md5; done <chksum
while read -r md5 file; do
  [ -z "${hash[$file]}" ] && echo "$file new file" && continue
  [ "${hash[$file]}" == "$md5" ] && echo "$file is same" && continue
  echo "$file has been changed"
done <chksum1

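This script reads the first file into an associative array called hash. The index is the file name and the value is the MD5 checksum. The second loop reads the second checksum file. If the file name is not in hash, it prints file new file; if it is in hash and the values are equal, it is the same file; if the values are not equal, it prints file has been changed.

Input files: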

$ cat chksum
eed0fc0313f790cec0695914f1847bca  ./a.txt
9ee9e1fffbb3c16357bf80c6f7a27574  ./b.txt
a91a408e113adce865cba3c580add827  ./c.txt
$ cat chksum1
eed0fc0313f790cec0695914f1847bca  ./a.txt
8ee9e1fffbb3c16357bf80c6f7a27574  ./b.txt
a91a408e113adce865cba3c580add827  ./d.txt

Output:

./a.txt is same
./b.txt has been changed
./d.txt new file

Extended version

This one also detects deleted files.

#!/bin/bash

declare -A hash
while read -r md5 file; do hash[$file]=$md5; done <chksum
while read -r md5 file; do
  [ -z "${hash[$file]}" ] && echo "$file new file" && continue
  if [ "${hash[$file]}" == "$md5" ]; then echo "$file is same"
  else echo "$file has been changed"
  fi
  unset "hash[$file]"   # seen in chksum1, so it was not deleted
done <chksum1
# whatever is left in hash exists only in chksum
for file in "${!hash[@]}"; do echo "$file deleted file"; done

Output:

./a.txt is same
./b.txt has been changed
./d.txt new file
./c.txt deleted file

score 0 · Accepted Answer

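How about using the grep command? Each line you read from chksum is used as a search pattern in chksum1. If grep finds a match, "$?", which holds grep's return value, will be 0; otherwise it will be 1.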

while read -r e; do
  # -F: fixed string, -x: match the whole line, -q: no output, only the exit status
  grep -Fxq -- "$e" chksum1
  if [ $? -eq 0 ]; then
     echo "$e is the same"
  else
     echo "$e has been changed"
  fi
done < chksum
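If only the changed entries are of interest, the same grep idea can probably be written without a loop by letting grep read all of its patterns from the old file. This is a sketch, not part of the original answer, and the function name changed_lines is made up.

```shell
#!/bin/bash
# Hypothetical helper: list the lines of the new file that have no
# exact match in the old file.
#   -F  treat every pattern as a fixed string, not a regular expression
#   -x  a pattern must match the whole line
#   -v  print the lines that do NOT match any pattern
#   -f  read the patterns, one per line, from the old file
changed_lines() {
  grep -Fxv -f "$1" "$2"
}

# Usage: changed_lines chksum chksum1
```

Note that grep exits with status 1 when it prints nothing, so a non-zero status here simply means that no line has changed.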
