linux - How do I find and remove duplicate strings with a Unix shell script?

Question

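I'm new to Unix shell scripting. I know a few different ways to find duplicates, but I can't find a simple way to remove them while preserving the original order (sort -u loses the original order).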

Example: a script called dedupe.sh

Sample run:

dedupe.sh cat dog cat bird fish bear dog

Result: cat dog bird fish bear

score 2 · Accepted Answer

Using awk:

$ printf '%s\n' cat dog cat bird fish bear dog | awk '!arr[$1]++'
cat
dog
bird
fish
bear

Or

$ echo 'cat dog cat bird fish bear dog' | awk '!arr[$1]++' RS=" "

Or, if the original order does not have to be preserved (sort -u sorts the output):

$ printf '%s\n' cat dog cat bird fish bear dog | sort -u

If it works in the shell, it will work in a script =)
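
A minimal sketch of how the first awk command could be wrapped into the dedupe.sh from the question, so that the words come from the command line and the result is printed on one line. The function name dedupe, the array name seen, and the use of paste to rejoin the lines are assumptions made for illustration, not part of the answer above. The sketch keys on $0 (the whole line) instead of $1, and assumes no argument contains a newline.

```shell
#!/bin/sh
# dedupe.sh: print each argument once, in first-seen order, on one line.

dedupe() {
    # With no arguments there is nothing to print.
    [ "$#" -gt 0 ] || return 0

    # 1. printf writes one argument per line.
    # 2. awk prints a line only the first time it appears:
    #    seen[$0]++ is 0 (false) on first sight, so !seen[$0]++ is true.
    # 3. paste -s joins the remaining lines with single spaces.
    printf '%s\n' "$@" |
        awk '!seen[$0]++' |
        paste -s -d ' ' -
}

dedupe "$@"
```

Saved as dedupe.sh and made executable, it would be run as ./dedupe.sh cat dog cat bird fish bear dog.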

score 1 · Accepted Answer

Did you say Perl?

perl -e 'while(defined($_=shift@ARGV)){$seen{$_}++||print"$_ "}print"\n"' \
cat dog cat bird fish bear dog

Equivalently, dedupe.pl contains:

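#!/usr/bin/perl
# Print each argument the first time it is seen, followed by a space.
# defined() keeps the loop going when an argument is the string "0".
while (defined($w = shift @ARGV)) {
    $seen{$w}++ || print "$w ";
}
print "\n";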

Now chmod u+x dedupe.pl and:

./dedupe.pl cat dog cat bird fish bear dog

Either way, the output is as desired:

cat dog bird fish bear
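
For comparison, the same "seen" bookkeeping can be sketched in plain POSIX sh, without calling another language. This is only an illustration: the function name dedupe_sh and the variables seen and out are hypothetical, the sketch assumes that no word contains whitespace, and it rescans the list for every word, so it suits short argument lists.

```shell
#!/bin/sh
# Order-preserving dedupe in plain POSIX sh, mirroring the "seen" hash idea.

dedupe_sh() {
    seen=' '    # space-delimited list of the words printed so far
    out=''      # result line under construction

    for w in "$@"; do
        case $seen in
            *" $w "*)
                # Word is already in the list: skip it.
                ;;
            *)
                # First occurrence: remember it and append it to the output.
                seen="$seen$w "
                out="${out:+$out }$w"
                ;;
        esac
    done

    printf '%s\n' "$out"
}

dedupe_sh "$@"
```

The leading and trailing spaces in seen are what let the pattern match whole words only, so "cat" is not mistaken for a duplicate of "catfish".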

score 0 · Accepted Answer

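Ah, Perl... the write-only language. :)

As long as you're calling out to another scripting language, you might as well consider something readable. :)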

#!/usr/bin/env ruby

puts ARGV.uniq.join(' ')

Meaning:

puts = "print whatever comes next"
ARGV = "input argument array"
uniq = "array method to perform the behavior you're looking for and remove duplicates"
join(' ') = "join with spaces instead of default of newline. Not necessarily needed if you're piping to something else"
