
I'm currently working on a project where we have a .properties file containing thousands of key-value pairs. Some of these pairs appear multiple times, so I want to remove the duplicate lines (only if they are identical, of course). I'm also concerned that some keys are duplicated but have different values.

I'm sure there are easier ways to do this, but I want to pick up bash scripting as an additional skill, and I have basically zero bash knowledge. I came up with the following solution, but I doubt it is the most efficient approach. Is there an easier way to do this?

#!/bin/bash

# Remove duplicate lines (key and value are equal)
sort "$1" | uniq > temporary.tmp

# Find keys that are not unique
doubleKeys=$(awk -F"=" '{print $1}' temporary.tmp | sort | uniq -d)

# Quote the variable: unquoted, the test breaks when several keys are found
if [ -z "$doubleKeys" ] ; then
   mv temporary.tmp final.txt
   echo "Removed doubles, final file is final.txt"
else
   # Quoted, so each duplicate key lands on its own line in the log
   echo "$doubleKeys" > DoubleKeys.log
   rm temporary.tmp
   echo "Double keys found with different values, see DoubleKeys.log"
fi

1 Answer


The code looks fine so far. A few minor points:

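  • You can replace sort $1 | uniq with sort -u $1
  • The second sort is not necessary, because temporary.tmp is already sorted
  • An alternative to awk -F= might be cut -d= -f1, but I'm not sure whether it is more efficient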
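Applied together, the three points above might look like the following sketch. The function name dedupe_properties and its three arguments are hypothetical, chosen only for illustration. It also assumes every relevant line has the plain form key=value, and it forces LC_ALL=C for the sort, since dropping the second sort relies on lines with the same key ending up next to each other, which a locale-aware collation may not guarantee.

```shell
#!/bin/bash
# Sketch only: dedupe_properties is a hypothetical helper, not part of the
# original script. Usage: dedupe_properties INPUT OUTPUT LOG
dedupe_properties() {
    local input=$1 output=$2 log=$3
    local tmp double_keys
    tmp=$(mktemp) || return 2

    # sort -u replaces "sort | uniq". LC_ALL=C sorts byte-wise, so lines
    # that share the same "key=" prefix end up adjacent.
    LC_ALL=C sort -u -- "$input" > "$tmp"

    # The file is already sorted, so uniq -d can run on the keys directly.
    # cut -d= -f1 keeps everything before the first "=".
    double_keys=$(cut -d= -f1 "$tmp" | uniq -d)

    if [ -z "$double_keys" ]; then
        mv -- "$tmp" "$output"
        echo "Removed doubles, final file is $output"
    else
        # One conflicting key per line in the log
        printf '%s\n' "$double_keys" > "$log"
        rm -f -- "$tmp"
        echo "Double keys found with different values, see $log"
        return 1
    fi
}
```

It could be called as dedupe_properties app.properties final.txt DoubleKeys.log, where app.properties is a placeholder file name. Files that use "key = value" with spaces or ":" as a separator would need extra handling that this sketch does not attempt.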

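Unless you run this many times, I wouldn't spend too much time optimizing it. Tweaking and fiddling easily costs more than waiting a minute or two every month.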

Answered 2013-01-02T21:18:23.883