sorting - How to compare two files and print the differing values from both files

Question

There are two files. I need to sort them first, then compare them, and print the values that differ between file 1 and file 2.

File 1:

pair,bid,ask
AED/MYR,3.918000,3.918000
AED/SGD,3.918000,3.918000
AUD/CAD,3.918000,3.918000

File 2:

pair,bid,ask
AUD/CAD,3.918000,3.918000
AUD/CNY,3.918000,3.918000
AED/MYR,4.918000,4.918000

The output should be:

pair,inputbid,inputask,outputbid,outtputask
AED/MYR,3.918000,3.918000,4.918000,4.918000

The only difference between the two files is that AED/MYR has different bid/ask rates. How can I print the differing values from file 1 and file 2?

I tried the following command:

nawk -F, 'NR==FNR{a[$1]=$4;a[$2]=$5;next} !($4 in a) || !($5 in a) {print $1 FS a[$1] FS a[$2] FS $4 FS $5}' file1 file2

The resulting output is:

pair,bid,ask,bid,ask
AUD/CAD,3.918000,3.918000,3.918000,3.918000
AUD/CHF,3.918000,3.918000,3.918000,3.918000
AUD/CNH,3.918000,3.918000,3.918000,3.918000
AUD/CNY,3.918000,3.918000,3.918000,3.918000
AED/MYR,3.918000,3.918000,4.918000,4.918000

I still can't get only the differences.

score 2 · Accepted Answer

Could you please try the following, written and tested with the shown samples in GNU awk.

awk -v header="pair,inputbid,inputask,outputbid,outtputask" '
BEGIN{
  FS=OFS=","
}
FNR==NR{
  arr[$1]=$0
  next
}
($1 in arr) && arr[$1]!=$0{
  val=$1
  $1=""
  sub(/^,/,"")
  if(!found){
    print header
    found=1
  }
  print arr[val],$0
}'  Input_file1  Input_file2

Explanation: a detailed, line-by-line explanation of the above.

awk -v header="pair,inputbid,inputask,outputbid,outtputask" '  ##Starting awk program from here and setting this to header value here.
BEGIN{                                                         ##Starting BEGIN section of this program from here.
  FS=OFS=","                                                   ##Setting field separator and output field separator as comma here.
}
FNR==NR{                                                       ##Checking condition FNR==NR which will be TRUE when Input_file1 is being read.
  arr[$1]=$0                                                   ##Creating arr with index $1 and keep value as current line.
  next                                                         ##next will skip all further statements from here.
}
($1 in arr) && arr[$1]!=$0{                                    ##Checking condition if first field is present in arr and its value NOT equal to $0
  val=$1                                                       ##Creating val which holds the first field (the pair) of the current line.
  $1=""                                                        ##Nullifying first field here.
  sub(/^,/,"")                                                 ##Substitute starting , with NULL here.
  if(!found){                                                  ##Checking if found is NULL then do following.
    print header                                               ##Printing header here only once.
    found=1                                                    ##Setting found here.
  }
  print arr[val],$0                                            ##Printing arr with index of val and current line here.
}' Input_file1  Input_file2                                    ##Mentioning Input_files here.
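
As a quick check, the sketch below recreates the question's two sample files in a scratch directory and runs a slightly simplified variant of the program on them. The scratch directory, the file names file1.csv and file2.csv, and the result variable are assumptions made for illustration. The variant prints $2 and $3 directly instead of blanking $1, which assumes every line has exactly three columns.

```shell
# Recreate the question's sample data in a scratch directory (paths are illustrative).
tmpdir=$(mktemp -d)
cat > "$tmpdir/file1.csv" <<'EOF'
pair,bid,ask
AED/MYR,3.918000,3.918000
AED/SGD,3.918000,3.918000
AUD/CAD,3.918000,3.918000
EOF
cat > "$tmpdir/file2.csv" <<'EOF'
pair,bid,ask
AUD/CAD,3.918000,3.918000
AUD/CNY,3.918000,3.918000
AED/MYR,4.918000,4.918000
EOF

# Remember every line of file 1 by pair, then print the pairs whose line differs in file 2.
result=$(awk -v header="pair,inputbid,inputask,outputbid,outtputask" '
BEGIN { FS = OFS = "," }
FNR == NR { arr[$1] = $0; next }         # first file: store the whole line under its pair
($1 in arr) && arr[$1] != $0 {           # second file: pair is known but the line changed
  if (!found) { print header; found = 1 } # header is printed once, only if something differs
  print arr[$1], $2, $3                  # file 1 line, then bid and ask from file 2
}' "$tmpdir/file1.csv" "$tmpdir/file2.csv")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With the sample data this prints the header followed by the single AED/MYR line. No sorting is needed here because the lookup is by array key; pairs that appear in only one file (AED/SGD, AUD/CNY) are skipped.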

score 1 · Accepted Answer

Using bash process substitution and join, then selecting with awk:

# print header
printf "%s\n" "pair,inputbid,inputask,outputbid,outtputask"
# remove first line from both files, then sort them on first field
# then join them on first field and output first 5 fields
join -t, -1 1 -2 1 -o 1.1,1.2,1.3,2.2,2.3 <(tail -n +2 file1 | sort -t, -k1,1) <(tail -n +2 file2 | sort -t, -k1,1) |
# output only those lines whose bid or ask columns differ
awk -F, '$2 != $4 || $3 != $5'
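
If bash is not available (the question uses nawk, so the shell may be a plain sh), the same pipeline can be sketched with temporary files instead of process substitution. This is a minimal sketch, not the answerer's code: the scratch directory, the file names, the diffs variable, and the use of LC_ALL=C are assumptions added for illustration.

```shell
# Scratch directory with the question's sample data (paths are illustrative).
tmpdir=$(mktemp -d)
cat > "$tmpdir/file1" <<'EOF'
pair,bid,ask
AED/MYR,3.918000,3.918000
AED/SGD,3.918000,3.918000
AUD/CAD,3.918000,3.918000
EOF
cat > "$tmpdir/file2" <<'EOF'
pair,bid,ask
AUD/CAD,3.918000,3.918000
AUD/CNY,3.918000,3.918000
AED/MYR,4.918000,4.918000
EOF

# Drop each header, then sort on the pair column only.
# LC_ALL=C keeps sort and join in agreement about the ordering.
tail -n +2 "$tmpdir/file1" | LC_ALL=C sort -t, -k1,1 > "$tmpdir/sorted1"
tail -n +2 "$tmpdir/file2" | LC_ALL=C sort -t, -k1,1 > "$tmpdir/sorted2"

# Join on the pair, keep bid/ask from both files, and keep only rows where they differ.
diffs=$(LC_ALL=C join -t, -1 1 -2 1 -o 1.1,1.2,1.3,2.2,2.3 \
          "$tmpdir/sorted1" "$tmpdir/sorted2" |
        awk -F, '$2 != $4 || $3 != $5')

printf '%s\n' "pair,inputbid,inputask,outputbid,outtputask"
printf '%s\n' "$diffs"
rm -rf "$tmpdir"
```

Note that awk compares fields that look like numbers numerically, so 3.9180 and 3.918000 would count as equal here, whereas the whole-line string comparison in the first answer would report them as different.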
