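I am trying to find every place where my data has a duplicated row and delete the duplicate. I am also looking for rows where the value in column 2 is 90, so that I can replace column 2 of the following row with a specific number that I have designated.
My data looks like this: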
# Type Response Acc RT Offset
1 70 0 0 0.0000 57850
2 31 0 0 0.0000 59371
3 41 0 0 0.0000 60909
4 70 0 0 0.0000 61478
5 31 0 0 0.0000 62999
6 41 0 0 0.0000 64537
7 41 0 0 0.0000 64537
8 70 0 0 0.0000 65106
9 11 0 0 0.0000 66627
10 21 0 0 0.0000 68165
11 90 0 0 0.0000 68700
12 31 0 0 0.0000 70221
I would like my data to look like this:
# Type Response Acc RT Offset
1 70 0 0 0.0000 57850
2 31 0 0 0.0000 59371
3 41 0 0 0.0000 60909
4 70 0 0 0.0000 61478
5 31 0 0 0.0000 62999
6 41 0 0 0.0000 64537
8 70 0 0 0.0000 65106
9 11 0 0 0.0000 66627
10 21 0 0 0.0000 68165
11 90 0 0 0.0000 68700
12 5 0 0 0.0000 70221
My code:
BEGIN {
    priorline = "";
    ERROROFFSET = 50;
    ERRORVALUE[10] = 1;
    ERRORVALUE[11] = 2;
    ERRORVALUE[12] = 3;
    ERRORVALUE[30] = 4;
    ERRORVALUE[31] = 5;
    ERRORVALUE[32] = 6;
    ORS = "\n";
}
NR == 1 {
    print;
    getline;
    priorline = $0;
}
NF == 6 {
    brandnewline = $0
    mytype = $2
    $0 = priorline
    priorField2 = $2;
    if (mytype !~ priorField2) {
        print;
        priorline = brandnewline;
    }
    if (priorField2 == "90") {
        mytype = ERRORVALUE[mytype];
    }
}
END {print brandnewline}
## Here brandnewline is set to the current line and priorline to the line we just
## finished working on; brandnewline then moves on to the next line (i.e. when
## line 1 is brandnewline, we set priorline = brandnewline, so priorline is line 1
## and brandnewline takes on line 2). The same is done for column 2: mytype is the
## current column 2 value, and priorField2 holds the previous value as mytype moves
## on to the next column 2 value. The first if statement prints the line when the
## value in column 2 of the current line does not match (!~, intended as "does not
## equal") the value in column 2 of the previous line; otherwise the line is
## skipped. The second if statement is meant to recognize the lines on which the
## value 90 appeared and replace the value in column 2 with the ERRORVALUE defined
## for each specific type (type 10=1, 11=2, 12=3, 30=4, 31=5, 32=6).
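One detail in the comparison above may matter: in awk, `!~` is a regular-expression non-match, not a test for inequality, so the right-hand side is treated as a pattern that may occur anywhere in the left-hand side. The small sketch below illustrates the difference with two made-up values ("310" and "31", which do not come from the data); the function name `regex_vs_equality` is hypothetical.

```shell
# Hypothetical demo: compare awk's regex non-match (!~) with inequality (!=).
regex_vs_equality() {
  awk 'BEGIN {
    a = "310"; b = "31"   # illustrative values, not taken from the data
    # b is used as a regex here, and "31" occurs inside "310", so a matches b
    print ((a !~ b) ? "no-match" : "match")
    # plain string comparison: "310" and "31" are different strings
    print ((a != b) ? "different" : "equal")
  }'
}
```

Calling `regex_vs_equality` prints `match` and then `different`, which suggests `!=` is the safer operator when the goal is to test whether two column values differ.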
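I have been able to delete the duplicated rows successfully. However, I cannot get the next part of my code to work, which is to replace the value in the actual column with the values I designated in BEGIN as ERRORVALUE (10=1, 11=2, 12=3, 30=4, 31=5, 32=6). Essentially, I just want to replace the value in that line with my ERRORVALUE.
I would greatly appreciate it if anyone could help me.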
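For reference, here is a minimal sketch of one way both steps could be combined in a single pass. It rests on a few assumptions that are not stated in the post: a row counts as a duplicate when columns 2 through 6 equal those of the previously kept row (column 1 is only a row counter), the replacement applies to the first kept row after a Type 90 row and only when its Type appears in ERRORVALUE, and fields are separated by single spaces. The function name `clean_events` is hypothetical. The key difference from the code above is that the sketch assigns to the field `$2` itself rather than to a copy held in a variable, so the printed line actually changes.

```shell
# Hypothetical helper: reads the table on stdin, prints the cleaned table.
clean_events() {
  awk '
    BEGIN {
      # Replacement codes copied from the BEGIN block in the question
      ERRORVALUE[10] = 1; ERRORVALUE[11] = 2; ERRORVALUE[12] = 3
      ERRORVALUE[30] = 4; ERRORVALUE[31] = 5; ERRORVALUE[32] = 6
    }
    NR == 1 { print; next }            # pass the header through unchanged
    {
      # Assumption: duplicate means columns 2-6 repeat the previously kept row
      key = $2 FS $3 FS $4 FS $5 FS $6
      if (key == priorkey) next        # skip the repeated row
      priorkey = key

      type = $2 + 0                    # remember the original Type as a number
      # Assigning to $2 rebuilds $0 (joined with single spaces), so the
      # change is visible when the line is printed
      if (priortype == 90 && ($2 in ERRORVALUE)) $2 = ERRORVALUE[$2]
      priortype = type
      print
    }
  '
}
```

It could be run as `clean_events < data.txt`, where `data.txt` is a placeholder name for the input file. On the sample data, this should drop row 7 and change the Type of row 12 from 31 to 5.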