I have 2 files: file1 has about 1M lines and file2 has about 1k lines.

file1 contains the fields tower_id, user_id, signal_strength, like this:

"0001","00abcde","0.65"
"0002","00abcde","0.35"
"0005","00bcdef","1.0"
"0001","00cdefg","0.1"
"0003","00cdefg","0.4"
"0008","00cdefg","0.3"
"0009","00cdefg","0.2"

file2 contains the fields tower_id, x_position, y_position, like this:

"0001","34","22"
"0002","78","56"
"0003","12","32"
"0004","79","45"
"0005","36","37"
"0006","87","99"
"0007","27","93"
"0008","55","04"
"0009","02","03"

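The signal_strength values for each user_id sum to 1. I need to compute each user's position from the towers that user is seen on, weighted by signal strength: for every tower of a user, multiply signal_strength by the tower's position and add up the products, for example: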

"00abcde" --> 0.65*34+0.35*78, 0.65*22+0.35*56
"00bcdef" --> 1.0*36, 1.0*37
"00cdefg" --> 0.1*34+0.4*12+0.3*55+0.2*02, 0.1*22+0.4*32+0.3*04+0.2*03
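Written as a formula, this is a weighted average of tower positions. The notation is an assumption introduced here for illustration and is not part of the data: $s_{u,t}$ is the signal_strength of user $u$ on tower $t$, $(x_t, y_t)$ is the position of tower $t$, and $T_u$ is the set of towers listed for user $u$.

```latex
\hat{x}_u = \sum_{t \in T_u} s_{u,t}\, x_t,
\qquad
\hat{y}_u = \sum_{t \in T_u} s_{u,t}\, y_t,
\qquad
\text{where } \sum_{t \in T_u} s_{u,t} = 1 .
```

Because the weights of each user already sum to 1, no further normalization is needed.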

So the output file should look like this (user_id, computed_x_position, computed_y_position):

00abcde,49.4,33.9
00bcdef,36,37
00cdefg,25.1,16.8

My idea is to use awk, somehow using the "seen" idiom with file1 and file2 as input files (something like awk 'NR==FNR {some commands} {print some values}' file1 file2 > outputfile), but I have no idea how to do it. Can anyone help me?

1 Answer

This may be what you want:

awk -F '[,"]+' '
    NR==FNR { towx[$2] = $3; towy[$2] = $4; next }
            { usrx[$3] += towx[$2] * $4; usry[$3] += towy[$2] * $4 }
    END     { for (usr in usrx) printf "%s,%.1f,%.1f\n",
                                       usr, usrx[usr], usry[usr] }
' file2 file1 # file2 precedes file1
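To try the command on the sample data from the question, something like the following sketch may help. The temporary directory, the `result` variable, the check for tower IDs missing from file2, and the final `sort` are additions made here for illustration (awk's `for (usr in usrx)` visits keys in an unspecified order, so sorting only makes the output reproducible); the aggregation logic itself is unchanged.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)

cat > "$workdir/file1" <<'EOF'
"0001","00abcde","0.65"
"0002","00abcde","0.35"
"0005","00bcdef","1.0"
"0001","00cdefg","0.1"
"0003","00cdefg","0.4"
"0008","00cdefg","0.3"
"0009","00cdefg","0.2"
EOF

cat > "$workdir/file2" <<'EOF'
"0001","34","22"
"0002","78","56"
"0003","12","32"
"0004","79","45"
"0005","36","37"
"0006","87","99"
"0007","27","93"
"0008","55","04"
"0009","02","03"
EOF

# With -F '[,"]+' every line starts with a separator, so $1 is empty
# and the three real fields are $2, $3 and $4.
result=$(awk -F '[,"]+' '
    # First file (file2): remember each tower position by tower_id.
    NR==FNR { towx[$2] = $3; towy[$2] = $4; next }
    # Added safeguard (not in the original command): report and skip
    # towers that do not appear in file2.
    !($2 in towx) { print "unknown tower: " $2 | "cat 1>&2"; next }
    # Second file (file1): accumulate strength * position per user_id.
    { usrx[$3] += towx[$2] * $4; usry[$3] += towy[$2] * $4 }
    END { for (usr in usrx)
              printf "%s,%.1f,%.1f\n", usr, usrx[usr], usry[usr] }
' "$workdir/file2" "$workdir/file1" | LC_ALL=C sort)

printf '%s\n' "$result"
rm -rf "$workdir"
```

Note that `%.1f` always prints one decimal place, so the second user comes out as `00bcdef,36.0,37.0` rather than `00bcdef,36,37`.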
answered 2022-01-17T23:20:34.723