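Today I ran into a problem while sorting a file with the Linux sort command. With LANG=En_US set in the environment, the result is what I expect, but with LANG=en_US the result is strange. The commands I ran and their output are below: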
[work@xx:/data1/muce_temp/datamarts/reduce_result_file/302/1d/201212260000]$ cat dd.dat
23 340_guard 16
23 340_guard 17
23 340_guard 18
23 360_guard... 16
23 360_guard 16
23 360_guard... 17
23 360_guard... 18
[work@xx:/data1/muce_temp/datamarts/reduce_result_file/302/1d/201212260000]$ LANG=En_US sort dd.dat
23 340_guard 16
23 340_guard 17
23 340_guard 18
23 360_guard 16
23 360_guard... 16
23 360_guard... 17
23 360_guard... 18
[work@xx:/data1/muce_temp/datamarts/reduce_result_file/302/1d/201212260000]$ LANG=en_US sort dd.dat
23 340_guard 16
23 340_guard 17
23 340_guard 18
23 360_guard... 16
23 360_guard 16 (why does this line appear here?)
23 360_guard... 17
23 360_guard... 18
The exact format of the lines in this file is as follows:
2^E3^F360_guard^E...^I16^Ee^E17/18^I63776769$
2^E3^F360_guard^E^I16^Ee^E17/18^I63776769$
2^E3^F360_guard^E...^I17^Ei^E0^I63776771$
2^E3^F360_guard^E...^I18^Ei^E1^I63776773$
^E is '\x05', ^F is '\x06', ^I is a tab, and $ is '\n'.
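For anyone who wants to reproduce this, here is a minimal sketch that rebuilds two of the lines above from their byte description and sorts them byte-wise. The file name dd_repro.dat and the variables E, F and T are made up for illustration. LC_ALL=C is used only as a reference ordering to compare against; it is not one of the settings I originally ran.

```shell
# Control characters from the description above, built with portable octal escapes.
E=$(printf '\005')   # ^E
F=$(printf '\006')   # ^F
T=$(printf '\t')     # ^I (tab)

# Two of the four lines shown above, written to a hypothetical file name.
printf '%s\n' \
  "2${E}3${F}360_guard${E}...${T}16${E}e${E}17/18${T}63776769" \
  "2${E}3${F}360_guard${E}${T}16${E}e${E}17/18${T}63776769" \
  > dd_repro.dat

# Reference ordering: the C locale compares raw byte values, so the tab (0x09)
# that follows ^E in the second line sorts before the '.' (0x2E) in the first.
LC_ALL=C sort dd_repro.dat | cat -vet

# The ordering in question, for comparison (depends on the locales installed).
LANG=en_US sort dd_repro.dat | cat -vet
```

With byte-wise comparison the line without "..." comes first, which matches what I see with LANG=En_US.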
Thanks in advance.