3

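I have a file that is delimited by '|'. One of the fields in the file is a timestamp in the format MM-dd-yyyy HH:mm:ss. I'd like to print the unique dates to a file. I can use the cut command ( cut -f1 -d'|' _file_name_ |sort|uniq) to extract the unique values, but because of the time portion of the field I see hundreds of results. After running the cut command, I'd like to take the substring of the first 11 characters to display the unique dates. I tried using an awk command such as: awk ' { print substr($1,1-11) }' | cut -f1 -d'|' _file_name_ |sort|uniq > _output_file_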

I haven't had any luck. Am I going about this the wrong way? Is there a simpler way to extract the data I need? Any help would be appreciated.

4

3 Answers

5

cut -c1-11 will display characters 1-11 of each input line.
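As a rough sketch of how this could fit into the pipeline from the question: cut -c counts characters from the start of the whole line, so it is applied here after the field has been extracted. The sample rows and the variable names are made up for illustration, and the timestamp is assumed to be field 1. Note that MM-dd-yyyy is 10 characters long, so characters 1-11 would also include the space before the time; the sketch keeps 1-10.

```shell
# Hypothetical sample data: '|'-delimited, timestamp assumed to be in field 1.
sample_file=$(mktemp)
printf '%s\n' \
  '03-28-2011 16:14:37|alice|login' \
  '03-28-2011 16:20:00|bob|login' \
  '03-27-2011 09:01:12|alice|logout' > "$sample_file"

# Extract field 1, keep only the 10-character date, then de-duplicate.
unique_dates=$(cut -d'|' -f1 "$sample_file" | cut -c1-10 | sort -u)
printf '%s\n' "$unique_dates"

rm -f "$sample_file"
```

If the timestamp is not the first field, only the -f value in the first cut should need to change.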

answered 2011-03-28T16:14:37.107
4

If the date is the first (space-separated) field in the file, then the list of unique dates is just:

cut -f1 -d' ' filename | sort -u

Update: in addition to @shellter's correct answer, I'll just present an alternative to demonstrate other awk facilities:

awk -F'|' '{split($10, a, " "); date[a[1]]++} END {for (d in date) print d}' filename
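The array-based approach can also report how often each date occurs, since the array values are counters. Below is a minimal sketch under a few assumptions: the sample rows are invented, the timestamp is placed in field 2 to keep the rows short, and col, parts, and seen are illustrative names. The order of awk's for-in loop is unspecified, so the output is piped through sort.

```shell
# Hypothetical sample data: '|'-delimited, timestamp assumed to be in field 2.
sample_file=$(mktemp)
printf '%s\n' \
  'alice|03-28-2011 16:14:37|login' \
  'bob|03-28-2011 16:20:00|login' \
  'alice|03-27-2011 09:01:12|logout' > "$sample_file"

# Split the timestamp on whitespace, use the date part as an array key,
# and count how many rows fall on each date.
date_counts=$(awk -F'|' -v col=2 '
  { split($col, parts, " "); seen[parts[1]]++ }
  END { for (d in seen) print d, seen[d] }
' "$sample_file" | sort)
printf '%s\n' "$date_counts"

rm -f "$sample_file"
```

For the file described in the question, col would presumably be set to 10.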
answered 2011-03-28T16:20:00.893
3

You're almost there. This is based on the idea that the timestamp is in field 1.

Edit: changed the field to 10, and used sort's -u option instead of a separate uniq process.

You don't need cut; awk will do that for you.

awk -F"|" ' { print substr($10,1,11) }'  _file_name_ |sort -u > _output_file_
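Two things seem to have gone wrong in the attempt from the question. First, substr($1,1-11) passes a single argument, because 1-11 is evaluated as arithmetic (1 minus 11, i.e. -10) rather than as a start and a length. Second, the awk command was given no input file, and the cut that follows it reads _file_name_ directly, so awk's output is never used. The sketch below runs the corrected form on invented sample rows; the field number (2 here) and the names col and unique_days are assumptions for illustration, and the length is 10 so that the space after the date is left out.

```shell
# Hypothetical sample data: '|'-delimited, timestamp assumed to be in field 2.
sample_file=$(mktemp)
printf '%s\n' \
  'alice|03-28-2011 16:14:37|login' \
  'bob|03-28-2011 16:20:00|login' \
  'alice|03-27-2011 09:01:12|logout' > "$sample_file"

# substr(string, start, length): start and length are separate arguments.
unique_days=$(awk -F'|' -v col=2 '{ print substr($col, 1, 10) }' "$sample_file" | sort -u)
printf '%s\n' "$unique_days"

rm -f "$sample_file"
```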

I hope this helps.

P.S. As you appear to be a new user: if you get an answer that helps you, please remember to mark it as accepted, or give it a + (or -) as a useful answer.

answered 2011-03-28T16:20:18.913