unix - How do I extract a substring from the result of the cut command in unix?

Question

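I have a file that is '|' delimited. One of the fields in the file is a timestamp. The field is in the following format: MM-dd-yyyy HH:mm:ss. I would like to be able to print the unique dates to a file. I can use the cut command ( cut -f1 -d'|' _file_name_ |sort|uniq) to extract the unique dates. However, because of the time portion of the field, I see hundreds of results. After running the cut command, I would like to take a substring of the first 11 characters so that only the unique dates are displayed. I tried using an awk command such as: awk ' { print substr($1,1-11) }' | cut -f1 -d'|' _file_name_ |sort|uniq > _output_file_

I have had no luck. Am I going about this the wrong way? Is there an easier way to extract the data I need? Any help would be appreciated.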

score 5 · Accepted Answer


cut -c1-11 will display characters 1-11 of each input line.
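A minimal sketch of how this might be chained onto the field extraction from the question, assuming the timestamp is the first '|'-delimited field. The sample lines and the variable names are invented for illustration. It uses -c1-10 rather than -c1-11 because MM-dd-yyyy is ten characters long, so the eleventh character would be the space before the time.

```shell
# Hypothetical sample data: timestamp in the first '|'-delimited field.
sample='03-28-2011 16:14:37|alpha
03-28-2011 17:02:10|beta
03-29-2011 09:00:00|gamma'

# Take field 1, keep only its first 10 characters (the date), de-duplicate.
unique_dates=$(printf '%s\n' "$sample" | cut -d'|' -f1 | cut -c1-10 | sort -u)
printf '%s\n' "$unique_dates"
```

With a real file, the printf stage would be replaced by giving the file name to the first cut.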

answered 2011-03-28T16:14:37.107

score 4 · Accepted Answer

If the date is the first (space-separated) field in the file, then the list of unique dates is just:

cut -f1 -d' ' filename | sort -u

Update: in addition to @shellter's correct answer, I'll just present an alternative to demonstrate other awk facilities:

awk -F'|' '{split($10, a, " "); date[a[1]]++} END {for (d in date) print d}' filename
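As a small illustration of what the array ends up holding: the ++ also accumulates a per-date line count, which can be printed next to each date. This is only a sketch, assuming the timestamp is field 10 of a '|'-delimited line; the separator is passed explicitly to both awk and split(), and the sample lines are invented.

```shell
# Hypothetical sample: timestamp in field 10 of a '|'-delimited record.
sample='a|b|c|d|e|f|g|h|i|03-28-2011 16:14:37|x
a|b|c|d|e|f|g|h|i|03-28-2011 17:02:10|y
a|b|c|d|e|f|g|h|i|03-29-2011 09:00:00|z'

# split() on a single space separates the date from the time;
# date[] maps each date to the number of lines that carried it.
date_counts=$(printf '%s\n' "$sample" |
  awk -F'|' '{ split($10, a, " "); date[a[1]]++ }
             END { for (d in date) print d, date[d] }' |
  sort)
printf '%s\n' "$date_counts"
```

The trailing sort is there because awk's for (d in date) loop visits keys in an unspecified order.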

score 3 · Accepted Answer

You're almost there. This is based on the idea that the date-time stamp is in field 1.

Edit: changed the field to 10, and used sort's -u option instead of a separate uniq process.

You don't need the cut, awk will do that for you.

awk -F"|" ' { print substr($10,1,11) }'  _file_name_ |sort -u > _output_file_
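To see why the attempt in the question did not work, note two things. First, awk treats 1-11 as arithmetic, so substr($1,1-11) is called with a single start argument of -10 and no length; in the awk implementations I am aware of, that returns the whole field. Second, in awk ... | cut ... _file_name_ the awk process is given no input file, and cut reads the named file and ignores the pipe. A rough sketch of the substr difference, with an invented sample line and a length of 10 to match MM-dd-yyyy exactly:

```shell
# Hypothetical record with the timestamp in field 10.
line='a|b|c|d|e|f|g|h|i|03-28-2011 16:14:37|x'

# 1-11 is arithmetic (-10): one start argument, no length given.
attempt=$(printf '%s\n' "$line" | awk -F'|' '{ print substr($10, 1-11) }')

# Start 1, length 10: just the MM-dd-yyyy part.
fixed=$(printf '%s\n' "$line" | awk -F'|' '{ print substr($10, 1, 10) }')

printf 'attempt: %s\nfixed:   %s\n' "$attempt" "$fixed"
```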

I hope this helps.

P.S. As you appear to be a new user: if you get an answer that helps you, please remember to mark it as accepted, or give it a + (or -) as a useful answer.
