How do I write a regular expression for the time in the strings below? The output should be just the timestamp.

 l1=May 30, 2012 at 8:13 AM · Comment · 1Like Unlike · Bookmark Unbookmark
 l2=yesterday at 12:13 AM · 2Comment · Like Unlike · Bookmark Unbookmark
 l3=Two days ago at 01:18 AM · Comment · 5Like Unlike · Bookmark Unbookmark
 l4=Two days ago at 15:54 PM · Comment · Like Unlike · Bookmark Unbookmark

EDIT

 l5=Two days ago at 15:54:51 PM · Comment · Like Unlike · Bookmark Unbookmark

Output:

 array1 = [May 30, 2012 at 8:13 AM ,yesterday at 12:13 AM ,Two days ago at 01:18 AM,Two days ago at 15:54 PM]

 array2=[Comment · 1Like Unlike · Bookmark Unbookmark,2Comment · Like Unlike · Bookmark Unbookmark,Comment · 5Like Unlike · Bookmark Unbookmark,Comment · Like Unlike · Bookmark Unbookmark]

Edit:

I tried p_date = re.compile(r'(\d{1,2}[:]\d{1,2})'), but I wasn't sure how to handle a timestamp that also looks like 23:12:29.
2 Answers

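>>> import re
>>> # s is assumed to hold the input lines from the question, one per line
>>> pattern = r'l\d+=(.*?)·(.*)'
>>> l1 = []
>>> l2 = []
>>> for line in s.split('\n'):
...     m = re.match(pattern, line)
...     if m:
...         l1.append(m.group(1))  # text before the first "·": the timestamp
...         l2.append(m.group(2))  # everything after the first "·"
...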

>>> l1
['May 30, 2012 at 8:13 AM ', 'yesterday at 12:13 AM ', 'Two days ago at 01:18 AM ', 'Two days ago at 15:54 PM ']
>>> l2
[' Comment \xb7 1Like Unlike \xb7 Bookmark Unbookmark', ' 2Comment \xb7 Like Unlike \xb7 Bookmark Unbookmark', ' Comment \xb7 5Like Unlike \xb7 Bookmark Unbookmark', ' Comment \xb7 Like Unlike \xb7 Bookmark Unbookmark']
>>> 

Edit: added l\d+= to the pattern so that the l1= prefix is left out of the captured groups.
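
For the case raised in the question's edit (a time such as 15:54:51), one possible variation is to anchor on the time itself instead of on the separator. This is only a sketch: the names TIMESTAMP and split_line are made up for illustration, and it assumes every timestamp ends in "at H:MM" or "at H:MM:SS" followed by AM or PM. The first sample line leaves out the separator after the timestamp to show that it is treated as optional.

```python
import re

# Hypothetical pattern: the timestamp is everything up to and including
# "at H:MM" or "at H:MM:SS" plus the AM/PM marker.
TIMESTAMP = re.compile(
    r'^\s*(?:l\d+=)?'                        # optional "l1=" style prefix
    r'(.*?\bat\s+\d{1,2}:\d{2}(?::\d{2})?'   # leading words, then H:MM[:SS]
    r'\s*[AP]M)'                             # AM/PM marker
    r'\s*(?:·\s*)?(.*)$'                     # optional separator, then the rest
)


def split_line(line):
    """Return (timestamp, rest), or None if no timestamp is found."""
    m = TIMESTAMP.match(line)
    if m is None:
        return None
    return m.group(1), m.group(2).strip()


sample = [
    "l4=Two days ago at 15:54 PM  Comment · Like Unlike · Bookmark Unbookmark",
    "l5=Two days ago at 15:54:51 PM · Comment · Like Unlike · Bookmark Unbookmark",
]
results = [split_line(line) for line in sample]
for item in results:
    print(item)
```

The optional group (?::\d{2})? is the only part needed to accept seconds; the rest just fixes where the timestamp ends.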

answered 2012-07-04T08:09:58.470

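You can split each line on the "·" separator, if your input format is consistent. Applying a regex to recognize timestamp strings in several different formats can be a tedious task.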
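
A minimal sketch of that split, assuming the first "·" on the line always comes right after the timestamp (the helper name split_on_separator is made up for illustration):

```python
def split_on_separator(line, sep="·"):
    """Split a line into (timestamp, rest) at the first separator."""
    timestamp, found, rest = line.partition(sep)
    if not found:
        # The format assumption does not hold for this line.
        raise ValueError("separator not found: %r" % line)
    return timestamp.strip(), rest.strip()


line = "Two days ago at 15:54:51 PM · Comment · Like Unlike · Bookmark Unbookmark"
timestamp, rest = split_on_separator(line)
print(timestamp)
print(rest)
```

str.partition splits only at the first occurrence, so the later "·" characters stay in the second part.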

answered 2012-07-04T08:05:02.483