I have a DataFrame with two columns, counter and History, as shown below:
*counter History*
1 Log Type: customer chat
chat history:
xxxxxxxxx
xxxxxxx
xxxxxxxxxxxxxxx
May 10 2020 23:34:57 +GMT 05:30
--------------------------------------------
log type: Phone call
issue type: xxxxxx
issue:
qqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqq
May 11 2020 08:54:54 + GMT 05:30
----------------------------------------------
log type: phone call
issue:
eeeeeeeeeeeeee
eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
eeeeeeee
eeeeeeeeeee
eeeeeeeeeeee
eeeeeeeeeeeeeeeeeee
May 11 2020 14:58:54 + GMT 05:30
----------------------------------
2 Log Type: Phone call
issue:
xxxxxxxxx
xxxxxxx
xxxxxxxxxxxxxxx
May 10 2020 23:34:57 +GMT 05:30
--------------------------------------------
log type: Phone call
issue type: xxxxxx
issue:
qqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqq
May 11 2020 08:54:54 + GMT 05:30
----------------------------------------------
log type: phone call
issue:
eeeeeeeeeeeeee
eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
eeeeeeee
eeeeeeeeeee
eeeeeeeeeeee
eeeeeeeeeeeeeeeeeee
May 12 2020 14:58:54 + GMT 05:30
----------------------------------------------
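Now I want to run a check: where the log type is a phone call only, it should count the unique date stamps. That is, if two date stamps fall on the same date, the count should be 1; the time part is not needed here. The required output is as follows: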
counter History count
0 Log Type: customer chat 1
chat history:
xxxxxxxxx
xxxxxxx
xxxxxxxxxxxxxxx
May 10 2020 23:34:57 +GMT 05:30
--------------------------------------------
log type: Phone call
issue type: xxxxxx
issue:
qqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqq
May 11 2020 08:54:54 + GMT 05:30
----------------------------------------------
log type: phone call
issue:
eeeeeeeeeeeeee
eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
eeeeeeee
eeeeeeeeeee
eeeeeeeeeeee
eeeeeeeeeeeeeeeeeee
May 11 2020 14:58:54 + GMT 05:30
----------------------------------
1 Log Type: Phone call 3
issue:
xxxxxxxxx
xxxxxxx
xxxxxxxxxxxxxxx
May 10 2020 23:34:57 +GMT 05:30
--------------------------------------------
log type: Phone call
issue type: xxxxxx
issue:
qqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqq
May 11 2020 08:54:54 + GMT 05:30
----------------------------------------------
log type: phone call
issue:
eeeeeeeeeeeeee
eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
eeeeeeee
eeeeeeeeeee
eeeeeeeeeeee
eeeeeeeeeeeeeeeeeee
May 12 2020 14:58:54 + GMT 05:30
----------------------------------------------
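To make the rule above concrete, here is a minimal sketch of one way it could be expressed. It is not the code from this question: it assumes that entries are separated by runs of four or more dashes, that "phone call" should match regardless of case, and that every stamp looks like "May 11 2020 08:54:54". The names `count_phone_call_dates`, `DATE_RE`, `SEPARATOR_RE`, `row0` and `row1` are made up for illustration, and the sample strings are shortened versions of the two rows shown above.

```python
import re
from datetime import datetime

# Assumed stamp layout: "May 11 2020 08:54:54 + GMT 05:30"; only the date part is captured.
DATE_RE = re.compile(r"\b([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d{4})\s+\d{2}:\d{2}:\d{2}")
# Assumed entry separator: a run of at least four dashes.
SEPARATOR_RE = re.compile(r"-{4,}")
PHONE_RE = re.compile(r"log type:\s*phone call", re.IGNORECASE)


def count_phone_call_dates(history: str) -> int:
    """Count distinct calendar dates among the phone-call entries of one History cell."""
    dates = set()
    for section in SEPARATOR_RE.split(history):
        if not PHONE_RE.search(section):
            continue  # skip customer chats and empty trailing sections
        for month, day, year in DATE_RE.findall(section):
            # Keep only the calendar date, so the time and the offset play no role.
            dates.add(datetime.strptime(f"{month} {day} {year}", "%b %d %Y").date())
    return len(dates)


row0 = "\n".join([
    "Log Type: customer chat",
    "chat history:",
    "xxxxxxxxx",
    "May 10 2020 23:34:57 +GMT 05:30",
    "--------------------------------------------",
    "log type: Phone call",
    "issue type: xxxxxx",
    "issue:",
    "qqqqqqqqqqqq",
    "May 11 2020 08:54:54 + GMT 05:30",
    "----------------------------------------------",
    "log type: phone call",
    "issue:",
    "eeeeeeeeeeeeee",
    "May 11 2020 14:58:54 + GMT 05:30",
    "----------------------------------",
])

row1 = "\n".join([
    "Log Type: Phone call",
    "issue:",
    "xxxxxxxxx",
    "May 10 2020 23:34:57 +GMT 05:30",
    "--------------------------------------------",
    "log type: Phone call",
    "issue:",
    "qqqqqqqqqqqq",
    "May 11 2020 08:54:54 + GMT 05:30",
    "----------------------------------------------",
    "log type: phone call",
    "issue:",
    "eeeeeeeeeeeeee",
    "May 12 2020 14:58:54 + GMT 05:30",
    "----------------------------------------------",
])

print(count_phone_call_dates(row0))  # 1
print(count_phone_call_dates(row1))  # 3
```

With a DataFrame shaped like the one above, a function of this kind could be applied per row with something like `df2['count'] = df2['History'].apply(count_phone_call_dates)`. Note that a case-sensitive test for 'Phone call' would skip the entries written as "phone call".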
The code I am using is:
import datetime
import numpy as np
import pandas as pd
from dateparser.search import search_dates

def extract_sentence(input, word):
    # Keep only the "----"-separated sections that contain the given word
    return ".".join((sentence for sentence in input.split("----") if word in sentence))

df2.reset_index(inplace=True)
lst_3=[]
ind_1=[]
for i in range(0, len(df2['counter'])):
    # Search for dates in the 'Phone call' sections of this row
    matches = search_dates(extract_sentence(df2['History'][i],'Phone call'))
    lst_4=[]
    for x in matches:
        date_string = x[1]
        lst_4.append(date_string)
    lst_6=[]
    lst_5=[]
    for item in lst_4:
        lst_5.append(str(item))
    for text in lst_5:
        # First 10 characters of the string form, i.e. YYYY-MM-DD
        ab=text[0:10]
        lst_6.append(ab)
    res = [i for i in lst_6 if '2020' in i or '2019' in i or '2018' in i or '2017' in i or '2016' in i or '2015' in i]
    lst_8=[]
    lst_8=len(list(set(res)))
    lst_1=[]
    try:
        for match in matches:
            lst=match
            lst_1.append(lst)
    except TypeError:
        continue
    ind=i
    ind_1.append(ind)
    lst_2=len(list(set(lst_1)))
    lst_3.append(lst_2)
# Merge the per-row counts back onto the original frame
df3=pd.DataFrame({'Index1': ind_1,'Count3': lst_3})
df2.reset_index(inplace=True)
df2['Index1']= np.arange(len(df2))
df4=pd.merge(df2,df3[['Index1','Count3']],on='Index1',how='left')
The error I get when running it is shown below:
TypeError: can't compare offset-naive and offset-aware datetimes
I need help with this.
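For context, this message is what Python's own `datetime` raises when a value without timezone information is ordered against one that has it. The sketch below reproduces the message with only the standard library. Which call in the code above actually triggers it cannot be told from the one-line error, so the assumption here is simply that some parsed stamps carried an offset and others did not. The values `naive` and `aware` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# A datetime without tzinfo ("offset-naive").
naive = datetime(2020, 5, 11, 8, 54, 54)

# A datetime with a +05:30 offset attached ("offset-aware").
ist = timezone(timedelta(hours=5, minutes=30))
aware = datetime(2020, 5, 11, 14, 58, 54, tzinfo=ist)

try:
    naive < aware  # ordering a naive value against an aware one is not allowed
    message = None
except TypeError as exc:
    message = str(exc)

print(message)

# Reducing both values to plain calendar dates avoids the mismatch,
# which is enough when only the date stamp matters.
same_day = naive.date() == aware.date()
print(same_day)  # True
```

Since only the date is needed for the count, reducing every parsed value to its `.date()` before collecting them is one possible way around the mismatch.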