I have a DataFrame with a column of string values that I want to convert to date format. In R I usually break this down into a series of steps: substituting the ??:?? with 12:00, stripping the time, appending it to the end, and converting with a POSIX function.
I am trying to replicate that in Python in a more Pythonic manner.
Here is an excerpt from my DataFrame (150,000 rows) as a Series:
index date
0 21/08/2001 10:20
1 19/09/2005 9:50
2 ??:?? 04-Jun-01
3 16/08/2004 7:15
4 ??:?? 04-Jan-01
5 23/01/2001 9:25
6 24/01/2001 11:16
7 ??:?? 05-Feb-01
8 24/01/2001 8:30
9 24/01/2001 15:15
Here is what I have tried (I have called the excerpt tmp):
I thought I could use a list comprehension and a regular expression replacement as follows:
[re.sub('\\?\\?:\\?\\?', '12:00', tmp) for i in tmp[i]]
What I would like to do is replace ??:?? with 12:00 and then generalise it so I can use it with tmp.apply.
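For illustration, here is a minimal sketch of the replacement step on its own. It assumes tmp is a pandas Series of strings; the helper name fill_unknown_time and the three sample values are hypothetical stand-ins for the real data. Note that the comprehension has to iterate over the elements of tmp and pass each element, not the whole Series, to re.sub.

```python
import re

import pandas as pd

# A few sample values in the two layouts seen in the excerpt (illustrative only).
tmp = pd.Series([
    "21/08/2001 10:20",
    "??:?? 04-Jun-01",
    "16/08/2004 7:15",
])

# `?` is a regex quantifier, so each question mark must be escaped.
UNKNOWN_TIME = re.compile(r"\?\?:\?\?")


def fill_unknown_time(value, default="12:00"):
    """Replace the ??:?? placeholder in a single string with a default time."""
    return UNKNOWN_TIME.sub(default, value)


# List comprehension: iterate over the elements and substitute in each one.
via_list = [fill_unknown_time(s) for s in tmp]

# Element-wise version with Series.apply.
via_apply = tmp.apply(fill_unknown_time)

# Vectorised alternative: regex=False treats "??:??" as a literal string.
via_str = tmp.str.replace("??:??", "12:00", regex=False)
```

All three variants should give the same strings; the vectorised .str.replace call is likely the simplest choice when only this one substitution is needed.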
Any suggestions are appreciated.
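The overall goal, converting the column to dates, could be sketched along the same lines as the R steps described above. This is only a sketch under assumptions: it assumes every value follows one of the two layouts in the excerpt (dd/mm/yyyy h:mm, or ??:?? followed by dd-Mon-yy), that 12:00 is an acceptable default time, and that month abbreviations are parsed in an English locale. The helper name to_timestamp is hypothetical.

```python
from datetime import datetime

import pandas as pd

# Sample values in the two layouts seen in the excerpt (illustrative only).
tmp = pd.Series([
    "21/08/2001 10:20",
    "19/09/2005 9:50",
    "??:?? 04-Jun-01",
])


def to_timestamp(value, default_time="12:00"):
    """Parse one string in either of the two assumed layouts."""
    if value.startswith("??:??"):
        # "??:?? 04-Jun-01" -> "04-Jun-01 12:00": drop the placeholder
        # and append the default time after the date.
        date_part = value.split(" ", 1)[1]
        return datetime.strptime(f"{date_part} {default_time}", "%d-%b-%y %H:%M")
    # "21/08/2001 10:20": day/month/year followed by the time.
    return datetime.strptime(value, "%d/%m/%Y %H:%M")


# Apply the parser element-wise to get one datetime per row.
dates = tmp.apply(to_timestamp)
```

Any value that matches neither layout raises a ValueError here, which may be useful for spotting unexpected formats in the full 150,000 rows.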