I have a DataFrame with a column of string values that I want to convert to a date format. In R I usually break this down into a series of steps: substituting ??:?? with 12:00, stripping the time, appending it to the end, and converting with a POSIX function.

I am trying to replicate that in Python in a more Pythonic manner.

Here is an excerpt from my DataFrame (150,000 rows) as a Series:

index  date
0    21/08/2001 10:20
1     19/09/2005 9:50
2     ??:?? 04-Jun-01
3    16/08/2004 7:15 
4     ??:?? 04-Jan-01 
5     23/01/2001 9:25 
6    24/01/2001 11:16 
7     ??:?? 05-Feb-01 
8     24/01/2001 8:30 
9    24/01/2001 15:15

Here is what I have tried (I have called the excerpt tmp):

I thought I could use a list comprehension and a regular expression replacement as follows:

[re.sub('\\?\\?:\\?\\?', '12:00', tmp) for i in tmp[i]]

What I would like to do is replace ??:?? with 12:00, and then generalise that so I can use it with tmp.apply.

Any suggestions are appreciated.

1 Answer

You can use Series.str.replace():

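import pandas as pd

date = """21/08/2001 10:20
19/09/2005 9:50
??:?? 04-Jun-01
16/08/2004 7:15 
??:?? 04-Jan-01 
23/01/2001 9:25 
24/01/2001 11:16 
??:?? 05-Feb-01 
24/01/2001 8:30 
24/01/2001 15:15""".split("\n")

s = pd.Series(date)
# "?" is a regex metacharacter, so escape it in a raw string.
# Recent pandas versions treat the pattern literally unless regex=True is passed.
# This returns a new Series; s itself is left unchanged.
s.str.replace(r"\?\?:\?\?", "12:00", regex=True)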
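To get from the cleaned strings to actual dates, one possible follow-up is to parse each of the two layouts separately and then combine the results. This is only a sketch: the format strings are inferred from the sample rows in the question, the names cleaned, day_first, time_first and parsed are made up for illustration, and %b assumes English month abbreviations in the active locale.

```python
import pandas as pd

s = pd.Series([
    "21/08/2001 10:20",
    "19/09/2005 9:50",
    "??:?? 04-Jun-01",
    "16/08/2004 7:15 ",
])

# Replace the unknown-time placeholder with noon (literal match, no regex needed)
# and trim the stray trailing whitespace seen in the sample data.
cleaned = s.str.replace("??:??", "12:00", regex=False).str.strip()

# Parse each layout on its own; rows that do not match the format become NaT.
day_first = pd.to_datetime(cleaned, format="%d/%m/%Y %H:%M", errors="coerce")
time_first = pd.to_datetime(cleaned, format="%H:%M %d-%b-%y", errors="coerce")

# Keep the first layout where it parsed, otherwise fall back to the second.
parsed = day_first.fillna(time_first)
```

Any row that matches neither layout stays NaT in parsed, so parsed.isna() can be used to find values that need a closer look. Because everything here is vectorised, it should not need tmp.apply at all.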
answered 2013-03-30T03:33:20.013