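I can't seem to sort a pandas DataFrame first by a column containing strings and then by a datetime column. When I do, the dates come back out of order. What am I doing wrong?
The df looks like: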
                   Date  Field 1
0   2013-07-01 00:00:00        1
1   2013-07-02 00:00:00        1
2   2013-07-03 00:00:00        1
3   2013-07-03 00:00:00        2
4   2013-07-05 00:00:00        2
5   2013-07-05 00:00:00        1
6   2013-07-08 00:00:00        2
7   2013-07-09 00:00:00        2
8   2013-07-11 00:00:00        2
9   2013-07-12 00:00:00        2
10  2013-07-15 00:00:00        1
11  2013-07-16 00:00:00        1
12  2013-07-17 00:00:00        1
13  2013-07-18 00:00:00        1
14  2013-07-19 00:00:00        1
When the DataFrame was created, the Date column had dtype object, and I converted it to datetime using:
df['Date'] = df['Date'].apply(dateutil.parser.parse)
The dtypes are now:
Date datetime64[ns]
Field 1 int64
dtype: object
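For comparison, the same conversion can be sketched with pandas' own `pd.to_datetime`, which converts the whole column at once rather than parsing row by row. The three sample rows below are made up for illustration, and the snippet assumes a reasonably recent pandas release:

```python
import pandas as pd

# Hypothetical sample: dates stored as plain strings (object dtype).
df = pd.DataFrame({
    "Date": ["2013-07-01 00:00:00", "2013-07-15 00:00:00", "2013-07-05 00:00:00"],
    "Field 1": [1, 1, 1],
})

# Convert the whole column to a datetime dtype in one vectorized call.
df["Date"] = pd.to_datetime(df["Date"])

# Inspect the resulting dtypes; "Date" should now be a datetime64 column.
print(df.dtypes)
```

Checking `df.dtypes` after the conversion, as done above, confirms that the column is no longer stored as object.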
When I run
df.sort_index(by=['Field 1', 'Date'])
or
df.sort(['Field 1','Date'])
I get back:
                   Date  Field 1
0   2013-07-01 00:00:00        1
1   2013-07-02 00:00:00        1
2   2013-07-03 00:00:00        1
10  2013-07-15 00:00:00        1
5   2013-07-05 00:00:00        1
11  2013-07-16 00:00:00        1
12  2013-07-17 00:00:00        1
13  2013-07-18 00:00:00        1
14  2013-07-19 00:00:00        1
8   2013-07-11 00:00:00        2
9   2013-07-12 00:00:00        2
3   2013-07-03 00:00:00        2
4   2013-07-05 00:00:00        2
6   2013-07-08 00:00:00        2
7   2013-07-09 00:00:00        2
What I actually want to get back is:
                   Date  Field 1
0   2013-07-01 00:00:00        1
1   2013-07-02 00:00:00        1
2   2013-07-03 00:00:00        1
5   2013-07-05 00:00:00        1
10  2013-07-15 00:00:00        1
11  2013-07-16 00:00:00        1
12  2013-07-17 00:00:00        1
13  2013-07-18 00:00:00        1
14  2013-07-19 00:00:00        1
3   2013-07-03 00:00:00        2
4   2013-07-05 00:00:00        2
6   2013-07-08 00:00:00        2
7   2013-07-09 00:00:00        2
8   2013-07-11 00:00:00        2
9   2013-07-12 00:00:00        2
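For anyone trying to reproduce this on a newer pandas release, here is a minimal sketch of the same two-key sort. It assumes a pandas version where `DataFrame.sort` and `sort_index(by=...)` are no longer available and `sort_values` is used instead. The frame is rebuilt by hand from the rows shown above. This illustrates the desired ordering only and does not by itself explain the misordering seen with the older calls.

```python
import pandas as pd

# Rebuild the sample frame from the question (dates typed in by hand).
dates = [
    "2013-07-01", "2013-07-02", "2013-07-03", "2013-07-03", "2013-07-05",
    "2013-07-05", "2013-07-08", "2013-07-09", "2013-07-11", "2013-07-12",
    "2013-07-15", "2013-07-16", "2013-07-17", "2013-07-18", "2013-07-19",
]
field = [1, 1, 1, 2, 2, 1, 2, 2, 2, 2, 1, 1, 1, 1, 1]

df = pd.DataFrame({"Date": pd.to_datetime(dates), "Field 1": field})

# Sort by "Field 1" first, then by "Date" within each "Field 1" value.
result = df.sort_values(by=["Field 1", "Date"])

print(result)
```

With both keys passed in a single list, rows are ordered by `Field 1` and ties are broken by `Date`, which is the ordering requested above.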