I have a MultiIndex dataframe (the real one has more columns):

                    2020-12-22 09:47:50          2020-12-23 16:43:45     2020-12-22 15:00 
Lines VehicleNumber                                          
102   9405                            3                      NaN             3
      9415                          NaN                       NaN           NaN
      9416                          NaN                      NaN            NaN

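Now I want to sort the columns so that the earliest date is the first column and the latest date is the last column. After that, I want to drop the columns that do not fall between two dates, say 2020-12-22 10:00:00 < date < 2020-12-23 10:00:00. I tried transposing the dataframe, but that does not seem to work when I have a MultiIndex.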

Expected output:

                         2020-12-22 15:00         2020-12-23 16:43:45   
Lines VehicleNumber                                          
102   9405                            3                      NaN         
      9415                          NaN                      NaN        
      9416                          NaN                      NaN        

So first we sort the columns by date, then check whether each one falls between the two dates (2020-12-22 10:00:00 < date < 2020-12-23 10:00:00), and as a result one column is dropped.

1 Answer

First, convert the str column labels to datetimes:

In [2244]: df.columns = pd.to_datetime(df.columns)

Then sort df by those datetimes:

In [2246]: df = df.reindex(sorted(df.columns), axis=1)

Suppose you only want to keep the columns later than the following timestamp:

In [2251]: x = '2020-12-22 10:00:00'

Use a list comprehension:

In [2257]: m = [i for i in df.columns if i > pd.to_datetime(x)]

In [2258]: df[m]
Out[2258]: 
                     2020-12-22 15:00:00  2020-12-23 16:43:45
Lines VehicleNumber                                          
102   9405.0                         3.0                  NaN
9415  NaN                            NaN                  NaN
9416  NaN                            NaN                  NaN
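
The question also asks for an upper bound, which the steps above do not apply. A minimal sketch of one way to do both bounds with a boolean mask on the columns follows. The sample frame is a hypothetical reconstruction of the one in the question, the names lower, upper, mask and result are illustrative, and sort_index(axis=1) is swapped in for the reindex call as an equivalent way to sort the columns.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the frame from the question.
index = pd.MultiIndex.from_tuples(
    [(102, 9405), (102, 9415), (102, 9416)],
    names=["Lines", "VehicleNumber"],
)
df = pd.DataFrame(
    [[3, np.nan, 3], [np.nan] * 3, [np.nan] * 3],
    index=index,
    columns=["2020-12-22 09:47:50", "2020-12-23 16:43:45", "2020-12-22 15:00"],
)

# Parse each label on its own, so labels with and without seconds both work.
df.columns = pd.DatetimeIndex([pd.Timestamp(c) for c in df.columns])

# Sort the columns chronologically; the row MultiIndex is left untouched.
df = df.sort_index(axis=1)

# Keep only the columns strictly between the two bounds.
lower = pd.Timestamp("2020-12-22 10:00:00")
upper = pd.Timestamp("2020-12-23 10:00:00")
mask = (df.columns > lower) & (df.columns < upper)
result = df.loc[:, mask]

print(result)
```

With these strict bounds only the 2020-12-22 15:00:00 column remains, because 2020-12-23 16:43:45 is later than the upper bound. If that column should be kept, as in the expected output in the question, the upper bound would need to be later.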
answered 2020-12-24T14:22:49.063