Below is my 1-minute OHLC data.

2011-11-01,9:00:00,248.50,248.95,248.20,248.70
2011-11-01,9:01:00,248.70,249.00,248.65,248.85
2011-11-01,9:02:00,248.90,249.25,248.70,249.15
...
2011-11-01,15:03:00,250.25,250.30,250.05,250.15
2011-11-01,15:04:00,250.15,250.60,250.10,250.60
2011-11-01,15:15:00,250.55,250.55,250.55,250.55
2011-11-02,9:00:00,245.55,246.25,245.40,245.80
2011-11-02,9:01:00,245.85,246.40,245.75,246.35
2011-11-02,9:02:00,246.30,246.45,245.75,245.80
2011-11-02,9:03:00,245.75,245.85,245.30,245.35
...

I loaded the data, and this is what it looks like:

                          2       3       4       5
0_1                                                                    
2011-11-01 09:00:00  248.50  248.95  248.20  248.70
2011-11-01 09:01:00  248.70  249.00  248.65  248.85
2011-11-01 09:02:00  248.90  249.25  248.70  249.15
2011-11-01 09:03:00  249.20  249.60  249.10  249.60
2011-11-01 09:04:00  249.55  249.95  249.50  249.60

In order to use groupby, I want to add 4 columns, like this:

                          2       3       4       5    year month day time
0_1                                                                    
2011-11-01 09:00:00  248.50  248.95  248.20  248.70       0      0  0    0
2011-11-01 09:01:00  248.70  249.00  248.65  248.85       0      0  0    1
2011-11-01 09:02:00  248.90  249.25  248.70  249.15       0      0  0    2
2011-11-01 09:03:00  249.20  249.60  249.10  249.60       0      0  0    3
2011-11-01 09:04:00  249.55  249.95  249.50  249.60       0      0  0    4
....
2011-11-02 09:00:00  248.50  248.95  248.20  248.70       0      0  1    0
2011-11-02 09:01:00  248.70  249.00  248.65  248.85       0      0  1    1
2011-11-02 09:02:00  248.90  249.25  248.70  249.15       0      0  1    2
2011-11-02 09:03:00  249.20  249.60  249.10  249.60       0      0  1    3
2011-11-02 09:04:00  249.55  249.95  249.50  249.60       0      0  1    4

How can I add index columns like these?

Thanks in advance.

1 Answer

You can do this with the relativedelta function from the dateutil library:

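import pandas as pd
from dateutil.relativedelta import relativedelta

# df is the DataFrame from the question, indexed by timestamp
start = df.index[0]

def func(item):
    # offset of each timestamp from the first row, split into components
    delta = relativedelta(item, start)
    return (delta.years, delta.months, delta.days)

>>> pd.DataFrame(list(df.index.map(func)),
                 index=df.index, columns=['year', 'month', 'day'])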

                     year  month  day
0_1                                  
2011-11-01 09:00:00     0      0    0
2011-11-01 09:01:00     0      0    0
2011-11-01 09:02:00     0      0    0
2011-11-01 15:03:00     0      0    0
2011-11-01 15:04:00     0      0    0
2011-11-01 15:15:00     0      0    0
2011-11-02 09:00:00     0      0    1
2011-11-02 09:01:00     0      0    1
2011-11-02 09:02:00     0      0    1
2011-11-02 09:03:00     0      0    1

After this, you can merge it with the original DataFrame on the index.
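
As a minimal sketch of that merge step: the sample frame below is a small stand-in built from a few rows of the question's data, and the names `offsets` and `result` are only illustrative. Because both frames share the same index, `DataFrame.join` lines them up row by row.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

# Stand-in for the question's data (column labels 2..5 as printed in the post).
index = pd.to_datetime([
    "2011-11-01 09:00:00",
    "2011-11-01 09:01:00",
    "2011-11-02 09:00:00",
    "2011-11-02 09:01:00",
])
df = pd.DataFrame(
    {
        2: [248.50, 248.70, 245.55, 245.85],
        3: [248.95, 249.00, 246.25, 246.40],
        4: [248.20, 248.65, 245.40, 245.75],
        5: [248.70, 248.85, 245.80, 246.35],
    },
    index=index,
)

start = df.index[0]

def func(item):
    # offset of each timestamp from the first row, split into components
    delta = relativedelta(item, start)
    return (delta.years, delta.months, delta.days)

# Build the offset columns on the same index as df ...
offsets = pd.DataFrame(
    [func(ts) for ts in df.index],
    index=df.index,
    columns=["year", "month", "day"],
)

# ... then join on the shared index.
result = df.join(offsets)
print(result)
```

Note that relativedelta counts whole elapsed periods from `start`, so a row on the next calendar day but earlier in the day than `start` would still get day 0. With data that always begins at the same opening time, as in the question, this should not come up.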

I'm not sure what the time column is supposed to represent. Minutes?
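
If it is, there are at least two readings that both match the expected output in the question, and this rough sketch shows each. The column names `bar` and `minute` are made up for illustration, and the sample timestamps deliberately contain gaps so the two readings differ.

```python
import pandas as pd

# Stand-in index with missing minutes, so the two interpretations diverge.
index = pd.to_datetime([
    "2011-11-01 09:00:00",
    "2011-11-01 09:01:00",
    "2011-11-01 09:03:00",
    "2011-11-02 09:00:00",
    "2011-11-02 09:02:00",
])
df = pd.DataFrame({5: [248.70, 248.85, 249.60, 245.80, 245.80]}, index=index)

# Reading 1: running row counter that restarts on each calendar day.
df["bar"] = df.groupby(df.index.normalize()).cumcount()

# Reading 2: minutes elapsed since the first row of each calendar day.
stamps = df.index.to_series()
first_of_day = stamps.groupby(stamps.dt.normalize()).transform("min")
df["minute"] = ((stamps - first_of_day).dt.total_seconds() // 60).astype(int)

print(df)
```

With complete 1-minute data the two columns are identical; they differ only where minutes are missing, such as the jump from 15:04 to 15:15 in the question's sample.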

answered 2013-09-15T15:50:25.703