I'm trying to resample some tick data into 1-minute blocks. The code appears to work, but the resulting dataframe has the dates in the wrong order. Below is what the data looks like before resampling:

                    Var2    Var3    Var4    Var5    Var6    Var7    Var8    Var9    Var10
2020-06-30 17:00:00 41.68   2   tptBid  tctRegular  NaN 255 NaN 0   msNormal
2020-06-30 17:00:00 41.71   3   tptAsk  tctRegular  NaN 255 NaN 0   msNormal
2020-06-30 17:00:00 41.68   1   tptTrade  tctRegular  NaN 255 NaN 0   msNormal
2020-06-30 17:00:00 41.71   5   tptAsk  tctRegular  NaN 255 NaN 0   msNormal
2020-06-30 17:00:00 41.71   8   tptAsk  tctRegular  NaN 255 NaN 0   msNormal
... ... ... ... ... ... ... ... ... ...
2020-01-07 17:00:21 41.94   5   tptBid  tctRegular  NaN 255 NaN 0   msNormal
2020-01-07 17:00:27 41.94   4   tptBid  tctRegular  NaN 255 NaN 0   msNormal
2020-01-07 17:00:40 41.94   3   tptBid  tctRegular  NaN 255 NaN 0   msNormal
2020-01-07 17:00:46 41.94   4   tptBid  tctRegular  NaN 255 NaN 0   msNormal
2020-01-07 17:00:50 41.94   3   tptBid  tctRegular  NaN 255 NaN 0   msNormal

As you can see, the data starts at 5pm on the 30th of June. Then I use this code:

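import pandas as pd
# df is the tick dataframe shown above, indexed by timestamp.
# Assumed: one_minute_dataframe starts out as an empty dataframe.
one_minute_dataframe = pd.DataFrame()
one_minute_dataframe['Price'] = df.Var2.resample('1min').last()   # last price in each minute
one_minute_dataframe['Volume'] = df.Var3.resample('1min').sum()   # total volume in each minute
one_minute_dataframe.index = pd.to_datetime(one_minute_dataframe.index)
one_minute_dataframe.sort_index(inplace=True)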

And I get the following:

                    Price   Volume
2020-01-07 00:00:00 41.73   416
2020-01-07 00:01:00 41.74   198
2020-01-07 00:02:00 41.76   40
2020-01-07 00:03:00 41.74   166
2020-01-07 00:04:00 41.77   143
... ... ...
2020-06-30 23:55:00 41.75   127
2020-06-30 23:56:00 41.74   234
2020-06-30 23:57:00 41.76   344
2020-06-30 23:58:00 41.72   354
2020-06-30 23:59:00 41.74   451

It seems to want to start from midnight on the 1st of July. I've tried sorting the index, but the order still does not change.

Also, the datetime index seems to add lots more dates outside the ones that were originally in the dataframe and plonks them in the middle of the resampled one.

Any help would be great. Apologies if I've set this out poorly.


1 Answer


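I worked out what was happening. Somewhere in the data download, the month and day had been swapped. That is why it puts July first: it reads those dates as January.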
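For anyone who hits the same thing, here is a minimal sketch of one way to avoid the swap, assuming the raw file stores timestamps day-first as DD/MM/YYYY. The question does not show the real file format, so the sample strings and the format string below are hypothetical; the column names Var2 and Var3 are taken from the question. Passing an explicit format to pd.to_datetime means pandas does not have to guess which field is the month. The same sketch also drops the empty one-minute bins that resample creates between the first and last tick, which is probably where the extra dates in the question come from.

```python
import pandas as pd

# Hypothetical raw timestamps in a day-first export (DD/MM/YYYY).
# The real file's format is not shown in the question.
raw = ["30/06/2020 17:00:00", "30/06/2020 17:00:30", "01/07/2020 17:00:21"]
prices = [41.68, 41.71, 41.94]
volumes = [2, 3, 5]

# An explicit format removes any guessing about which field is the month.
index = pd.to_datetime(raw, format="%d/%m/%Y %H:%M:%S")
df = pd.DataFrame({"Var2": prices, "Var3": volumes}, index=index).sort_index()

one_minute = pd.DataFrame({
    "Price": df["Var2"].resample("1min").last(),
    "Volume": df["Var3"].resample("1min").sum(),
})

# resample() emits every minute between the first and last tick,
# so drop the bins that contained no ticks at all.
one_minute = one_minute.dropna(subset=["Price"])
print(one_minute)
```

If the dataframe has already been built with swapped dates, it is probably safer to re-read the raw file with the right format than to try to repair the index afterwards, since a date such as 2020-01-07 cannot be told apart from a genuine 7 January.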

Answered 2020-08-04T12:33:20.153