Given the following pandas DataFrame with 60 elements.

import pandas as pd
data = [60,62.75,73.28,75.77,70.28
    ,67.85,74.58,72.91,68.33,78.59
    ,75.58,78.93,74.61,85.3,84.63
    ,84.61,87.76,95.02,98.83,92.44
    ,84.8,89.51,90.25,93.82,86.64
    ,77.84,76.06,77.75,72.13,80.2
    ,79.05,76.11,80.28,76.38,73.3
    ,72.28,77,69.28,71.31,79.25
    ,75.11,73.16,78.91,84.78,85.17
    ,91.53,94.85,87.79,97.92,92.88
    ,91.92,88.32,81.49,88.67,91.46
    ,91.71,82.17,93.05,103.98,105]

data_pd = pd.DataFrame(data, columns=["price"])

Is there a formula that could rescale it so that, for each window larger than 20 elements running from index 0 to index i+1, the data is rescaled down to 20 elements?

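Here is a loop that builds the windows with the data to be rescaled; I just don't know of any way to do the rescaling itself for this problem. Any suggestions on how this might be done?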

miniLenght = 20
rescaledData = []
for i in range(len(data_pd)):
    if(i >= miniLenght):
        dataForScaling = data_pd[0:i]
        scaledDataToMinLenght = dataForScaling #do the scaling here so that the length of the rescaled data is always equal to miniLenght
        rescaledData.append(scaledDataToMinLenght)

Basically, after the rescaling, rescaledData should hold 40 arrays, each with a length of 20 prices.

1 Answer

From reading this, it looks like you want to resize the list back down to 20 indices and then interpolate the data at those 20 indices.

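We'll build the indices the way they do (range(0, len(large), step = len(large)/miniLenght)) and then use numpy's interp. There are a million ways of interpolating data; np.interp uses linear interpolation, so if you ask for, say, index 1.5, you get the mean of points 1 and 2, and so on.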
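To make the "index 1.5" remark concrete, here is a minimal sketch using the first four prices from the question. The variable names (`values`, `positions`, `halfway`, `near_one`) are made up for this illustration and are not part of the original code.

```python
import numpy as np

# First four prices from the question's data.
values = np.array([60.0, 62.75, 73.28, 75.77])
positions = np.arange(len(values))  # integer positions 0, 1, 2, 3

# Fractional index 1.5 lies halfway between positions 1 and 2,
# so linear interpolation returns the mean of those two prices.
halfway = np.interp(1.5, positions, values)
midpoint = (values[1] + values[2]) / 2

# Index 1.05 lies 5% of the way from position 1 to position 2.
near_one = np.interp(1.05, positions, values)

print(halfway, midpoint, near_one)
```

The value at index 1.05 is the same kind of weighted blend that the loop further down produces for every window longer than 20 elements.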

So, here's a quick modification of your code to do that (note: we could probably fully vectorize this using 'rolling'):

import numpy as np
miniLenght = 20
rescaledData = []

for i in range(len(data_pd)):
    if(i >= miniLenght):
        dataForScaling = data_pd['price'][0:i]
        #figure out how many 'steps' we have
        steps = len(dataForScaling)
        #make indices where the data needs to be sliced to get 20 points
        indices = np.arange(0,steps, step = steps/miniLenght)
        #use np.interp at those points, with the original values as given
        rescaledData.append(np.interp(indices, np.arange(steps), dataForScaling))

The output is as expected:

[array([ 60.  ,  62.75,  73.28,  75.77,  70.28,  67.85,  74.58,  72.91,
         68.33,  78.59,  75.58,  78.93,  74.61,  85.3 ,  84.63,  84.61,
         87.76,  95.02,  98.83,  92.44]),
 array([ 60.    ,  63.2765,  73.529 ,  74.9465,  69.794 ,  69.5325,
         74.079 ,  71.307 ,  72.434 ,  77.2355,  77.255 ,  76.554 ,
         81.024 ,  84.8645,  84.616 ,  86.9725,  93.568 ,  98.2585,
         93.079 ,  85.182 ]),.....
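One possible refinement, offered as a sketch rather than as part of the original answer: NumPy's documentation advises against `np.arange` with a non-integer step, because floating-point rounding may occasionally change the number of elements returned. Using `np.linspace` with `endpoint=False` asks for exactly 20 positions directly. The helper name `resample_to_length` and the synthetic `prices` ramp below are assumptions made for this example; in practice you would pass `data_pd['price']` instead.

```python
import numpy as np

def resample_to_length(values, target_len=20):
    """Linearly resample a 1-D sequence down to target_len points."""
    values = np.asarray(values, dtype=float)
    steps = len(values)
    # Exactly target_len evenly spaced fractional indices in [0, steps).
    indices = np.linspace(0, steps, num=target_len, endpoint=False)
    return np.interp(indices, np.arange(steps), values)

# Hypothetical stand-in for the 60 prices: a simple ramp 0, 1, ..., 59.
prices = np.arange(60, dtype=float)

# Same windows as the loop above: prices[0:i] for i = 20 .. 59.
rescaled = [resample_to_length(prices[:i]) for i in range(20, len(prices))]

print(len(rescaled), len(rescaled[0]), len(rescaled[-1]))
```

Every window comes back with 20 values regardless of its original length, and the first window (already 20 long) is returned unchanged.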
answered 2017-07-16T22:22:05.987