I have the following data and am trying the code below:

Name    Sensex_index    Start_Date       End_Date
AAA        0.5           20/08/2016    25/09/2016 
AAA        0.8           26/08/2016    29/08/2016 
AAA        0.4           30/08/2016    31/08/2016
AAA        0.9           01/09/2016    05/09/2016
AAA        0.5           12/09/2016    22/09/2016
AAA        0.3           24/09/2016    29/09/2016
ABC        0.9           01/01/2017    15/01/2017
ABC        0.5           23/01/2017    30/01/2017
ABC        0.7           02/02/2017    15/03/2017

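What I am trying to do: within the same Name, when the Sensex index has fallen from a higher value and then rises again, the period terminates at the row before the rise, and that row's End_Date becomes the termination date. For example, I am looking for the output below, which derives the actual start and termination dates from this kind of data.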

Name   Sensex_index  Actual_Start      Termination_Date 
AAA        0.5        20/08/2016          31/08/2016
AAA        0.8        20/08/2016          31/08/2016
AAA        0.4        20/08/2016          31/08/2016 [high to low; low to high,terminate]
AAA        0.9        01/09/2016          29/09/2016
AAA        0.5        01/09/2016          29/09/2016      
AAA        0.3        01/09/2016          29/09/2016 [end of AAA]
ABC        0.9        01/01/2017          30/01/2017  
ABC        0.5        01/01/2017          30/01/2017 [high to low; low to high,terminate]
ABC        0.7        02/02/2017          15/03/2017 [end of ABC]

I am using the following code, which used to work, but now it raises an index error:

#Find the rows where price change from high to low and then to high
df['change'] = df.groupby('Name')['Sensex_index'].apply(lambda x: x.rolling(3,center=True).apply(lambda y: True if (y[1]<y[0] and y[1]<y[2]) else False))
#Find the last row for each name
df.iloc[df.groupby('Name')['change'].tail(1).index, -1] = 1.0        
#Set End_Date as Termination_Date for those changing points
df['Termination_Date'] = df.apply(lambda x: x.End_Date if x.change>0 else np.nan, axis=1)
#Set Actual_Start
df['Actual_Start'] = df.apply(lambda x: x.Start_Date if (x.name == 0
                                                         or x.Name != df.iloc[x.name-1]['Name']
                                                         or df.iloc[x.name-1]['change'] > 0)
                              else np.nan, axis=1)
#back fill the Termination_Date for other rows.
df.Termination_Date.fillna(method='bfill', inplace=True)
#forward fill the Actual_Start for other rows.
df.Actual_Start.fillna(method='ffill', inplace=True)
print(df)

I get the following error:

 File "/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1554, in _is_valid_list_like
raise IndexError("positional indexers are out-of-bounds")

Index error!

IndexError: positional indexers are out-of-bounds

1 Answer

You probably overwrote your df somewhere:

import io
import pandas as pd

tsv = """Name    Sensex_index    Start_Date       End_Date
AAA        0.5           20/08/2016    25/09/2016 
AAA        0.8           26/08/2016    29/08/2016 
AAA        0.4           30/08/2016    31/08/2016
AAA        0.9           01/09/2016    05/09/2016
AAA        0.5           12/09/2016    22/09/2016
AAA        0.3           24/09/2016    29/09/2016
ABC        0.9           01/01/2017    15/01/2017
ABC        0.5           23/01/2017    30/01/2017
ABC        0.7           02/02/2017    15/03/2017
"""

# raw string so that \s is not treated as a Python string escape
df = pd.read_table(io.StringIO(tsv), sep=r"\s+")

I then copy-pasted your code and got no error, just this df:

  Name  Sensex_index  Start_Date    End_Date  change Termination_Date  \
0  AAA           0.5  20/08/2016  25/09/2016     NaN       31/08/2016   
1  AAA           0.8  26/08/2016  29/08/2016     0.0       31/08/2016   
2  AAA           0.4  30/08/2016  31/08/2016     1.0       31/08/2016   
3  AAA           0.9  01/09/2016  05/09/2016     0.0       29/09/2016   
4  AAA           0.5  12/09/2016  22/09/2016     0.0       29/09/2016   
5  AAA           0.3  24/09/2016  29/09/2016     1.0       29/09/2016   
6  ABC           0.9  01/01/2017  15/01/2017     NaN       30/01/2017   
7  ABC           0.5  23/01/2017  30/01/2017     1.0       30/01/2017   
8  ABC           0.7  02/02/2017  15/03/2017     1.0       15/03/2017   

  Actual_Start  
0   20/08/2016  
1   20/08/2016  
2   20/08/2016  
3   01/09/2016  
4   01/09/2016  
5   01/09/2016  
6   01/01/2017  
7   01/01/2017  
8   02/02/2017
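
One way this error can come about (an assumption on my part, since the failing session is not shown) is that df no longer has the default 0..n-1 index, for example after filtering or slicing. `tail(1).index` returns index labels, while `iloc` expects positions, so labels larger than the number of rows are out of bounds. A minimal sketch, with a shortened made-up frame and the hypothetical names `shifted`, `by_label` and `by_position`:

```python
import io
import pandas as pd

tsv = """Name Sensex_index
AAA 0.5
AAA 0.8
AAA 0.4
ABC 0.9
ABC 0.5
"""
df = pd.read_csv(io.StringIO(tsv), sep=r"\s+")
df["change"] = 0.0

# Assumption: simulate a frame whose index no longer starts at 0.
shifted = df.copy()
shifted.index = shifted.index + 10

# Labels of the last row of each Name: 12 and 14 here.
last_labels = shifted.groupby("Name")["change"].tail(1).index

try:
    # Labels are used as positions; the frame only has 5 rows.
    shifted.iloc[last_labels, -1] = 1.0
    error = None
except IndexError as exc:
    error = exc

# Option 1: select by label instead of by position.
by_label = shifted.copy()
by_label.loc[last_labels, "change"] = 1.0

# Option 2: restore the default index so labels and positions agree again.
by_position = shifted.reset_index(drop=True)
by_position.iloc[by_position.groupby("Name")["change"].tail(1).index, -1] = 1.0
```

Either option marks the last row of each Name; which one fits depends on whether the rest of your code also relies on the index being 0..n-1 (the `Actual_Start` lambda in the question does).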

Just recreate your DataFrame and you should be fine.
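
If you would rather not depend on the index at all, the same columns can probably be computed without `iloc` or a row-wise `apply`. This is only a sketch of an alternative, not the code from the question: it assumes rows of the same Name are contiguous and already in date order, and the helper names (`is_local_min`, `is_last`, `segment`) are mine.

```python
import io
import pandas as pd

tsv = """Name Sensex_index Start_Date End_Date
AAA 0.5 20/08/2016 25/09/2016
AAA 0.8 26/08/2016 29/08/2016
AAA 0.4 30/08/2016 31/08/2016
AAA 0.9 01/09/2016 05/09/2016
AAA 0.5 12/09/2016 22/09/2016
AAA 0.3 24/09/2016 29/09/2016
ABC 0.9 01/01/2017 15/01/2017
ABC 0.5 23/01/2017 30/01/2017
ABC 0.7 02/02/2017 15/03/2017
"""
df = pd.read_csv(io.StringIO(tsv), sep=r"\s+")

# Previous and next index value within the same Name (NaN at group edges).
grouped = df.groupby("Name")["Sensex_index"]
prev_val = grouped.shift(1)
next_val = grouped.shift(-1)

# A row ends a period if it is a local minimum (high to low, then low to high)
# or if it is the last row of its Name.
is_local_min = (df["Sensex_index"] < prev_val) & (df["Sensex_index"] < next_val)
is_last = ~df["Name"].duplicated(keep="last")
period_end = is_local_min | is_last

# The row after each period end starts a new period; number the periods.
segment = period_end.shift(1, fill_value=False).cumsum()

# First Start_Date and last End_Date of each period, broadcast to its rows.
df["Actual_Start"] = df.groupby(segment)["Start_Date"].transform("first")
df["Termination_Date"] = df.groupby(segment)["End_Date"].transform("last")
print(df[["Name", "Sensex_index", "Actual_Start", "Termination_Date"]])
```

Because every step works on aligned Series rather than on positions, this should behave the same whether or not df has the default index.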

Answered 2017-07-07T15:35:00.380