First, using apply, you can add a column with the signed shares (positive for Buy, negative for Sell):
In [11]: df['signed_shares'] = df.apply(lambda row: row['nr_shares']
                                                    if row['transaction'] == 'Buy'
                                                    else -row['nr_shares'],
                                        axis=1)
In [12]: df
Out[12]:
            year  month  day symbol transaction  nr_shares  signed_shares
index
2011-01-10  2011      1   10   AAPL         Buy       1500           1500
2011-01-13  2011      1   13   GOOG        Sell       1000          -1000
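For anyone who wants to reproduce this step outside the session, here is a minimal self-contained sketch. The frame construction is an assumption on my part (the original df came from the question's data), and the np.where line is a vectorized alternative I am swapping in, not what In [11] does; it should give the same values without calling a Python function per row.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the two-row frame shown in Out[12].
df = pd.DataFrame(
    {
        "year": [2011, 2011],
        "month": [1, 1],
        "day": [10, 13],
        "symbol": ["AAPL", "GOOG"],
        "transaction": ["Buy", "Sell"],
        "nr_shares": [1500, 1000],
    },
    index=pd.DatetimeIndex(["2011-01-10", "2011-01-13"], name="index"),
)

# Row-wise version, same logic as In [11].
df["signed_shares"] = df.apply(
    lambda row: row["nr_shares"] if row["transaction"] == "Buy" else -row["nr_shares"],
    axis=1,
)

# Vectorized alternative (not from the original answer): pick +shares or -shares per row.
signed_vectorized = np.where(
    df["transaction"] == "Buy", df["nr_shares"], -df["nr_shares"]
)
```

On a frame with many rows the vectorized form is usually noticeably faster, but for a handful of transactions either is fine.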
Use only the columns you are interested in and unstack them:
In [13]: df[['symbol', 'signed_shares']].set_index('symbol', append=True)
Out[13]:
                   signed_shares
index      symbol
2011-01-10 AAPL             1500
2011-01-13 GOOG            -1000
In [14]: a = df[['symbol', 'signed_shares']].set_index('symbol', append=True).unstack()
In [15]: a
Out[15]:
            signed_shares
symbol               AAPL   GOOG
index
2011-01-10           1500    NaN
2011-01-13            NaN  -1000
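The set_index/unstack step can be sketched as a standalone script like this. The small frame is an assumed stand-in for df with only the two relevant columns. The second variant, which unstacks the Series instead of the one-column DataFrame, is my own addition; it avoids the extra 'signed_shares' level in the columns.

```python
import pandas as pd

# Assumed stand-in for the relevant part of df.
df = pd.DataFrame(
    {"symbol": ["AAPL", "GOOG"], "signed_shares": [1500, -1000]},
    index=pd.DatetimeIndex(["2011-01-10", "2011-01-13"], name="index"),
)

# Same as In [14]: move 'symbol' into the index, then pivot it out to the columns.
a = df[["symbol", "signed_shares"]].set_index("symbol", append=True).unstack()

# Variant (not in the original answer): unstacking the Series gives flat columns.
a_flat = df.set_index("symbol", append=True)["signed_shares"].unstack()
```

Note that unstack needs each (date, symbol) pair to be unique; with two trades of the same symbol on the same day it raises an error about duplicate entries.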
Reindex over any date range you like:
In [16]: rng = pd.date_range('2011-01-10', periods=4)
In [17]: a.reindex(rng)
Out[17]:
            signed_shares
symbol               AAPL   GOOG
2011-01-10           1500    NaN
2011-01-11            NaN    NaN
2011-01-12            NaN    NaN
2011-01-13            NaN  -1000
Finally, fill the NaN with 0 using fillna:
In [18]: a.reindex(rng).fillna(0)
Out[18]:
            signed_shares
symbol               AAPL   GOOG
2011-01-10           1500      0
2011-01-11              0      0
2011-01-12              0      0
2011-01-13              0  -1000
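The reindex and fillna steps can be tried in isolation with the sketch below. The frame a here is a simplified stand-in I am assuming (flat columns rather than the two-level columns above). It also shows why fillna is still needed: passing fill_value to reindex only fills the rows that reindex introduces, so NaN that were already in the frame stay NaN.

```python
import numpy as np
import pandas as pd

# Simplified, assumed stand-in for the unstacked frame `a` (flat columns).
a = pd.DataFrame(
    {"AAPL": [1500, np.nan], "GOOG": [np.nan, -1000]},
    index=pd.DatetimeIndex(["2011-01-10", "2011-01-13"], name="index"),
)

rng = pd.date_range("2011-01-10", periods=4)

# Same as In [18]: add the missing dates, then replace every NaN with 0.
filled = a.reindex(rng).fillna(0)

# fill_value alone covers only the newly added dates;
# the NaN that were already present on 2011-01-10 and 2011-01-13 remain.
partial = a.reindex(rng, fill_value=0)
```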
As @DSM points out, you can do [13]-[15] better using pivot_table:
In [20]: df.reset_index().pivot_table('signed_shares', 'index', 'symbol')
Out[20]:
symbol      AAPL   GOOG
index
2011-01-10  1500    NaN
2011-01-13   NaN  -1000
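One thing worth keeping in mind with pivot_table: its default aggfunc is the mean, so if the same symbol is traded twice on one day the two quantities get averaged. Here is a sketch with keyword arguments and aggfunc='sum', which is probably what you want for share counts. The extra AAPL row is hypothetical data I added only to show the difference.

```python
import pandas as pd

# Assumed example data; the second AAPL trade on 2011-01-10 is hypothetical.
df = pd.DataFrame(
    {
        "symbol": ["AAPL", "AAPL", "GOOG"],
        "signed_shares": [1500, 500, -1000],
    },
    index=pd.DatetimeIndex(
        ["2011-01-10", "2011-01-10", "2011-01-13"], name="index"
    ),
)

# Same call as In [20], spelled with keywords, summing duplicates per (date, symbol).
pivoted = df.reset_index().pivot_table(
    values="signed_shares", index="index", columns="symbol", aggfunc="sum"
)

# Default behaviour for comparison: duplicates are averaged.
averaged = df.reset_index().pivot_table(
    values="signed_shares", index="index", columns="symbol"
)
```

From there the reindex and fillna steps work the same way as before.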