pivot_table works, but I used the shorthand version for readability.
import pandas as pd

# Sample rows: query string, visit count, date
data = [['nucleus', 1790, '2012-10-01 00:00:00'],
        ['neuron', 364, '2012-10-02 00:00:00'],
        ['current', 280, '2012-10-02 00:00:00'],
        ['molecular', 259, '2012-10-02 00:00:00'],
        ['stem', 201, '2012-10-02 00:00:00']]
df = pd.DataFrame(data, columns=['q_string', 'q_visits', 'q_date'])
    q_string  q_visits               q_date
0    nucleus      1790  2012-10-01 00:00:00
1     neuron       364  2012-10-02 00:00:00
2    current       280  2012-10-02 00:00:00
3  molecular       259  2012-10-02 00:00:00
4       stem       201  2012-10-02 00:00:00
Assign both q_string and q_date to the index:
df.set_index(['q_string', 'q_date'], inplace=True)
The index now looks like this:
MultiIndex(levels=[['current', 'molecular', 'neuron', 'nucleus', 'stem'],
                   ['2012-10-01 00:00:00', '2012-10-02 00:00:00']],
           labels=[[3, 2, 0, 1, 4], [0, 1, 1, 1, 1]],
           names=['q_string', 'q_date'])
Now that q_string and q_date are both index levels of the data, we can simply call unstack() to move q_date (the innermost level) into the columns.
df.unstack()
                     q_visits
q_date    2012-10-01 00:00:00 2012-10-02 00:00:00
q_string
current                   NaN               280.0
molecular                 NaN               259.0
neuron                    NaN               364.0
nucleus                1790.0                 NaN
stem                      NaN               201.0
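
For comparison, here is a minimal sketch that puts the set_index/unstack shorthand next to the pivot_table call mentioned at the top. It rebuilds the same sample data so it runs on its own. The names wide_unstack and wide_pivot are only illustrative, and selecting the q_visits column before unstacking is my own variation (it gives flat column labels instead of the two-level header shown above).

```python
import pandas as pd

data = [['nucleus', 1790, '2012-10-01 00:00:00'],
        ['neuron', 364, '2012-10-02 00:00:00'],
        ['current', 280, '2012-10-02 00:00:00'],
        ['molecular', 259, '2012-10-02 00:00:00'],
        ['stem', 201, '2012-10-02 00:00:00']]
df = pd.DataFrame(data, columns=['q_string', 'q_visits', 'q_date'])

# Shorthand: index by both keys, keep only q_visits, then move the
# innermost index level (q_date) into the columns.
# Note: no inplace=True here, so df itself is left unchanged.
wide_unstack = df.set_index(['q_string', 'q_date'])['q_visits'].unstack()

# Longer spelling with pivot_table. Its default aggregation is the mean,
# which changes nothing here because each (q_string, q_date) pair is unique.
wide_pivot = df.pivot_table(index='q_string', columns='q_date', values='q_visits')

print(wide_unstack)
print(wide_pivot)
```

Both results have one row per q_string and one column per q_date, with NaN where a pair has no visits. If zeros are preferable to NaN, unstack(fill_value=0) is one option.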