python - Pivot table to expanded data frame in Python/Pandas

Question

I want to build on my previous question.

Let's look at some Python code.

import numpy as np
import pandas as pd
mat = np.array([[1,2,3],[4,5,6]])
df_mat = pd.DataFrame(mat)
df_mat_tidy = (df_mat.stack()
                    .rename_axis(index = ['V1','V2'])
                    .rename('value')
                    .reset_index()
                    .reindex(columns = ['value','V1','V2']))
df_mat_tidy

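This takes me from a pivot table (mat) to a "tidy" (in the Tidyverse sense) version of the data, with one variable for the column the number came from, one variable for the row the number came from, and one variable for the number found at that row-column position in the pivot table.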

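Now I want to extend this so that each row-column pair is repeated the number of times given by the pivot table. In other words, if position 1,1 has the value 3 and position 2,1 has the value 4, I want the data frame to contain the 1,1 pair three times and the 2,1 pair four times

instead of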

col row value
 1   1    3
 1   2    4

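I think I know how to loop over the rows of the second example and generate this, but I want something faster.

Is there a way to "melt" a pivot table in the way I described?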

score 0 · Accepted Answer

You can rebuild the DataFrame from a comprehension:

pd.DataFrame([i for j in [[[rec['V1'], rec['V2']]] * rec['value']
                  for rec in df_mat_tidy.to_dict(orient='records')]
          for i in j], columns=['col', 'row'])

It gives, as expected:

    col  row
0     0    0
1     0    1
2     0    1
3     0    2
4     0    2
5     0    2
6     1    0
7     1    0
8     1    0
9     1    0
10    1    1
11    1    1
12    1    1
13    1    1
14    1    1
15    1    2
16    1    2
17    1    2
18    1    2
19    1    2
20    1    2
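
Since the question asks for something faster than looping, a vectorized variant may also be worth trying. The sketch below is not part of the original answer. It assumes the same mat as in the question, keeps the V1/V2 names from the tidy frame (V1 is the row position in mat, V2 the column position), and uses Index.repeat to repeat each row label as many times as its value before selecting with .loc. The names tidy and expanded are illustrative only.

```python
import numpy as np
import pandas as pd

mat = np.array([[1, 2, 3], [4, 5, 6]])

# Tidy form as in the question: one row per cell of mat.
tidy = (pd.DataFrame(mat)
          .stack()
          .rename_axis(index=['V1', 'V2'])
          .rename('value')
          .reset_index())

# Repeat each row label `value` times, then select those labels.
# A cell holding 3 therefore contributes three identical (V1, V2) rows.
expanded = (tidy.loc[tidy.index.repeat(tidy['value']), ['V1', 'V2']]
                .reset_index(drop=True))
```

The result should have as many rows as the sum of all entries in mat, in the same order as the comprehension above.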

score 0 · Accepted Answer

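Have a look at the section of the pandas documentation titled "Reshaping and pivot tables".

.pivot(), .pivot_table() and .melt() are all existing functions. It looks like you are reinventing some wheels.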
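
As a rough illustration of that suggestion, the tidy frame from the question can probably be produced with .melt() instead of .stack(). This is a sketch under assumptions, not code from the answer: it reuses mat from the question, and tidy is an illustrative name. Note that .melt() only unpivots the table; repeating each pair by its value would still need a separate step.

```python
import numpy as np
import pandas as pd

mat = np.array([[1, 2, 3], [4, 5, 6]])

# Turn the row index into a column named V1, then unpivot the
# remaining columns: their labels go to V2, their cells to value.
tidy = (pd.DataFrame(mat)
          .rename_axis(index='V1')
          .reset_index()
          .melt(id_vars='V1', var_name='V2', value_name='value'))
```

Unlike .stack(), .melt() walks the table column by column, so the rows come out in a different order than in the question's df_mat_tidy.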
