python - Pandas: reshaping data with duplicate row names into columns

Question

I have a dataset that looks a bit like this (first rows shown):

Sample  Detector  Cq
P_1     106       23.53152
P_1     106       23.152458
P_1     106       23.685083
P_1     135       24.465698
P_1     135       23.86892
P_1     135       23.723469
P_1     17        22.524242
P_1     17        20.658733
P_1     17        21.146122

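Both the "Sample" and "Detector" columns contain duplicated values (only "Cq" is unique): to be precise, each "Detector" appears 3 times per sample, because the measurements are replicated.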

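What I need to do is:

Reshape the table so that the columns contain the samples and the rows contain the detectors
Rename the duplicated columns so that I know which replicate each one is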

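I thought DataFrame.pivot would do the trick, but it fails because of the duplicated data. What is the best approach? Rename the duplicates and then reshape, or is there a better option?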

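EDIT: I thought it over, and I think it is better to state the goal. For each "Sample", I need to store the mean and standard deviation of its "Detector" values.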

score 6 · Accepted Answer

It looks like what you may be looking for is a hierarchically indexed DataFrame [link].

Would something like this work?

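# build a sample dataframe
import numpy as np
import pandas as pd

a = ['P_1'] * 9
b = [106, 106, 106, 135, 135, 135, 17, 17, 17]
c = np.random.randint(1, 100, 9)  # random stand-in values for Cq
df = pd.DataFrame(data=list(zip(a, b, c)), columns=['sample', 'detector', 'cq'])

# add a repetition number column (assumes exactly 3 consecutive replicates per detector)
df['rep_num'] = [1, 2, 3] * (len(df) // 3)

# convert to a multi-indexed DataFrame
df_multi = df.set_index(['sample', 'detector', 'rep_num'])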

#--------------Resulting Dataframe---------------------

                             cq
sample detector rep_num    
P_1    106      1        97
                2        83
                3        81
       135      1        46
                2        92
                3        89
       17       1        58
                2        26
                3        75
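
To connect this with the goal stated in the question's edit, here is a minimal sketch that uses the nine rows shown in the question. It is one possible approach rather than part of the original answer. It numbers the replicates with groupby(...).cumcount() instead of a hard-coded [1, 2, 3] pattern, unstacks to a wide layout, and computes the mean and standard deviation per sample and detector. The variable names wide and stats are illustrative, and the std here is pandas' default sample standard deviation (ddof=1).

```python
import pandas as pd

# Data copied from the question; column names follow the question's table.
df = pd.DataFrame({
    "Sample": ["P_1"] * 9,
    "Detector": [106, 106, 106, 135, 135, 135, 17, 17, 17],
    "Cq": [23.53152, 23.152458, 23.685083,
           24.465698, 23.86892, 23.723469,
           22.524242, 20.658733, 21.146122],
})

# Number the replicates within each (Sample, Detector) group: 1, 2, 3, ...
# This does not assume a fixed replicate count or a particular row order.
df["rep_num"] = df.groupby(["Sample", "Detector"]).cumcount() + 1

# Wide layout: one row per detector, one column per (sample, replicate).
wide = (
    df.set_index(["Detector", "Sample", "rep_num"])["Cq"]
      .unstack(["Sample", "rep_num"])
)

# Mean and sample standard deviation of Cq per sample and detector.
stats = df.groupby(["Sample", "Detector"])["Cq"].agg(["mean", "std"])

print(wide)
print(stats)
```

If only the summary statistics are needed, the groupby step works directly on the long table, so neither the replicate numbering nor the reshape is strictly required.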
