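I know this sounds absurd, but I need to pass a for loop to a function. I have a data frame with more than 75 columns, most of which are categorical variables. One of the variables is called SalePrice, and I want to find the correlation between each categorical variable and SalePrice.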
Here is my code, but going through all 75 columns by hand seems absurd. Is there a simpler way?
import pandas as pd
from scipy import stats

df = pd.read_csv(file, delimiter=',')  # file: path to the CSV
qualityTest = df[["OverallQual","SalePrice"]]
qualities = [1,2,3,4,5,6,7,8,9,10]
stats.f_oneway(qualityTest['SalePrice'][qualityTest['OverallQual'] == 1],
               qualityTest['SalePrice'][qualityTest['OverallQual'] == 2],
               qualityTest['SalePrice'][qualityTest['OverallQual'] == 3],
               qualityTest['SalePrice'][qualityTest['OverallQual'] == 4],
               qualityTest['SalePrice'][qualityTest['OverallQual'] == 5],
               qualityTest['SalePrice'][qualityTest['OverallQual'] == 6],
               qualityTest['SalePrice'][qualityTest['OverallQual'] == 7],
               qualityTest['SalePrice'][qualityTest['OverallQual'] == 8],
               qualityTest['SalePrice'][qualityTest['OverallQual'] == 9],
               qualityTest['SalePrice'][qualityTest['OverallQual'] == 10])
I tried this, but it doesn't work:
stats.f_oneway(
for i in qualities:
qualityTest['SalePrice'][qualityTest['OverallQual'] == i]
)
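For reference, a for statement cannot appear inside a call's argument list, but a list or generator expression can be unpacked into separate positional arguments with `*`, which is the form `stats.f_oneway` expects. The sketch below shows one way this might look. The toy data frame, the `Street` column values, and the helper name `anova_by_column` are assumptions made up for illustration, not part of the original data.

```python
import pandas as pd
from scipy import stats

# Hypothetical toy data standing in for the real CSV (assumption)
df = pd.DataFrame({
    "OverallQual": [1, 1, 2, 2, 3, 3],
    "Street": ["Pave", "Grvl", "Pave", "Grvl", "Pave", "Grvl"],
    "SalePrice": [100, 110, 150, 160, 200, 210],
})

# Direct fix of the attempt: build the groups with a generator
# expression and unpack them into positional arguments with *
qualities = [1, 2, 3]
direct = stats.f_oneway(
    *(df["SalePrice"][df["OverallQual"] == i] for i in qualities)
)

def anova_by_column(frame, column, target="SalePrice"):
    """One-way ANOVA of `target` across the levels of `column`.

    Hypothetical helper; returns None when the column has fewer
    than two levels, since f_oneway needs at least two groups.
    """
    groups = [g[target].dropna().to_numpy() for _, g in frame.groupby(column)]
    if len(groups) < 2:
        return None
    return stats.f_oneway(*groups)

# Run the test for every column except the target
results = {}
for col in df.columns:
    if col != "SalePrice":
        results[col] = anova_by_column(df, col)

for col, res in results.items():
    print(col, res)
```

Using `groupby` avoids hard-coding the level values, so the same loop should work for columns whose categories are strings. On the real data it would probably make sense to restrict the loop to the categorical columns only, and to check for levels with very few rows before trusting the p-values.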