
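I know this sounds absurd, but I need to pass a for loop to a function. I have a DataFrame with more than 75 columns, most of which are categorical variables. One of the variables is called SalePrice, and I want to find the correlation between each categorical variable and SalePrice.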

Here is my code, but going through all 75 columns by hand seems ridiculous. Is there a simpler way?

import pandas as pd
from scipy import stats

df = pd.read_csv(file, delimiter=',')
qualityTest = df[["OverallQual","SalePrice"]]
qualities = [1,2,3,4,5,6,7,8,9,10]
stats.f_oneway(qualityTest['SalePrice'][qualityTest['OverallQual'] == 1],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 2],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 3],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 4],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 5],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 6],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 7],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 8],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 9],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 10])

I tried this, but it doesn't work:

stats.f_oneway(
    for i in qualities:
        qualityTest['SalePrice'][qualityTest['OverallQual'] == i]
)

2 Answers


You can use a list comprehension: essentially, build a list with a for loop and then pass it in:

stats.f_oneway([qualityTest['SalePrice'][qualityTest['OverallQual'] == i] for i in qualities])

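Or, if you want the groups passed as separate arguments rather than as a single list, add an asterisk (*) immediately before the outermost pair of square brackets; this unpacks the list you just built into individual function arguments. stats.f_oneway takes each sample as its own positional argument, so the unpacked form is the one that matches its signature.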
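The unpacking form can be sketched on a small example. The data below is synthetic and made up purely for illustration (three quality levels instead of ten); only the column names come from the question.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for the question's data (all values are made up).
rng = np.random.default_rng(0)
qualityTest = pd.DataFrame({
    "OverallQual": np.repeat([1, 2, 3], 20),
    "SalePrice": np.concatenate([
        rng.normal(100_000, 10_000, 20),
        rng.normal(150_000, 10_000, 20),
        rng.normal(200_000, 10_000, 20),
    ]),
})
qualities = sorted(qualityTest["OverallQual"].unique())

# One SalePrice sample per quality level.
samples = [
    qualityTest.loc[qualityTest["OverallQual"] == q, "SalePrice"]
    for q in qualities
]

# The * unpacks the list, so this call is equivalent to
# stats.f_oneway(samples[0], samples[1], samples[2]).
result = stats.f_oneway(*samples)
print(result.statistic, result.pvalue)
```

Deriving qualities from the column itself, as done here, avoids hard-coding the levels 1 through 10.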

Answered 2019-07-01T03:41:09.490

Use groupby here:

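# Split SalePrice by OverallQual and pass each group to f_oneway as a separate argument
stats.f_oneway(*[group for _, group in qualityTest.groupby('OverallQual')['SalePrice']])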
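The groupby idea extends to every categorical column, which is what the question is ultimately after. The following is a minimal sketch rather than a definitive implementation: the helper name anova_by_column, the Street column, and all the data are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats

def anova_by_column(df, target, columns):
    """Run a one-way ANOVA of `target` against each column in `columns`.

    Hypothetical helper, not part of pandas or SciPy.
    """
    rows = []
    for col in columns:
        # One array of target values per category, with missing targets dropped.
        groups = [g.dropna().to_numpy() for _, g in df.groupby(col)[target]]
        groups = [g for g in groups if len(g) > 0]
        if len(groups) < 2:
            continue  # ANOVA needs at least two groups
        f_stat, p_value = stats.f_oneway(*groups)
        rows.append({"column": col, "F": f_stat, "p": p_value})
    return pd.DataFrame(rows).sort_values("p").reset_index(drop=True)

# Made-up data: SalePrice depends on OverallQual but not on Street.
rng = np.random.default_rng(0)
n = 90
qual = rng.integers(1, 4, size=n)  # quality levels 1..3
df = pd.DataFrame({
    "OverallQual": qual,
    "Street": rng.choice(["Pave", "Grvl"], size=n),
    "SalePrice": 50_000 * qual + rng.normal(0, 10_000, n),
})

categorical = [c for c in df.columns if c != "SalePrice"]
table = anova_by_column(df, "SalePrice", categorical)
print(table)
```

On the real frame, the column list could come from something like df.select_dtypes(include="object").columns, assuming the categorical variables are stored as strings; numeric-coded categories such as OverallQual would need to be added by hand.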
Answered 2019-07-01T03:44:51.873