r - Transforming data to normality. What is the best function for a given case?

Question

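Is there a function or package that searches for the best (or one of the best) transformations of a variable, so that the model's residuals are as close to normal as possible?

For example: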

frml = formula(some_transformation(A) ~ B+I(B^2)+B:C+C)
model = aov(frml, data=data)
shapiro.test(residuals(model))

Is there a function that tells me which function some_transformation() should be in order to optimize the normality of the residuals?

score 7 · Accepted Answer

Do you mean something like the Box-Cox transformation?

library(car)
m0 <- lm(cycles ~ len + amp + load, Wool)
plot(m0, which=2)


# Box Cox Method, univariate
summary(p1 <- powerTransform(m0))
# bcPower Transformation to Normality 
# 
#    Est.Power Std.Err. Wald Lower Bound Wald Upper Bound
# Y1   -0.0592   0.0611          -0.1789           0.0606
# 
# Likelihood ratio tests about transformation parameters
#                              LRT df      pval
# LR test, lambda = (0)  0.9213384  1 0.3371238
# LR test, lambda = (1) 84.0756559  1 0.0000000


# fit linear model with transformed response:
coef(p1, round=TRUE)
summary(m1 <- lm(bcPower(cycles, p1$roundlam) ~ len + amp + load, Wool))
plot(m1, which=2)

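The same idea can be sketched without the car package. The example below is only an illustration, not the answer's own method: it uses MASS::boxcox() instead of powerTransform(), and the data, the variable names (x, y, dat) and the helper bc_transform() are all made up for the demonstration. The response is simulated so that the log (lambda = 0) is the right transformation, and the grid search should land near that value.

```r
library(MASS)  # provides boxcox(); MASS ships with standard R installations

set.seed(42)
n <- 500
x <- runif(n, 0, 2)
# Simulated (hypothetical) response: log-normal around a linear trend,
# so the appropriate Box-Cox power is lambda = 0, i.e. the log.
y <- exp(1 + 0.5 * x + rnorm(n, sd = 0.5))
dat <- data.frame(x = x, y = y)

fit <- lm(y ~ x, data = dat)

# Profile log-likelihood of lambda on a grid; plotit = FALSE just returns
# the grid (bc$x) and the log-likelihood values (bc$y).
bc <- boxcox(fit, lambda = seq(-2, 2, by = 0.05), plotit = FALSE)
lambda_hat <- bc$x[which.max(bc$y)]

# Box-Cox power transform written out by hand (illustrative helper)
bc_transform <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

fit_bc <- lm(bc_transform(y, lambda_hat) ~ x, data = dat)

# Compare residual normality before and after the transformation
p_raw <- shapiro.test(residuals(fit))$p.value
p_bc  <- shapiro.test(residuals(fit_bc))$p.value
print(c(lambda_hat = lambda_hat, p_raw = p_raw, p_bc = p_bc))
```

Note that Box-Cox picks lambda by maximizing a normal-theory likelihood, not by maximizing the Shapiro-Wilk p-value, and it only searches the power family, so it answers a narrower question than "the best possible transformation".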

score 6 · Accepted Answer

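Unfortunately, this is not a solved problem in statistics. What user @statquant suggested is pretty much the best you can do, but it is not without its own pitfalls.

One important thing to note is that normality tests such as shapiro.test become very sensitive to small departures from normality once you have a reasonable sample size (i.e. in the hundreds), so you should not rely on them blindly.

Myself, I put this problem in the too-hard basket. If the data don't look at least roughly normal, I try to find a non-parametric version of the statistic I want to run on the data.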
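A small simulation can make the sample-size point concrete. This is a rough sketch under made-up assumptions: the data are drawn from a mildly skewed gamma distribution (shape = 10), and rejection_rate() is a hypothetical helper, not a library function. The departure from normality is identical at every sample size; only n changes.

```r
set.seed(1)

# Fraction of simulated samples of size n for which shapiro.test rejects
# normality at level alpha. The samples come from the same mildly skewed
# gamma distribution regardless of n.
rejection_rate <- function(n, reps = 200, alpha = 0.05) {
  p <- replicate(reps, shapiro.test(rgamma(n, shape = 10))$p.value)
  mean(p < alpha)
}

sizes <- c(30, 100, 1000)
rates <- sapply(sizes, rejection_rate)
names(rates) <- sizes
print(rates)
```

The rejection rate should climb with n even though the shape of the distribution never changes, which is why a Q-Q plot is usually a more useful diagnostic than the p-value alone for large samples.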
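As one illustration of the non-parametric route, a one-way comparison can use kruskal.test from base R in place of aov. This is only a sketch on simulated, hypothetical data (dat, g and y are invented names), and it covers a single grouping factor; it is not a drop-in replacement for the model with interactions in the question.

```r
set.seed(7)

# Hypothetical skewed data: three groups of exponential draws, with the
# third group on a larger scale than the other two.
dat <- data.frame(
  g = factor(rep(c("a", "b", "c"), each = 40)),
  y = c(rexp(40, rate = 1), rexp(40, rate = 1), rexp(40, rate = 0.25))
)

# Rank-based test of whether the groups share the same distribution;
# it makes no normality assumption about y.
kw <- kruskal.test(y ~ g, data = dat)
print(kw)
```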
