I'm starting to run a repeated-measures ANOVA (RMANOVA) on the "weightloss" dataset from the datarium package. Here is a sample of the data:

dput(head(weightloss))
structure(list(id = structure(1:6, .Label = c("1", "2", "3", 
"4", "5", "6", "7", "8", "9", "10", "11", "12"), class = "factor"), 
    diet = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("no", 
    "yes"), class = "factor"), exercises = structure(c(1L, 1L, 
    1L, 1L, 1L, 1L), .Label = c("no", "yes"), class = "factor"), 
    t1 = c(10.43, 11.59, 11.35, 11.12, 9.5, 9.5), t2 = c(13.21, 
    10.66, 11.12, 9.5, 9.73, 12.74), t3 = c(11.59, 13.21, 11.35, 
    11.12, 12.28, 10.43)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

Here is the script I've come up with so far:

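# Load Packages:

library(tidyverse) # pivot_longer(), ggplot(), %>%
library(rstatix)   # identify_outliers(), shapiro_test()
library(ggpubr)    # ggqqplot()
library(datarium)  # weightloss dataset

# Create Data Frame for Dataset:

weight <- weightloss
weight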

# Pivot Longer Data to Create Factors and Scores:

weight <- weight %>% 
  pivot_longer(names_to = 'trial', # creates factor (x)
               values_to = 'value', # creates value (y)
               cols = t1:t3) # finds which cols to factor

# Plot Means in Boxplot:

ggplot(weight,
       aes(x=trial,y=value))+
  geom_boxplot()+
  labs(title = "Trial Means") # As can be predicted, inc w/time

This gives me a boxplot that looks fine:

[Figure: boxplot of value by trial]

Next, I check for outliers and test for normality.

# Identify Outliers (Should be None Given Boxplot):

outlier <- weight %>% 
  group_by(trial) %>% 
  identify_outliers(value)
outlier_frame <- data.frame(outlier) 
outlier_frame # none found :)

# Normality (Shapiro-Wilk and QQPlot):

model <- lm(value~trial,
            data = weight) # creates model
shapiro_test(residuals(model)) # measures Shapiro
ggqqplot(residuals(model))+
  labs(title = "QQ Plot of Residuals") # creates QQ

This again gives me a very normal-looking QQ plot:

[Figure: QQ plot of residuals]

Then I faceted the QQ plot by trial:

ggqqplot(weight, "value", ggtheme = theme_bw())+
  facet_wrap(~trial)+
  labs(title = "QQPlot of Each Trial") # looks normal

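As far as I can tell, this looks right:

[Figure: QQ plot faceted by trial]

However, when I try to run the Shapiro-Wilk test by group, I keep running into a problem with the following code: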

shapiro_group <- weight %>%
  group_by(trial) %>%
  shapiro_test(value)

It gives me this error:

Error: Problem with `mutate()` column `data`.
i `data = map(.data$data, .f, ...)`.
x Must group by variables found in `.data`.
* Column `variable` is not found.

I also tried this:

shapiro_test(weight, trial$value)

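and got this error instead:

Error: Can't subset columns that don't exist.
x Column `trial$value` doesn't exist.

If anyone has any insight into the cause, I'd greatly appreciate it!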

1 Answer

You get the error because the implementation of `shapiro_test` contains this line:

shapiro_test
function (data, ..., vars = NULL) 
{
....
....
 data <- data %>% gather(key = "variable", value = "value") %>% 
        filter(!is.na(value))
....
....
}

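It uses `gather` with `key = "variable"` and `value = "value"`. Since you already have a column named `value`, this does not work: the names collide and, judging by the error message, the `variable` column that the function later groups by is never created.

If you rename the `value` column to anything else, it works.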

library(dplyr)
library(rstatix)

weight %>%
  rename(value1 = value) %>%
  group_by(trial) %>%
  shapiro_test(value1)

#  trial variable statistic     p
#  <chr> <chr>        <dbl> <dbl>
#1 t1    value1       0.869 0.222
#2 t2    value1       0.910 0.440
#3 t3    value1       0.971 0.897
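
If you would rather not rename anything, one alternative is to skip the rstatix wrapper and call base R's `shapiro.test()` on each group yourself. This is only a minimal sketch: `weight_long` and `shapiro_by_trial` are names I made up for illustration, and the data frame is rebuilt by hand from the six rows in your `dput(head(weightloss))` output, not from the full dataset, so the numbers it prints are not meant to be compared with the table above.

```r
# Long-format data rebuilt from the six rows shown in the question's dput() output
# (illustrative subset only, not the full weightloss dataset)
weight_long <- data.frame(
  id    = factor(rep(1:6, times = 3)),
  trial = rep(c("t1", "t2", "t3"), each = 6),
  value = c(10.43, 11.59, 11.35, 11.12, 9.5, 9.5,     # t1
            13.21, 10.66, 11.12, 9.5, 9.73, 12.74,    # t2
            11.59, 13.21, 11.35, 11.12, 12.28, 10.43) # t3
)

# Split the scores by trial and run base R's shapiro.test() on each piece
tests <- lapply(split(weight_long$value, weight_long$trial), shapiro.test)

# Collect the W statistic and p-value into one row per trial
shapiro_by_trial <- data.frame(
  trial     = names(tests),
  statistic = vapply(tests, function(x) unname(x$statistic), numeric(1)),
  p         = vapply(tests, function(x) x$p.value, numeric(1)),
  row.names = NULL
)
shapiro_by_trial
```

Because nothing here goes through `gather`, the column can keep the name `value`. If you prefer to stay with `shapiro_test`, the same collision can also be avoided upstream by giving `values_to` in your `pivot_longer` call a name other than 'value'.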
Answered 2021-09-11T03:45:19.573