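The answer to this question clearly explains how to retrieve tidy regression results by group when running regressions through a dplyr pipe, but the solution is no longer reproducible.
How can dplyr and broom be used together to run regressions by group and retrieve tidy results with R 4.0.2, dplyr 1.0.0, and broom 0.7.0?
Specifically, the example answer from the question linked above,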
library(dplyr)
library(broom)
df.h = data.frame(
  hour = factor(rep(1:24, each = 21)),
  price = runif(504, min = -10, max = 125),
  wind = runif(504, min = 0, max = 2500),
  temp = runif(504, min = -10, max = 25)
)
dfHour = df.h %>% group_by(hour) %>%
  do(fitHour = lm(price ~ wind + temp, data = .))
# get the coefficients by group in a tidy data_frame
dfHourCoef = tidy(dfHour, fitHour)
returns the following error (and three warnings) when I run it on my system:
Error in var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm = na.rm) :
Calling var(x) on a factor x is defunct.
Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
In addition: Warning messages:
1: Data frame tidiers are deprecated and will be removed in an upcoming release of broom.
2: In mean.default(X[[i]], ...) :
argument is not numeric or logical: returning NA
3: In mean.default(X[[i]], ...) :
argument is not numeric or logical: returning NA
If I convert df.h$hour to a character rather than a factor,
df.h <- df.h %>%
  mutate(
    hour = as.character(hour)
  )
rerun the regressions by group, and try again to retrieve the results with broom::tidy,
dfHour = df.h %>% group_by(hour) %>%
  do(fitHour = lm(price ~ wind + temp, data = .))
# get the coefficients by group in a tidy data_frame
dfHourCoef = tidy(dfHour, fitHour)
I get this error:
Error in var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm = na.rm) :
is.atomic(x) is not TRUE
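I assume the problem has to do with the fact that the group-level regression results are stored as a list in dfHour$fitHour, but I am not sure how to correct the error and once again compile the regression results quickly and tidily, as in the originally posted code/answer.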
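
To make the list-column suspicion concrete, here is a minimal sketch that inspects dfHour$fitHour and then tries one possible workaround: fitting and tidying inside each group with dplyr's group_modify(), so that tidy() is only ever called on a single lm object. This is an assumption about what might work with dplyr 1.0.0 and broom 0.7.0, not a confirmed fix. The set.seed() call is not part of the original example and is there only to make the sketch repeatable.

```r
library(dplyr)
library(broom)

set.seed(1)  # assumption: added for repeatability, not in the original example

df.h <- data.frame(
  hour  = factor(rep(1:24, each = 21)),
  price = runif(504, min = -10, max = 125),
  wind  = runif(504, min = 0, max = 2500),
  temp  = runif(504, min = -10, max = 25)
)

# Original approach: do() with a named argument keeps one lm fit per group
dfHour <- df.h %>%
  group_by(hour) %>%
  do(fitHour = lm(price ~ wind + temp, data = .))

# Check how the fits are stored: a list column whose elements are lm objects
is.list(dfHour$fitHour)
inherits(dfHour$fitHour[[1]], "lm")

# Possible workaround: fit and tidy within each group, so tidy() receives
# a single lm object rather than the whole data frame of models.
# Inside group_modify(), .x is the data for one group (without `hour`).
dfHourCoef <- df.h %>%
  group_by(hour) %>%
  group_modify(~ tidy(lm(price ~ wind + temp, data = .x)))

dfHourCoef
```

If this runs as hoped, dfHourCoef should have one row per hour and model term, with the usual tidy() columns (term, estimate, std.error, statistic, p.value) next to hour. Whether this is the intended replacement for the deprecated data frame tidiers is exactly what I am unsure about.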