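I have a dataset containing weekly counts from 2019 to 2021. What I want to do is compare the count for a given week in 2020 with the count for the same week in 2019, and likewise compare each week's count in 2021 with the same week in 2019. The data look like this: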
set.seed(123)
df <- data.frame(count = sample(1:300, 156, replace = TRUE),
week = rep(seq(1, 52, by = 1), 3),
year = rep(2019:2021, each = 52))
In my real data there is significant overdispersion, so I think a negative binomial model is probably the best fit. I have run the following:
library(MASS)
nb <- glm.nb(count ~ factor(year)+factor(week), data = df)
summary(nb)
> summary(nb)
Call:
glm.nb(formula = count ~ factor(year) + factor(week), data = df,
init.theta = 2.193368056, link = log)
Deviance Residuals:
Min 1Q Median 3Q Max
-3.5012 -0.7180 -0.0207 0.4896 1.6763
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 5.180832 0.399664 12.963 < 2e-16 ***
factor(year)2020 0.077185 0.133587 0.578 0.563404
factor(year)2021 0.037590 0.133609 0.281 0.778448
factor(week)2 -0.642313 0.556042 -1.155 0.248028
factor(week)3 0.084091 0.554446 0.152 0.879451
factor(week)4 0.228528 0.554245 0.412 0.680103
factor(week)5 -0.275901 0.555095 -0.497 0.619165
factor(week)6 0.181420 0.554308 0.327 0.743448
factor(week)7 -0.331875 0.555218 -0.598 0.550015
factor(week)8 -0.018194 0.554608 -0.033 0.973829
factor(week)9 -0.093659 0.554737 -0.169 0.865926
factor(week)10 -0.260570 0.555062 -0.469 0.638753
factor(week)11 -0.165835 0.554871 -0.299 0.765038
factor(week)12 -0.003480 0.554583 -0.006 0.994993
factor(week)13 0.045328 0.554506 0.082 0.934850
factor(week)14 -0.420895 0.555429 -0.758 0.448581
factor(week)15 -0.288260 0.555121 -0.519 0.603570
factor(week)16 -1.719551 0.561984 -3.060 0.002215 **
factor(week)17 -0.339217 0.555235 -0.611 0.541237
factor(week)18 -0.770541 0.556464 -1.385 0.166141
factor(week)19 -0.088333 0.554728 -0.159 0.873483
factor(week)20 -0.595712 0.555901 -1.072 0.283893
factor(week)21 -2.010330 0.565001 -3.558 0.000374 ***
factor(week)22 -0.075819 0.554706 -0.137 0.891282
factor(week)23 0.298783 0.554157 0.539 0.589772
factor(week)24 0.114664 0.554401 0.207 0.836147
factor(week)25 0.089396 0.554439 0.161 0.871907
factor(week)26 -0.396060 0.555368 -0.713 0.475754
factor(week)27 -0.261789 0.555065 -0.472 0.637186
factor(week)28 -0.090157 0.554731 -0.163 0.870894
factor(week)29 0.210589 0.554269 0.380 0.703990
factor(week)30 -0.537967 0.555736 -0.968 0.333032
factor(week)31 -0.401567 0.555381 -0.723 0.469651
factor(week)32 0.108651 0.554410 0.196 0.844630
factor(week)33 -0.732234 0.556332 -1.316 0.188113
factor(week)34 -0.589688 0.555884 -1.061 0.288775
factor(week)35 -0.437695 0.555471 -0.788 0.430714
factor(week)36 -0.402218 0.555383 -0.724 0.468933
factor(week)37 -0.076802 0.554708 -0.138 0.889881
factor(week)38 -0.151350 0.554844 -0.273 0.785022
factor(week)39 0.272593 0.554189 0.492 0.622806
factor(week)40 -0.119806 0.554785 -0.216 0.829027
factor(week)41 -1.184984 0.558260 -2.123 0.033784 *
factor(week)42 -0.153762 0.554848 -0.277 0.781685
factor(week)43 -0.068443 0.554693 -0.123 0.901799
factor(week)44 -0.721053 0.556294 -1.296 0.194916
factor(week)45 0.102378 0.554419 0.185 0.853497
factor(week)46 -0.009142 0.554593 -0.016 0.986848
factor(week)47 -0.284169 0.555112 -0.512 0.608712
factor(week)48 -0.133066 0.554809 -0.240 0.810454
factor(week)49 -0.705118 0.556242 -1.268 0.204924
factor(week)50 -0.080921 0.554715 -0.146 0.884017
factor(week)51 0.152016 0.554348 0.274 0.783912
factor(week)52 -0.503605 0.555642 -0.906 0.364752
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(2.1934) family taken to be 1)
Null deviance: 221.83 on 155 degrees of freedom
Residual deviance: 169.71 on 102 degrees of freedom
AIC: 1923.7
Number of Fisher Scoring iterations: 1
Theta: 2.193
Std. Err.: 0.243
2 x log-likelihood: -1813.701
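The reference category for factor(year) is 2019, which is what I want. However, I am struggling to interpret the coefficients (and therefore the IRRs) for week, because week 1 is the reference category.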
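For context, the IRRs I am referring to are just the exponentiated coefficients. Below is a minimal sketch of how they can be pulled from the fit above, with Wald 95% intervals built from the reported standard errors. The object names est and irr exist only for this illustration. Because the model has no year-by-week interaction, it gives a single IRR for 2020 and a single IRR for 2021 that apply to every week alike, while each week coefficient compares that week with week 1 rather than with the same week in 2019.

```r
library(MASS)

# Same simulated data and additive model as above
set.seed(123)
df <- data.frame(count = sample(1:300, 156, replace = TRUE),
                 week  = rep(seq(1, 52, by = 1), 3),
                 year  = rep(2019:2021, each = 52))
nb <- glm.nb(count ~ factor(year) + factor(week), data = df)

# Coefficient table on the log scale (Estimate, Std. Error, z value, p value)
est <- coef(summary(nb))

# Exponentiate to get IRRs with Wald 95% intervals
z <- qnorm(0.975)
irr <- data.frame(
  IRR   = exp(est[, "Estimate"]),
  lower = exp(est[, "Estimate"] - z * est[, "Std. Error"]),
  upper = exp(est[, "Estimate"] + z * est[, "Std. Error"])
)

# Year effects: one IRR per year, shared by all 52 weeks
irr[c("factor(year)2020", "factor(year)2021"), ]
```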
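Is there a better way to set this up so that I can make week-by-week comparisons between years? My main goal is to plot the weekly IRRs for 2020 and 2021 relative to 2019.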
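To show the kind of plot I am after, here is a rough sketch that skips the model entirely and divides each week's count by the 2019 count for the same week. It is only a crude stand-in, not a modelled IRR: it has no confidence intervals and ignores the overdispersion. With a single count per week per year, a full year-by-week interaction would use as many parameters as there are observations, so I am not sure that is the right route either. The names wide, irr_2020 and irr_2021 are made up for this example.

```r
# Same simulated data as above
set.seed(123)
df <- data.frame(count = sample(1:300, 156, replace = TRUE),
                 week  = rep(seq(1, 52, by = 1), 3),
                 year  = rep(2019:2021, each = 52))

# One row per week, one count column per year:
# count.2019, count.2020, count.2021
wide <- reshape(df, idvar = "week", timevar = "year", direction = "wide")

# Raw week-specific ratios relative to the same week in 2019
wide$irr_2020 <- wide$count.2020 / wide$count.2019
wide$irr_2021 <- wide$count.2021 / wide$count.2019

# Plot both series on a log scale, with a reference line at ratio = 1
plot(wide$week, wide$irr_2020, type = "l", log = "y",
     ylim = range(wide$irr_2020, wide$irr_2021),
     xlab = "Week", ylab = "Ratio to same week in 2019 (log scale)")
lines(wide$week, wide$irr_2021, lty = 2)
abline(h = 1, col = "grey")
legend("topright", legend = c("2020", "2021"), lty = c(1, 2))
```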