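I have a dataset containing weekly counts from 2019 to 2021. I want to compare the count for each week of 2020 with the count for the same week of 2019, and likewise compare each week of 2021 with the same week of 2019. The data look like this: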

set.seed(123)
df <- data.frame(count = sample(1:300, 156, replace = TRUE),
                 week = rep(seq(1, 52, by = 1), 3),
                 year = rep(2019:2021, each = 52)) 

In my real data there is substantial overdispersion, so I think a negative binomial model is probably the most appropriate. I ran the following:

library(MASS)
nb <- glm.nb(count ~ factor(year)+factor(week), data = df)
summary(nb)

Call:
glm.nb(formula = count ~ factor(year) + factor(week), data = df, 
    init.theta = 2.193368056, link = log)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.5012  -0.7180  -0.0207   0.4896   1.6763  

Coefficients:
                  Estimate Std. Error z value Pr(>|z|)    
(Intercept)       5.180832   0.399664  12.963  < 2e-16 ***
factor(year)2020  0.077185   0.133587   0.578 0.563404    
factor(year)2021  0.037590   0.133609   0.281 0.778448    
factor(week)2    -0.642313   0.556042  -1.155 0.248028    
factor(week)3     0.084091   0.554446   0.152 0.879451    
factor(week)4     0.228528   0.554245   0.412 0.680103    
factor(week)5    -0.275901   0.555095  -0.497 0.619165    
factor(week)6     0.181420   0.554308   0.327 0.743448    
factor(week)7    -0.331875   0.555218  -0.598 0.550015    
factor(week)8    -0.018194   0.554608  -0.033 0.973829    
factor(week)9    -0.093659   0.554737  -0.169 0.865926    
factor(week)10   -0.260570   0.555062  -0.469 0.638753    
factor(week)11   -0.165835   0.554871  -0.299 0.765038    
factor(week)12   -0.003480   0.554583  -0.006 0.994993    
factor(week)13    0.045328   0.554506   0.082 0.934850    
factor(week)14   -0.420895   0.555429  -0.758 0.448581    
factor(week)15   -0.288260   0.555121  -0.519 0.603570    
factor(week)16   -1.719551   0.561984  -3.060 0.002215 ** 
factor(week)17   -0.339217   0.555235  -0.611 0.541237    
factor(week)18   -0.770541   0.556464  -1.385 0.166141    
factor(week)19   -0.088333   0.554728  -0.159 0.873483    
factor(week)20   -0.595712   0.555901  -1.072 0.283893    
factor(week)21   -2.010330   0.565001  -3.558 0.000374 ***
factor(week)22   -0.075819   0.554706  -0.137 0.891282    
factor(week)23    0.298783   0.554157   0.539 0.589772    
factor(week)24    0.114664   0.554401   0.207 0.836147    
factor(week)25    0.089396   0.554439   0.161 0.871907    
factor(week)26   -0.396060   0.555368  -0.713 0.475754    
factor(week)27   -0.261789   0.555065  -0.472 0.637186    
factor(week)28   -0.090157   0.554731  -0.163 0.870894    
factor(week)29    0.210589   0.554269   0.380 0.703990    
factor(week)30   -0.537967   0.555736  -0.968 0.333032    
factor(week)31   -0.401567   0.555381  -0.723 0.469651    
factor(week)32    0.108651   0.554410   0.196 0.844630    
factor(week)33   -0.732234   0.556332  -1.316 0.188113    
factor(week)34   -0.589688   0.555884  -1.061 0.288775    
factor(week)35   -0.437695   0.555471  -0.788 0.430714    
factor(week)36   -0.402218   0.555383  -0.724 0.468933    
factor(week)37   -0.076802   0.554708  -0.138 0.889881    
factor(week)38   -0.151350   0.554844  -0.273 0.785022    
factor(week)39    0.272593   0.554189   0.492 0.622806    
factor(week)40   -0.119806   0.554785  -0.216 0.829027    
factor(week)41   -1.184984   0.558260  -2.123 0.033784 *  
factor(week)42   -0.153762   0.554848  -0.277 0.781685    
factor(week)43   -0.068443   0.554693  -0.123 0.901799    
factor(week)44   -0.721053   0.556294  -1.296 0.194916    
factor(week)45    0.102378   0.554419   0.185 0.853497    
factor(week)46   -0.009142   0.554593  -0.016 0.986848    
factor(week)47   -0.284169   0.555112  -0.512 0.608712    
factor(week)48   -0.133066   0.554809  -0.240 0.810454    
factor(week)49   -0.705118   0.556242  -1.268 0.204924    
factor(week)50   -0.080921   0.554715  -0.146 0.884017    
factor(week)51    0.152016   0.554348   0.274 0.783912    
factor(week)52   -0.503605   0.555642  -0.906 0.364752    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Negative Binomial(2.1934) family taken to be 1)

    Null deviance: 221.83  on 155  degrees of freedom
Residual deviance: 169.71  on 102  degrees of freedom
AIC: 1923.7

Number of Fisher Scoring iterations: 1


              Theta:  2.193 
          Std. Err.:  0.243 

 2 x log-likelihood:  -1813.701 

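The reference category for factor(year) is 2019, which is what I want. However, I am struggling to interpret the coefficients (and therefore the IRRs) for week, because week 1 is the reference category.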
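For context, the IRRs mentioned here are the exponentiated coefficients of the fit. A minimal sketch, assuming the same df and nb as above; the Wald intervals from confint.default() are just one convenient choice for illustration, not something the model requires:

```r
library(MASS)

# Same simulated data and model as in the question
set.seed(123)
df <- data.frame(count = sample(1:300, 156, replace = TRUE),
                 week = rep(seq(1, 52, by = 1), 3),
                 year = rep(2019:2021, each = 52))
nb <- glm.nb(count ~ factor(year) + factor(week), data = df)

# Exponentiate the log-scale estimates and Wald limits to get IRRs
irr <- exp(cbind(IRR = coef(nb), confint.default(nb)))

# Year terms: 2020 and 2021 vs 2019, the same ratio for every week
irr[grep("year", rownames(irr)), ]

# Week terms: week k vs week 1, the same ratio for every year
head(irr[grep("week", rownames(irr)), ])
```

Because this model has no year-by-week interaction, it gives a single IRR per year rather than one per week, which is why the week coefficients only describe each week relative to week 1.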

Is there a better way to set this up so that it gives week-by-year comparisons? My main goal is to plot the weekly IRRs for 2020 and 2021 relative to 2019.
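To make the target concrete, here is a rough sketch of the kind of plot I am after. It uses only the raw observed ratios (each week's count divided by the 2019 count for the same week), so it is a descriptive stand-in and not a model-based IRR with uncertainty. The column names ratio_2020 and ratio_2021 are made up for this illustration:

```r
# Same simulated data as in the question
set.seed(123)
df <- data.frame(count = sample(1:300, 156, replace = TRUE),
                 week = rep(seq(1, 52, by = 1), 3),
                 year = rep(2019:2021, each = 52))

# One row per week, one count column per year
# (count.2019, count.2020, count.2021)
wide <- reshape(df, idvar = "week", timevar = "year", direction = "wide")

# Observed ratio of each week's count to the same week in 2019
wide$ratio_2020 <- wide$count.2020 / wide$count.2019
wide$ratio_2021 <- wide$count.2021 / wide$count.2019

# Plot both ratio series on a log scale; the dashed line marks "no change"
matplot(wide$week, as.matrix(wide[, c("ratio_2020", "ratio_2021")]),
        type = "l", lty = 1, col = c("black", "red"), log = "y",
        xlab = "Week", ylab = "Ratio to same week in 2019")
abline(h = 1, lty = 2)
legend("topright", legend = c("2020 vs 2019", "2021 vs 2019"),
       lty = 1, col = c("black", "red"))
```

With a single observation per week and year these ratios are noisy, which is why I would prefer estimates that come from a model.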
