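I am trying to compute a quantile regression and plot the results with ggplot, but for some reason the dummy variables are not displayed in the ggplot output.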

Sample code using the mtcars dataset is shown below:

library(dplyr)
library(ggplot2)
library(qre)
library(quantreg)
library(fastDummies)

dataset <- mtcars
dataset <- dummy_cols(dataset, select_columns = "gear")
dataset


rq(data=dataset,
   tau= 1:9/10,
   formula = hp ~  disp + mpg + qsec + gear_4  + gear_5) %>% 
  broom::tidy() %>% 
  #filter(term!="(Intercept)") %>%
  ggplot(aes(x=tau,y=estimate))+
  geom_point(color="#27408b", size = 3)+ 
  geom_ribbon(aes(ymin=conf.low,ymax=conf.high),alpha=0.25, fill="#27408b")+
  geom_line(color="#27408b", size = 1)+ 
  geom_smooth(method=  "lm", colour = "red", se = T)+  
  my_theme + 
  facet_wrap(~term,scales="free",ncol=2)

QR.2 <- rq(hp ~ disp + mpg + qsec + gear_4 + gear_5, data = dataset, tau = 1:9/10)
plot(summary(QR.2, se = "boot"))

Everything works fine when I plot the results with plot(summary(QR.2, se = "boot")), but for some reason the ggplot version displays incorrectly.

1 Answer

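ggplot does not draw anything for the two gear columns because the confidence range for at least one quantile is essentially infinite. Note that in the output below, for each level of gear, at least one quantile has a confidence limit at R's maximum floating-point value (1.797693e+308, or .Machine$double.xmax).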

rq(data=dataset,
   tau= 1:9/10,
   formula = hp ~  disp + mpg + qsec + gear_4  + gear_5) %>% 
  broom::tidy() %>% 
  filter(grepl("gear", term)) %>% 
  arrange(term) %>% 
  as.data.frame
     term   estimate       conf.low     conf.high tau
1  gear_4   7.725539  -1.070165e+02 1.797693e+308 0.1
2  gear_4  10.295479  -3.168527e+01  1.115851e+02 0.2
3  gear_4  26.146858  -2.800967e+01  4.627397e+01 0.3
4  gear_4  20.403808  -4.757591e+01  4.244444e+01 0.4
5  gear_4 -10.288388  -3.268460e+01  4.338169e+01 0.5
6  gear_4  -7.957834  -2.606588e+01  5.368260e+01 0.6
7  gear_4  -3.902589  -2.287453e+01  6.694520e+01 0.7
8  gear_4   5.087119  -1.295842e+02  9.044733e+01 0.8
9  gear_4   4.097664 -1.797693e+308  1.199334e+02 0.9
10 gear_5  13.464949 -1.797693e+308  1.610157e+02 0.1
11 gear_5  15.969431 -1.797693e+308  9.666875e+01 0.2
12 gear_5  74.974305  -4.802727e+01  1.006461e+02 0.3
13 gear_5  57.885205  -4.215393e+01  9.898391e+01 0.4
14 gear_5  27.118007  -2.715968e+01  9.400573e+01 0.5
15 gear_5  29.105166  -2.732280e+01  1.308410e+02 0.6
16 gear_5  29.568986  -2.064172e+01  1.461912e+02 0.7
17 gear_5  43.932664  -8.733431e+00 1.797693e+308 0.8
18 gear_5 113.512563   1.982236e+01 1.797693e+308 0.9
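
As a minimal sketch of how such rows could be picked out programmatically, the snippet below rebuilds the gear_5 limits from the output above in base R and flags every quantile whose limit sits at .Machine$double.xmax. The names ci and is_unbounded are introduced here for illustration only, and the printed 1.797693e+308 is assumed to be exactly .Machine$double.xmax, as the text states.

```r
# Confidence limits for gear_5, copied from the output above.
# The printed 1.797693e+308 is assumed to be .Machine$double.xmax.
xmax <- .Machine$double.xmax
ci <- data.frame(
  term      = "gear_5",
  tau       = 1:9 / 10,
  conf.low  = c(-xmax, -xmax, -48.02727, -42.15393, -27.15968,
                -27.32280, -20.64172, -8.733431, 19.82236),
  conf.high = c(161.0157, 96.66875, 100.6461, 98.98391, 94.00573,
                130.8410, 146.1912, xmax, xmax)
)

# Hypothetical helper: TRUE where a limit is at the largest finite double.
is_unbounded <- function(x) abs(x) >= .Machine$double.xmax

# Flag quantiles with at least one effectively infinite limit.
ci$unbounded <- is_unbounded(ci$conf.low) | is_unbounded(ci$conf.high)

# Rows for tau = 0.1, 0.2, 0.8 and 0.9 are flagged.
ci[ci$unbounded, c("term", "tau", "conf.low", "conf.high")]
```

The same check could be applied to the conf.low and conf.high columns of the full broom::tidy() result.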

For example, if you add + coord_cartesian(ylim=c(-100,200)) to the plot, which forces the y-axis to a small range, you will see the values for each gear level appear in the plot.
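
A minimal sketch of that suggestion, reusing the model from the question. The names qr_tidy and p are introduced here for illustration, the undefined my_theme is left out, and the plot is restricted to the gear terms (an assumption of this sketch) so that one y range suits every panel.

```r
library(dplyr)
library(ggplot2)
library(quantreg)
library(fastDummies)

# Same data preparation and model as in the question.
dataset <- dummy_cols(mtcars, select_columns = "gear")

qr_tidy <- rq(hp ~ disp + mpg + qsec + gear_4 + gear_5,
              data = dataset, tau = 1:9 / 10) %>%
  broom::tidy() %>%
  filter(grepl("gear", term))   # keep only the dummy-variable terms

p <- ggplot(qr_tidy, aes(x = tau, y = estimate)) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high),
              alpha = 0.25, fill = "#27408b") +
  geom_line(color = "#27408b") +
  geom_point(color = "#27408b", size = 3) +
  facet_wrap(~term, ncol = 2) +
  # Zoom the view without dropping data, so the ribbon is still drawn
  # where a limit is effectively infinite.
  coord_cartesian(ylim = c(-100, 200))

p
```

coord_cartesian() only changes the visible window, whereas setting limits through scale_y_continuous() would remove the out-of-range values before drawing.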

This also happens for gear_5 in the confidence intervals returned by summary(QR.2):

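library(purrr)   # map_df()
library(tibble)  # rownames_to_column()

summary(QR.2) %>% 
  map_df(~ .x$coefficients %>% 
           as.data.frame %>% 
           rownames_to_column() %>% 
           mutate(tau = .x$tau)) %>% 
  filter(grepl("gear", rowname)) %>%
  arrange(rowname)  # the column is named rowname here, not term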
   rowname coefficients       lower bd      upper bd tau
1   gear_4     7.725539  -3.586683e+01  1.234368e+02 0.1
2   gear_4    10.295479  -1.596869e+01  3.471653e+01 0.2
3   gear_4    26.146858  -2.754952e+01  4.115246e+01 0.3
4   gear_4    20.403808  -2.798206e+01  4.230933e+01 0.4
5   gear_4   -10.288388  -2.440144e+01  4.221202e+01 0.5
6   gear_4    -7.957834  -2.069021e+01  4.198471e+01 0.6
7   gear_4    -3.902589  -1.967830e+01  2.471069e+01 0.7
8   gear_4     5.087119  -9.981679e+01  6.940221e+01 0.8
9   gear_4     4.097664  -2.771170e+02  1.193569e+02 0.9
10  gear_5    13.464949 -1.797693e+308  8.781396e+01 0.1
11  gear_5    15.969431  -1.617983e+01  9.571634e+01 0.2
12  gear_5    74.974305  -4.687321e+01  1.006256e+02 0.3
13  gear_5    57.885205  -2.904531e+01  9.697743e+01 0.4
14  gear_5    27.118007  -2.375592e+01  9.330493e+01 0.5
15  gear_5    29.105166  -2.523643e+01  9.464342e+01 0.6
16  gear_5    29.568986  -5.958817e+00  1.439989e+02 0.7
17  gear_5    43.932664   8.020035e+00  1.293194e+02 0.8
18  gear_5   113.512563   2.597030e+01 1.797693e+308 0.9

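The plot method for summary.rqs objects must be doing some additional processing or trimming of the confidence bands, or it may be plotting something other than the confidence bands. Either way, what it plots is not the full range of the confidence bands produced in the output of summary(QR.2).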

Answered 2019-12-04T21:41:13.123