我想生成数据以显示分位数回归如何有用,特别是当线性回归显示两个变量之间没有(平均)相关性,但数据的上分位数和/或下分位数存在相关性时。
可以在下图中找到此行为:
但我想用对我的研究领域有意义的数据来重现这种行为,这恰好是住房经济学。
我想生成数据以显示分位数回归如何有用,特别是当线性回归显示两个变量之间没有(平均)相关性,但数据的上分位数和/或下分位数存在相关性时。
可以在下图中找到此行为:
但我想用对我的研究领域有意义的数据来重现这种行为,这恰好是住房经济学。
我想我找到了一种方法。我不知道这是否是最好的方法,但我做到了:使用truncnorm
包可以创建截断的正常数据,我可以用它创建这个图表,这对我来说似乎很好。
set.seed(5)
x = rtruncnorm(500, a = 125, b = 600, mean = 360, sd = 100)
b0 = 2500 # intercept chosen at your choice
b1 = 0 # coef chosen at your choice
h = function(x) 250 - .25*x # h performs heteroscedasticity function (here I used a linear one)
i = function(x) b0 + .25*x # lower bound limit
y = rtruncnorm(500, a = i(x), mean = b0, sd = h(x))
#y = b0 + b1*x + eps - h(x)
dados <- data.frame(VU = y, Area = x)
#X <- X[which(X[, "y"] > 2400), ]
fit <- lm(VU ~ Area, data = dados)
fit1 <- rq(VU ~ Area, tau = .10, data = dados)
fit2 <- rq(VU ~ Area, tau = .25, data = dados)
fit3 <- rq(VU ~ Area, tau = .50, data = dados)
fit4 <- rq(VU ~ Area, tau = .75, data = dados)
fit5 <- rq(VU ~ Area, tau = .90, data = dados)
fit6 <- rq(VU ~ Area, tau = .95, data = dados)
par(mar=c(5,5,2,6), bg = "light blue")
plot(VU ~ Area, data = dados, main = "A importância de analisar todos os quantis",
xlab = expression("Área"~(m^2)),
ylab = expression("VU (R$/"~m^2~")"))
abline(fit, lty = 2, lwd = 2)
abline(fit1, col = "yellow")
abline(fit2, col = "red")
abline(fit3, col = "green")
abline(fit4, col = "blue")
abline(fit5, col = "darkred")
abline(fit6, col = "darkblue")
legend('topleft', xpd = TRUE, inset=c(1,0), bty = "n",
legend=c(".10", ".25", ".50", ".75", ".90", "0.95", "mean"),
col = c("yellow", "red", "green", "blue", "darkred", "darkblue", "black"),
lty = c(rep(1, 6), 2),
lwd = c(rep(1,6), 2),
title= expression(~tau), text.font=4, bg='lightblue')