
I wonder how the KDE function in R is implemented, because I saw some weird stuff in my plots.

dates5.csv is nothing more than:

  day
2013-01-02
2013-03-01

i.e. two dates. Now I read in the data, compute a rectangular KDE out of it, and get the plot below:

  data <- read.csv("dates5.csv", header = TRUE)
  days <- data$day

  daysPosix <- as.POSIXct(days, tz = "Europe/Zurich")

  # compute density (bandwidth given in seconds: 7 days)
  ds <- density(as.numeric(daysPosix),
                bw = 3600 * 24 * 7,
                kernel = "rectangular",
                cut = 3)
  plot(ds, xaxt = "n", xlab = "", ylab = "", ylim = c(0, max(ds$y)),
       main = "Temporal density (uniform kernel,\nbandwidth = 7 days)")

  # mark the two observations on the x axis
  points(x = as.numeric(daysPosix),
         y = rep(0, length(daysPosix)),
         pch = "|",
         col = "#00000080")
  # weekly tick marks between the first and the last date
  times.seq <- seq(daysPosix[1],
                   daysPosix[length(daysPosix)],
                   by = "weeks")
  labels <- strftime(times.seq, "%d.%m.%y")
  axis(1, times.seq, labels)

[Plot: temporal density (uniform kernel, bandwidth = 7 days)]

The tick marks on the x axis are one week apart. At first sight the plot makes sense: a rectangular shape is built on top of each of the two points. Still, there are two things I don't understand. Why does each "shape" span a bit more than 3 weeks rather than the 7 days I expected from the bandwidth? And why do the shapes have steep "cliffs" on both sides instead of vertical ones?


1 Answer


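?density notes:

bw: the smoothing bandwidth to be used. The kernels are scaled such that this is the standard deviation of the smoothing kernel.

A uniform distribution of total width w has standard deviation w/sqrt(12), so with bw set to 7 days the rectangle is 7 * sqrt(12), about 24 days, wide, which is the "bit more than 3 weeks" you see. So bw = 3600*24*7/sqrt(12) should give shapes that are one week wide. In other words, you need to "deflate" your bandwidth so that when density inflates it you get the width you want. You could also set adjust = 1/sqrt(12).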
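The scaling above can be written down as a small sketch. The helper names width_for_bw and bw_for_width are made up for this illustration and are not part of R; they only encode the w/sqrt(12) relation for a uniform distribution.

```r
# One week in seconds, the unit used for the numeric POSIXct values.
week <- 3600 * 24 * 7

# Hypothetical helpers (not part of R): a rectangular kernel whose standard
# deviation is bw has total width bw * sqrt(12), and the other way round.
width_for_bw <- function(bw) bw * sqrt(12)
bw_for_width <- function(w) w / sqrt(12)

# Total width, in weeks, of the rectangle when bw is one week (about 3.46).
width_for_bw(week) / week

# Deflating the bandwidth first brings the width back to one week.
width_for_bw(bw_for_width(week)) / week
```

Passing bw_for_width(week) as bw should be equivalent to keeping bw = week and setting adjust = 1/sqrt(12).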

To make the shapes drop vertically, increase n to raise the resolution at which the density is computed, e.g. n = 2^15.
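To get a feel for the resolution, here is a rough sketch comparing the spacing of the evaluation grid for the default n = 512 and for n = 2^15. It recreates the two dates from the question directly and assumes UTC instead of Europe/Zurich to stay self-contained; step_hours is a helper made up for this example.

```r
# The two dates from the question as seconds since the epoch (UTC assumed).
x <- as.numeric(as.POSIXct(c("2013-01-02", "2013-03-01"), tz = "UTC"))
week <- 3600 * 24 * 7

coarse <- density(x, bw = week, kernel = "rectangular", cut = 3)  # default n = 512
fine   <- density(x, bw = week, kernel = "rectangular", cut = 3, n = 2^15)

# Hypothetical helper: distance between neighbouring grid points, in hours.
step_hours <- function(d) diff(d$x[1:2]) / 3600

step_hours(coarse)  # several hours per grid step
step_hours(fine)    # a few minutes per grid step
```

plot() joins neighbouring grid points with straight lines, so each edge is drawn across at least one grid step; a finer grid makes that ramp too narrow to see.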

So change your density call to:

  ds <- density(as.numeric(daysPosix), 
                bw = 3600 * 24 * 7 / sqrt(12),
                kernel = "rectangular",
                cut = 3, n=2^15)

And check the width of the shape:

  which(abs(diff(ds$y)) > max(ds$y)/2) # approximate locations of the edges
  [1]  1197  4469 28299 31571
  (ds$x[4469] - ds$x[1197]) / (3600*24*7)
  [1] 1.00034
answered 2013-05-29T14:28:45.730