I wonder how kernel density estimation (the density function) in R is implemented, because I see some odd behavior in my plots.
dates5.csv is nothing more than:
day
2013-01-02
2013-03-01
i.e. two dates. Now I read in the data, compute a rectangular KDE from it, and get the plot below:
data <- read.csv("dates5.csv", header = TRUE)
days <- data$day
daysPosix <- as.POSIXct(days, tz = "Europe/Zurich")
# compute density (bandwidth of 7 days, expressed in seconds)
ds <- density(as.numeric(daysPosix),
              bw = 3600 * 24 * 7,
              kernel = "rectangular",
              cut = 3)
plot(ds, xaxt = "n", xlab = "", ylab = "", ylim = c(0, max(ds$y)),
     main = "Temporal density (uniform kernel,\nbandwidth = 7 days)")
# mark the two observations on the x axis
points(x = as.numeric(daysPosix),
       y = rep(0, length(daysPosix)),
       pch = "|",
       col = "#00000080")
# weekly tick marks between the first and last date
times.seq <- seq(daysPosix[1],
                 daysPosix[length(daysPosix)],
                 by = "weeks")
labels <- strftime(times.seq, "%d.%m.%y")
axis(1, times.seq, labels)
The tick marks on the x axis are one week apart. At first sight the plot makes sense: two rectangular shapes are built on top of the two points. Still, there are two things I don't understand. Why is each "shape" a bit more than 3 weeks wide, and not 7 days as I would expect from the bandwidth? And why do the shapes have steep "cliffs" on both sides rather than vertical ones?
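A possible way to investigate both questions, offered as a sketch rather than a definitive account of the implementation: the help page for density() says the kernels are scaled so that bw is the standard deviation of the smoothing kernel. For a uniform kernel with standard deviation bw, the half-width is bw * sqrt(3), which would give a full width of about 24 days for bw = 7 days. The estimate is also only returned at n = 512 grid points by default, and plot() joins them with straight lines, so the edges could look steep rather than vertical. The variable names below (half_width_days, full_width_days, grid_step_days, measured_width_days) are illustrative and not part of the original script.

```r
# The two timestamps from dates5.csv, as seconds since the epoch
x  <- as.numeric(as.POSIXct(c("2013-01-02", "2013-03-01"), tz = "Europe/Zurich"))
bw <- 3600 * 24 * 7            # 7 days in seconds
secs_per_day <- 3600 * 24

ds <- density(x, bw = bw, kernel = "rectangular", cut = 3)

# Assumption taken from ?density: bw is the kernel's standard deviation.
# A uniform distribution with sd = bw has half-width bw * sqrt(3).
half_width_days <- bw * sqrt(3) / secs_per_day
full_width_days <- 2 * half_width_days

# Spacing of the output grid: the curve is only known at these points,
# so a vertical edge is drawn as a line between two neighbouring points.
grid_step_days <- diff(ds$x[1:2]) / secs_per_day

# Rough empirical check: width of one rectangle, measured as the part of
# the grid above half the maximum height, divided by the two rectangles.
measured_width_days <- sum(ds$y > max(ds$y) / 2) * grid_step_days / 2

print(c(expected = full_width_days,
        measured = measured_width_days,
        grid_step = grid_step_days))
```

If the measured width comes out close to the expected one, that would support the reading that a "7 day" bandwidth means a rectangle roughly 3.5 weeks wide. Increasing n in the density() call should make the edges look steeper, which would be consistent with the grid explanation.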