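I received some customer data as latitude, longitude, and count. All the data needed to create a ggplot heat map is there, but I don't know how to get it into the format ggplot requires.

I am trying to aggregate the data by total count within 0.01 Lat by 0.01 Lon blocks (a typical heat map), and my first instinct was "tapply". This produces a nice summary at the desired block size, but in the wrong format. I would also really like empty Lat or Lon blocks to be included as zeros, even where there is nothing there... otherwise the heat map ends up looking odd.

I have created a subset of my data in the code below for reference: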

# m is the matrix of data provided
m = matrix(c(44.9591051,44.984884,44.984884,44.9811399,
           44.9969096,44.990894,44.9797023,44.983334,
          -93.3120017,-93.297668,-93.297668,-93.2993524,
          -93.2924484,-93.282462,-93.2738911,-93.26667,
          69,147,137,22,68,198,35,138), nrow=8, ncol=3) 
colnames(m) <- c("Lat", "Lon", "Count")
m <- as.data.frame(m)
s = as.data.frame((tapply(m$Count, list(round(m$Lon,2), round(m$Lat,2)), sum)))
s[is.na(s)] <- 0

# Data frame "s" has all the data, but not exactly in the format desired...
# First, it has a column for each latitude, instead of one column for Lon
# and one for Lat, and second, it needs to have 0 as the entry data for 
# Lat / Lon pairs that have no other data. As it is, there are only zeroes
# when one of the other entries has a Lat or Lon that matches... if there
# are no entries for a particular Lat or Lon value, then nothing at all is
# reported.

desired.format = matrix(c(44.96,44.96,44.96,44.96,44.96,
    44.97,44.97,44.97,44.97,44.97,44.98,44.98,44.98,
    44.98,44.98,44.99,44.99,44.99,44.99,44.99,45,45,
    45,45,45,-93.31,-93.3,-93.29,-93.28,-93.27,-93.31,
    -93.3,-93.29,-93.28,-93.27,-93.31,-93.3,-93.29,
    -93.28,-93.27,-93.31,-93.3,-93.29,-93.28,-93.27,
    -93.31,-93.3,-93.29,-93.28,-93.27,69,0,0,0,0,0,0,
    0,0,0,0,306,0,0,173,0,0,0,198,0,0,0,68,0,0),
    nrow=25, ncol=3)

colnames(desired.format) <- c("Lat", "Lon", "Count")
desired.format <- as.data.frame(desired.format)

library(ggmap)  # provides get_map() and ggmap()

minneapolis = get_map(location = "minneapolis, mn", zoom = 12)
ggmap(minneapolis) + geom_tile(data = desired.format, aes(x = Lon, y = Lat, alpha = Count), fill="red")

1 Answer

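Here is a stab at it with geom_hex and stat_density2d. The idea of making bins by truncating coordinates makes me a bit uneasy.

What you have is count data with a lat/long attached, which means that ideally you would want a weight parameter, but as far as I know that is not implemented for geom_hex. Instead, we hack it by repeating each row according to its count variable, similar to the approach here.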

  ## hack job to repeat records to full count
  m <- as.data.frame(m)
  # repeat row i Count[i] times, so each row stands for one observation
  m_long <- m[rep(seq_len(nrow(m)), m$Count), ]


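  ## stat_density2d
  library(ggplot2)
  # note: x = Lat, y = Lon here; the base-map version further down uses x = Lon, y = Lat
  # (newer ggplot2 prefers after_stat(level) over ..level..)
  ggplot(m_long, aes(Lat, Lon)) + 
  stat_density2d(aes(alpha=..level.., fill=..level..), size=2, 
                 bins=10, geom="polygon") +   # geom takes a single name
  scale_fill_gradient(low = "blue", high = "red") +
  geom_density2d(colour="black", bins=10) +
  geom_point(data = m_long)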


  ## geom_hex alternative (needs the hexbin package installed)
  bins=6
  ggplot(m_long, aes(Lat, Lon)) + 
  geom_hex(bins=bins)+
  coord_equal(ratio = 1/1)+
  scale_fill_gradient(low = "blue", high = "red") +
  geom_point(data = m_long,position = "jitter")+
  # label each hexagon with its count; the constant size=3.5 sets the text size
  stat_binhex(aes(label=..count..), size=3.5,geom="text", bins=bins, colour="white")

These produce, respectively, a density contour plot and a binned (hexagon) version.
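
As a quick check on the row-repetition hack, the sketch below uses only the first three rows of the question's data (two of which share a location) and confirms that a plain row count on the expanded data matches the summed Count on the compact data. The names `pts`, `pts_long`, `weighted`, `counted`, and `same` are illustrative and not from the original post.

```r
# First three rows of the question's data; rows 2 and 3 share a location.
pts <- data.frame(
  Lat   = c(44.9591051, 44.984884, 44.984884),
  Lon   = c(-93.3120017, -93.297668, -93.297668),
  Count = c(69, 147, 137)
)

# Repeat row i Count[i] times, as in the hack above.
pts_long <- pts[rep(seq_len(nrow(pts)), pts$Count), ]

# Weighted total per location from the compact data ...
key      <- paste(pts$Lat, pts$Lon)
weighted <- tapply(pts$Count, key, sum)

# ... versus a plain row count per location in the expanded data.
key_long <- paste(pts_long$Lat, pts_long$Lon)
counted  <- table(key_long)

# TRUE when both approaches agree for every location.
same <- all(as.vector(weighted[names(counted)]) == as.vector(counted))
```

This is why an unweighted binning or density layer on `m_long` behaves like a count-weighted one on `m`, at the cost of a data frame with `sum(Count)` rows.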

Edit:

With a base map:

# `map` is not defined in this post; it is assumed to be a ggmap object,
# e.g. map <- ggmap(minneapolis) built from the question's get_map() call
map + 
  stat_density2d(data = m_long,
                 aes(x = Lon, y = Lat, alpha=..level.., fill=..level..), 
                 size=2, 
                 bins=10, 
                 geom="polygon",   # geom takes a single name
                 inherit.aes=FALSE) + 
  scale_fill_gradient(low = "blue", high = "red") +
  geom_density2d(data = m_long, aes(x = Lon, y=Lat),
                 colour="black", bins=10,inherit.aes=FALSE) +
  geom_point(data = m_long, aes(x = Lon, y=Lat),inherit.aes=FALSE)


## and the hexbin map...

map + 
  geom_hex(bins=bins, data = m_long, aes(x = Lon, y = Lat), alpha=.5,
           inherit.aes=FALSE) + 
  geom_point(data = m_long, aes(x = Lon, y=Lat),
             inherit.aes=FALSE, position = "jitter") +
  scale_fill_gradient(low = "blue", high = "red")
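
If the zero-filled rectangular grid from the question is still wanted, it can be built directly in base R. The following is a minimal sketch rather than a definitive implementation: it assumes 0.01-degree blocks obtained by rounding (as in the question), and the helper columns `lat_i` and `lon_i` and the names `binned`, `grid`, `full`, and `heat` are illustrative, not from the original post. Bins are keyed by integer hundredths of a degree so that the join does not depend on floating-point equality.

```r
# Sample data from the question.
m <- data.frame(
  Lat   = c(44.9591051, 44.984884, 44.984884, 44.9811399,
            44.9969096, 44.990894, 44.9797023, 44.983334),
  Lon   = c(-93.3120017, -93.297668, -93.297668, -93.2993524,
            -93.2924484, -93.282462, -93.2738911, -93.26667),
  Count = c(69, 147, 137, 22, 68, 198, 35, 138)
)

# Integer hundredths of a degree: exact keys for each 0.01 x 0.01 block.
m$lat_i <- as.integer(round(m$Lat * 100))
m$lon_i <- as.integer(round(m$Lon * 100))

# Sum the counts within each occupied block.
binned <- aggregate(Count ~ lat_i + lon_i, data = m, FUN = sum)

# Every block in the bounding box, occupied or not.
grid <- expand.grid(
  lat_i = seq(min(m$lat_i), max(m$lat_i)),
  lon_i = seq(min(m$lon_i), max(m$lon_i))
)

# Left join onto the full grid, then turn unmatched blocks into zeros.
full <- merge(grid, binned, by = c("lat_i", "lon_i"), all.x = TRUE)
full$Count[is.na(full$Count)] <- 0

# Back to degrees, in the long Lat / Lon / Count layout.
heat <- data.frame(Lat = full$lat_i / 100,
                   Lon = full$lon_i / 100,
                   Count = full$Count)
heat <- heat[order(heat$Lat, heat$Lon), ]
```

The resulting `heat` data frame has one row per block and could be passed to the question's `geom_tile(data = ..., aes(x = Lon, y = Lat, alpha = Count), fill = "red")` call in place of `desired.format`.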

Answered 2014-07-07T16:46:11.677