4

长期读者,第一次海报。

我正在尝试对两个非常大的 SpatialPolygonsDataFrame 对象执行 gIntersection() 。第一个是所有美国县,第二个是 240 行 x 279 列网格,作为一系列 66,960 个多边形。

我仅使用宾夕法尼亚州和与 PA 重叠的网格部分成功地运行了这个:

gIntersection(PA, grid, byid=TRUE)

我试图在一夜之间为整个美国运行它,今天早上它仍在运行,我的硬盘上有一个 10 GB(!)的交换文件,没有任何进展的迹象。我做错了什么,还是这是正常行为,我应该做一个逐个状态的循环?

谢谢!

4

3 回答 3

4

比我希望的要晚一点,但这是我最终用于与此相关的任务的功能。它可能适用于其他应用程序。

@mdsumner 是正确的,丢弃不相交的高级操作大大加快了这一速度。希望这是有用的!

library("sp")
library("rgeos")
library("plyr")

ApportionPopulation <- function(AdminBounds, poly, Admindf) { # I originally wrote this function to total the population that lies within each polygon in a SpatialPolygon object. AdminBounds is a SpatialPolygon for whatever administrative area you're working with; poly is the SpatalPolygon you want to total population (or whatever variable of your choice) across, and Admindf is a dataframe that has data for each polygon inside the AdminBounds SpatialPolygon.

  # the AdminBounds have the administrative ID code as feature IDS. I set that up using spChFID()

  # start by trimming out areas that don't intersect

  AdminBounds.sub <- gIntersects(AdminBounds, poly, byid=TRUE) # test for areas that don't intersect
  AdminBounds.sub2 <- apply(AdminBounds.sub, 2, function(x) {sum(x)}) # test across all polygons in the SpatialPolygon whether it intersects or not
  AdminBounds.sub3 <- AdminBounds[AdminBounds.sub2 > 0] # keep only the ones that actually intersect

  # perform the intersection. This takes a while since it also calculates area and other things, which is why we trimmed out irrelevant areas first

  int <- gIntersection(AdminBounds.sub3, poly, byid=TRUE) # intersect the polygon and your administrative boundaries

  intdf <- data.frame(intname=names(int)) # make a data frame for the intersected SpatialPolygon, using names from the output list from int
  intdf$intname <- as.character(intdf$intname) # convert the name to character
  splitid <- strsplit(intdf$intname, " ", fixed=TRUE) # split the names
  splitid <- do.call("rbind", splitid) # rbind those back together
  colnames(splitid) <- c("adminID", "donutshpid") # now you have the administrative area ID and the polygonID as separate variables in a dataframe that correspond to the int SpatialPolygon.
  intdf <- data.frame(intdf, splitid) # make that into a dataframe
  intdf$adminID <- as.character(intdf$adminID) # convert to character
  intdf$donutshpid <- as.character(intdf$donutshpid) # convert to character. In my application the shape I'm using is a series of half-circles

  # now you have a dataframe corresponding to the intersected SpatialPolygon object

  intdf$polyarea <- sapply(int@polygons, function(x) {x@area}) # get area from the polygon SP object and put it in the df
  intdf2 <- join(intdf, Admindf, by="adminID") # join together the two dataframes by the administrative ID
  intdf2$popinpoly <- intdf2$pop * (intdf2$polyarea / intdf2$admin_area) # calculate the proportion of the population in the intersected area that is within the bounds of the polygon (assuming the population is evenly distributed within the administrative area)
  intpop <- ddply(intdf2, .(donutshpid), summarize, popinpoly=sum(popinpoly)) # sum population lying within each polygon

  # maybe do other final processing to get the output in the form you want

  return(intpop) # done!
}
于 2013-07-17T15:25:04.183 回答
1

我发现这个sf包更胜一筹:

out <- st_intersection(grid, polygons)

gIntersection将我的计算机锁定了几个小时试图运行,因此需要修剪或循环通过单个多边形,st_intersectionsf包中运行我的数据在几秒钟内。

st_intersection还会自动合并两个输入的数据框。

感谢塔斯马尼亚大学的格兰特威廉姆森的小插图:https ://atriplex.info/blog/index.php/2017/05/24/polygon-intersection-and-summary-with-sf/

于 2019-03-27T17:58:47.923 回答
0

rasterize使用包中raster的网格作为栅格,您可能会更快地得到答案。它有一个参数用于查找与单元格重叠的多边形数量。

 ?rasterize
 getCover: logical. If ‘TRUE’, the fraction of each grid cell that is
      covered by the polygons is returned (and the values of
      ‘field, fun, mask’, and ‘update’ are ignored. The fraction
      covered is estimated by dividing each cell into 100 subcells
      and determining presence/absence of the polygon in the center
      of each subcell

看起来您无法控制子单元的数量,尽管这可能并不难打开。

于 2013-06-04T20:22:57.643 回答