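When extracting the values of a raster to points, I found that I get several NAs. Rather than using the buffer and fun arguments of the extract function, I would like to extract the nearest non-NA pixel for the points that fall on NA pixels.
I am using the basic extract function: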
data.extr<-extract(loc.thr, data[,11:10])
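Here is a solution that does not use a buffer. However, it computes a distance map separately for each point in the dataset, so it may be slow if the dataset is large.
library(raster)
set.seed(2)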
# create a 10x10 raster
r <- raster(ncol=10,nrow=10, xmn=0, xmx=10, ymn=0,ymx=10)
r[] <- 1:10
r[sample(1:ncell(r), size = 25)] <- NA
# plot the raster
plot(r, axes=F, box=F)
segments(x0 = 0, y0 = 0:10, x1 = 10, y1 = 0:10, lty=2)
segments(y0 = 0, x0 = 0:10, y1 = 10, x1 = 0:10, lty=2)
# create sample points and add them to the plot
xy = data.frame(x=runif(10,1,10), y=runif(10,1,10))
points(xy, pch=3)
text(x = xy$x, y = xy$y, labels = as.character(1:nrow(xy)), pos=4, cex=0.7, xpd=NA)
# use normal extract function to show that NAs are extracted for some points
extracted = extract(x = r, y = xy)
# then take the raster value with lowest distance to point AND non-NA value in the raster
sampled = apply(X = xy, MARGIN = 1, FUN = function(xy) r@data@values[which.min(replace(distanceFromPoints(r, xy), is.na(r), NA))])
# show output of both procedures
print(data.frame(xy, extracted, sampled))
#           x        y extracted sampled
# 1  5.398959 6.644767         6       6
# 2  2.343222 8.599861        NA       3
# 3  4.213563 3.563835         5       5
# 4  9.663796 7.005031        10      10
# 5  2.191348 2.354228        NA       2
# 6  1.093731 9.835551         2       2
# 7  2.481780 3.673097         3       3
# 8  8.291729 2.035757         9       9
# 9  8.819749 2.468808         9       9
# 10 5.628536 9.496376         6       6
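Because the solution above builds a full distance raster for every point, a cheaper variant is possible when only a few points fall on NA cells. The following is a minimal sketch, not part of the original answer. The helper name extract_nearest is hypothetical, and the sketch assumes planar (projected) coordinates, so it uses plain Euclidean distance to the cell centres; a lon/lat raster would need geodesic distances instead. It computes the centres of the non-NA cells once and only searches for a neighbour where the ordinary extraction returned NA.

```r
library(raster)

set.seed(2)
# same toy raster as above, but with a projected CRS (assumption of this sketch)
r <- raster(ncol = 10, nrow = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10,
            crs = '+proj=utm +zone=1')
r[] <- 1:10
r[sample(1:ncell(r), size = 25)] <- NA
xy <- data.frame(x = runif(10, 1, 10), y = runif(10, 1, 10))

# hypothetical helper, not a function of the raster package
extract_nearest <- function(r, xy) {
  xy <- as.matrix(xy)
  vals <- raster::extract(r, xy)        # ordinary extraction first
  v <- values(r)                        # all cell values as a plain vector
  ok <- which(!is.na(v))                # cell numbers of the non-NA cells
  centres <- xyFromCell(r, ok)          # their centre coordinates, computed once
  for (i in which(is.na(vals))) {       # only revisit points that fell on NA
    d2 <- (centres[, 1] - xy[i, 1])^2 + (centres[, 2] - xy[i, 2])^2
    vals[i] <- v[ok[which.min(d2)]]     # value of the closest non-NA cell centre
  }
  vals
}

sampled <- extract_nearest(r, xy)
print(data.frame(xy, sampled))
```

Like the distanceFromPoints approach, this measures from the point itself to the cell centres, so it takes the position of the point within its pixel into account.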
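Here is a raster-based solution that first fills the NA pixels with the value of the nearest non-NA pixel. Note, however, that it does not take the position of a point within its pixel into account. Instead, it computes distances between pixel centres to determine the nearest non-NA pixel.
First, for each NA raster pixel it calculates the distance and direction to the nearest non-NA pixel. The next step is to calculate the coordinates of that non-NA cell (assuming a projected CRS), extract its value, and store that value at the NA location.
Starting data: a projected raster with the same values as in koekenbakker's answer:
library(raster)
set.seed(2)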
# set projected CRS
r <- raster(ncol=10,nrow=10, xmn=0, xmx=10, ymn=0,ymx=10, crs='+proj=utm +zone=1')
r[] <- 1:10
r[sample(1:ncell(r), size = 25)] <- NA
# create sample points
xy = data.frame(x=runif(10,1,10), y=runif(10,1,10))
# use normal extract function to show that NAs are extracted for some points
extracted <- raster::extract(x = r, y = xy)
Calculate the distance and direction from every NA pixel to the nearest non-NA pixel:
dist <- distance(r)
# you can also set a maximum distance: dist[dist > maxdist] <- NA
direct <- direction(r, from=FALSE)
Retrieve the coordinates of the NA pixels:
# NA raster
rna <- is.na(r) # logical raster: TRUE where r is NA
# store coordinates in new raster: https://stackoverflow.com/a/35592230/3752258
na.x <- init(rna, 'x')
na.y <- init(rna, 'y')
# calculate coordinates of the nearest Non-NA pixel
# assume that we have an orthogonal, projected CRS, so we can use (Pythagorean) calculations
co.x <- na.x + dist * sin(direct)
co.y <- na.y + dist * cos(direct)
# matrix with point coordinates of nearest non-NA pixel
co <- cbind(co.x[], co.y[])
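The sin/cos pairing in the two lines above can look reversed at first. It follows if the direction is a compass-style azimuth in radians, measured clockwise from north, which is what this sketch assumes about the output of direction(). The helper name offset_by_bearing is hypothetical and only illustrates the arithmetic in base R:

```r
# hypothetical helper: move `dist` map units from (x, y) along an azimuth
# given in radians, measured clockwise from north (assumption of this sketch)
offset_by_bearing <- function(x, y, dist, bearing) {
  cbind(x = x + dist * sin(bearing),
        y = y + dist * cos(bearing))
}

# a cell centre at (2.5, 3.5) and three possible neighbours
north      <- offset_by_bearing(2.5, 3.5, dist = 1,       bearing = 0)          # about (2.5, 4.5)
east       <- offset_by_bearing(2.5, 3.5, dist = 1,       bearing = pi / 2)     # about (3.5, 3.5)
south_west <- offset_by_bearing(2.5, 3.5, dist = sqrt(2), bearing = 5 * pi / 4) # about (1.5, 2.5)
print(rbind(north, east, south_west))
```

With a mathematical angle (counter-clockwise from the x axis) the roles of sin and cos would be swapped, so it is worth checking the convention before reusing the formula with another source of angles.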
Extract the values of the nearest non-NA cells at the coordinates in co:
# extract values of nearest non-NA cell with coordinates co
NAVals <- raster::extract(r, co, method='simple')
r.NAVals <- rna # initiate new raster
r.NAVals[] <- NAVals # store values in raster
Fill the original raster with the new values:
# cover nearest non-NA value at NA locations of original raster
r.filled <- cover(x=r, y= r.NAVals)
sampled <- raster::extract(x = r.filled, y = xy)
# compare old and new values
print(data.frame(xy, extracted, sampled))
#           x        y extracted sampled
# 1  5.398959 6.644767         6       6
# 2  2.343222 8.599861        NA       3
# 3  4.213563 3.563835         5       5
# 4  9.663796 7.005031        10      10
# 5  2.191348 2.354228        NA       3
# 6  1.093731 9.835551         2       2
# 7  2.481780 3.673097         3       3
# 8  8.291729 2.035757         9       9
# 9  8.819749 2.468808         9       9
# 10 5.628536 9.496376         6       6
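Note that the value for point 5 differs from Koekenbakker's answer, because this method does not take the position of the point within its pixel into account (as noted above). If that matters, this solution may not be appropriate. In other cases, for example when the raster cells are small compared with the accuracy of the points, this raster-based approach should give good results.
For a raster stack, use @koekenbakker's solution above and turn it into a function. The @layers slot of a raster stack is a list of rasters, so lapply over it and go from there.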
#new layer
r2 <- raster(ncol=10,nrow=10, xmn=0, xmx=10, ymn=0,ymx=10)
r2[] <- 1:10
r2[sample(1:ncell(r2), size = 25)] <- NA
#make the stack
r_stack <- stack(r, r2)
#a function for sampling
sample_raster_NA <- function(r, xy){
apply(X = xy, MARGIN = 1,
FUN = function(xy) r@data@values[which.min(replace(distanceFromPoints(r, xy), is.na(r), NA))])
}
#lapply to get answers
lapply(r_stack@layers, function(a_layer) sample_raster_NA(a_layer, xy))
Or, to be fancy (a speed improvement?):
purrr::map(r_stack@layers, sample_raster_NA, xy=xy)
Which makes me wonder whether the whole thing could be sped up using dplyr...
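The per-layer loop above reaches directly into the @layers slot. As a hedged alternative sketch, the unstack() function from the raster package returns the same list of layers without touching the slot, and sapply() collects the per-layer results into one matrix with a column per layer. The helper make_layer is hypothetical and only rebuilds toy layers like the ones above; taking the first element of which.min() is a small guard against ties that the original function does not have.

```r
library(raster)

set.seed(2)
# hypothetical helper that builds one toy layer with 25 random NA cells
make_layer <- function() {
  r <- raster(ncol = 10, nrow = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
  r[] <- 1:10
  r[sample(1:ncell(r), size = 25)] <- NA
  r
}
r_stack <- stack(make_layer(), make_layer())
xy <- data.frame(x = runif(10, 1, 10), y = runif(10, 1, 10))

# same idea as sample_raster_NA above, written out step by step
sample_raster_NA <- function(r, xy) {
  apply(X = xy, MARGIN = 1, FUN = function(p) {
    d <- replace(distanceFromPoints(r, p), is.na(r), NA)  # mask NA cells
    r[which.min(d)[1]]                                     # first cell if tied
  })
}

# one column per layer, one row per point
sampled <- sapply(unstack(r_stack), sample_raster_NA, xy = xy)
colnames(sampled) <- names(r_stack)
print(sampled)
```

This does not make the computation faster; it only keeps the code on documented accessor functions and returns the results in a shape that is easy to bind back onto the points.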