r - r 栅格包区域函数中的重复值

Question

嗨光栅战士！

经过数月的数据处理后，我对自己的结果感到头疼。我是 R 和空间分析的新手，但对我的学习过程非常满意。

这是我的问题：一旦我将 zonal 函数应用于一组 5 个栅格对象 (.tif)，我会以某种方式在我认为为零的区域中获得重复值。此外，一些值出现在其他单元格中。

这是我的代码：

folder <- "my_directory"

#for one country
ET<-raster("my_directory/file_ET.tif")
ED<-raster("my_directory/file_ED.tif")
SH<-raster("my_directory/file_SH.tif")
SC<-raster("my_directory/file_SC.tif") 
WH<-raster("my_directory/file_WH.tif")

#zones
lowbound<- seq(0,cellStats(ED,"max", na.rm=TRUE),1000)
upbound<-lowbound+1000
band<-upbound/1000
rclmat <- as.matrix(cbind(lowbound,upbound,band)) #to be used on reclassify
rclmat<-rclmat[1:length(rclmat[,1])-1,]
edb<- reclassify(ED, rclmat)
# zonal statistics # I already tried this: na.rm=FALSE, !anyDuplicated(sum_ET)) to avoid duplication – did not work 
sum_ET<-zonal(ET,edb,"sum") #,  na.rm=FALSE, !anyDuplicated(sum_ET))

stats<-cbind(band,upbound,sum_ET[,2], sum_SH[,2], sum_SC[,2], sum_WH[,2],perc_SH,perc_SC,perc_WH,perc_Tot)

这是我的结果：

    upbound ET
1   1000    10523272.53
2   2000    5156046.27
3   3000    5053895.54
4   4000    4796505.3
5   5000    4392162.97
6   6000    4156065.87
31  31000   10523272.53
32  32000   5156046.27
33  33000   5053895.54
34  34000   4796505.3
35  35000   4392162.97
36  36000   4156065.87

从 31 到 36 与从 1 到 6 相同。

这是我的一位同事的结果——我正在与之比较

            upbound ET
1   1000    10523272.53
2   2000    5156046.27
3   3000    5053895.54
4   4000    4796505.3
5   5000    4392162.97
6   6000    4156065.87
31  31000   0
32  32000   21247.26
33  33000   0
34  34000   0
35  35000   0
36  36000   23877.74

如您所见，我得到了重复的值。

输入可在此处找到ET、ED 文件

很感谢任何形式的帮助。非常感谢

迭戈

score 1 · Accepted Answer

首先让我通过一个简单的示例来解释您的代码中发生了什么：

# Create "zone" variable
zone=1:10

# Create ET dataframe, that does not have some zones:
dET <- data.frame(zone=c(1:5,7,8), ET=runif(7, min=1, max=10)) 
dET
#   zone       ET
# 1    1 5.776128
# 2    2 9.067579
# 3    3 9.874737
# 4    4 7.846662
# 5    5 3.588964
# 6    7 3.843509
# 7    8 8.916714

#merge zone vector and the second column from dET dataframe:
# As you can see some values in the second columnd of the result are repeated to match
# the length of the zone vector and the value that originally belonged to 7th zone was moved into 6th zone
# since there was no 6th zone in the dataframe
cbind(zone, dET[,2])
#      zone         
# [1,]    1 5.776128
# [2,]    2 9.067579
# [3,]    3 9.874737
# [4,]    4 7.846662
# [5,]    5 3.588964
# [6,]    6 3.843509
# [7,]    7 8.916714
# [8,]    8 5.776128
# [9,]    9 9.067579
# [10,]   10 9.874737

#Instead you should use merge and then fill the missing values with zeros:
result <- merge(data.frame(zone=zone), dET, by="zone", all=TRUE)
result
#    zone       ET
# 1     1 5.776128
# 2     2 9.067579
# 3     3 9.874737
# 4     4 7.846662
# 5     5 3.588964
# 6     6       NA
# 7     7 3.843509
# 8     8 8.916714
# 9     9       NA
# 10   10       NA

result$ET[is.na(result$ET)] <- 0
#    zone       ET
# 1     1 5.776128
# 2     2 9.067579
# 3     3 9.874737
# 4     4 7.846662
# 5     5 3.588964
# 6     6 0.000000
# 7     7 3.843509
# 8     8 8.916714
# 9     9 0.000000
# 10   10 0.000000

以下是如何更正代码。在这里，我只使用了您的 2 个文件 ED 和 ET。所有其他的都应该以同样的方式修改：

library(raster)
library(rgdal)

ED<-raster("file_ED.tif")
ET<-raster("file_ET.tif")

lowbound<- seq(0,cellStats(ED,"max", na.rm=TRUE),1000)
upbound<-lowbound+1000
band<-upbound/1000
rclmat <- as.matrix(cbind(lowbound,upbound,band)) #to be used on reclassify

# Commented the following line since cellStats(ED,"max", na.rm=TRUE)=35125.57
# so you should not remove the values in the range 35K-36K
#rclmat<-rclmat[1:length(rclmat[,1])-1,]

ener_dens_band<- reclassify(ED, rclmat)
sum_ET<-zonal(ET,ener_dens_band,"sum")

# Let's look at the tail of the result:
tail(sum_ET)
#       zone      sum
# [25,]   26 27011.60
# [26,]   27 53905.17
# [27,]   28 18490.15
# [28,]   29 19322.07
# [29,]   32 21247.26
# [30,]   36 23877.74

# As you can see some zones are missing: 30, 31, 33-35
# So they need to be added
sum_ET <- merge(data.frame(zone=1:36), sum_ET, by="zone", all=TRUE)
sum_ET$sum[is.na(sum_ET$sum) ] <- 0

tail(sum_ET, n=10)
#    zone      sum
# 27   27 53905.17
# 28   28 18490.15
# 29   29 19322.07
# 30   30     0.00
# 31   31     0.00
# 32   32 21247.26
# 33   33     0.00
# 34   34     0.00
# 35   35     0.00
# 36   36 23877.74

您需要对其他列重复相同的操作，然后可以使用 cbind 将它们全部放在一个数据框中。

r - r 栅格包区域函数中的重复值

1 回答 1

Related