
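I have two data frames: one with the GPS positions of my boat (5512 records) and one with the positions of fishing boats (35381 records). I want to calculate the distance between my boat and every other fishing boat that was in the area at the same time of day (to the minute).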

I created an IDdatecode (yyyymmddhhmm) for every position and then merged the two data frames on that IDdatecode, like this:

merged_table<- merge(myboat,fishboats,by="IDdatecode",all.y=TRUE)

To calculate the distance, I used the following call:

merged_table$distance_between_vessels=distm(c("lon1","lat1"),c("lon2","lat2"),fun=distGeo)

where lon1, lat1 is the position of my boat and lon2, lat2 is the position of the fishing boat.

But I get the following error:

Error in `$<-.data.frame`(`*tmp*`, "distance_between_vessels", value = NA_real_) : 
  replacement has 1 row, data has 35652
In addition: Warning messages:
1: In .pointsToMatrix(x) : NAs introduced by coercion
2: In .pointsToMatrix(y) : NAs introduced by coercion

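What I have tried so far:

  1. Using this other call instead: merged_table$distance_between_vessels=distGeo(c("lon1","lat1"),c("lon2","lat2"))
  2. Converting all latitude and longitude columns with as.numeric
  3. Using only the time intervals in which both my boat and the fishing boats were present
  4. Ignoring the warnings and carrying on

But I still only get a list of NAs.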

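I used the function distGeo on a simpler dataset (my boat's positions only), where I manually calculated the distance between the first and second points, then between the second and third, and so on. The function worked perfectly and gave me the correct distance between two points (I checked it in ArcGIS). This is what I did: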

distGeo(mydata[1, ], mydata[2, ])
distGeo(mydata[2, ], mydata[3, ])
distGeo(mydata[3, ], mydata[4, ])

So I want to calculate "one-to-many" distances based on the unique time of day, but I get the error above. Any ideas why? Thanks :)

Here are the first 10 rows of my merged table:

structure(list(Record = 1:10, IDdatecode = structure(c(1L, 2L, 
3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L), .Label = c("d201805081203", 
"d201805081204", "d201805081205", "d201805081206", "d201805081207", 
"d201805081208"), class = "factor"), lon1 = c(12.40203333, 12.4071, 
12.41165, 12.41165, 12.41485, 12.41485, 12.41663333, 12.41663333, 
12.41841667, 12.41841667), lat1 = c(45.1067, 45.10921667, 45.11218333, 
45.11218333, 45.11303333, 45.11303333, 45.11313333, 45.11313333, 
45.11348333, 45.11348333), boat1 = structure(c(1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L), .Label = "RB", class = "factor"), lon2 = c(13.02718, 
13.02585827, 13.02453654, 13.02173, 13.02321482, 13.02052301, 
13.02189309, 13.01931602, 13.02057136, 13.01810904), lat2 = c(44.98946, 
44.99031749, 44.99117498, 44.98792, 44.99203246, 44.98868065, 
44.99288995, 44.98944129, 44.99374744, 44.99020194), boat2 = structure(c(1L, 
1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("IMPERO II", 
"MISTRAL"), class = "factor")), .Names = c("Record", "IDdatecode", 
"lon1", "lat1", "boat1", "lon2", "lat2", "boat2"), row.names = c(NA, 
-10L), class = "data.frame")

1 Answer


V2, update (17 January 2022)

Glad it worked for you. If you would rather avoid for loops, you could consider a dplyr approach. Have a look:

  library(dplyr)
  
  df <- silvia %>%
    rowwise() %>% 
    mutate(distance = geosphere::distGeo(c(lon1, lat1), c(lon2, lat2)))
  df

The base R apply family would be another option.
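
As a minimal sketch of that base R alternative (an illustration, not code from the original reprex): mapply() walks the four coordinate columns in parallel and calls geosphere::distGeo() once per row. The data frame `boats` is a hypothetical stand-in built from the first three rows of the data in the question.

```r
library(geosphere)

# Hypothetical stand-in: first three rows of the merged table from the question
boats <- data.frame(
  lon1 = c(12.40203333, 12.4071, 12.41165),
  lat1 = c(45.1067, 45.10921667, 45.11218333),
  lon2 = c(13.02718, 13.02585827, 13.02453654),
  lat2 = c(44.98946, 44.99031749, 44.99117498)
)

# mapply() passes the i-th element of each column to the function,
# so distGeo() receives one (lon, lat) pair per boat on every call
boats$distance <- mapply(
  function(lon1, lat1, lon2, lat2) distGeo(c(lon1, lat1), c(lon2, lat2)),
  boats$lon1, boats$lat1, boats$lon2, boats$lat2
)

boats
```

The result should match the row-by-row approaches, with distances in metres.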


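V1 (16 January 2022)

I hope this approach helps. For loops are usually discouraged, but I used one here because they are easy to follow.

I made the following assumptions:

  • boat1 is your boat; lat1 and lon1 give the position of boat1 at each IDdatecode;
  • since I do not fully understand what "based on the unique time of day" means, I assumed that looping over every row is sufficient;
  • the function distGeo() comes from the geosphere package.
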
# loading your dataframe as "silvia"
silvia <- structure(list(Record = 1:10, IDdatecode = structure(c(1L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L),
          .Label = c("d201805081203","d201805081204", "d201805081205", "d201805081206", "d201805081207", "d201805081208"),
          class = "factor"), lon1 = c(12.40203333, 12.4071, 12.41165, 12.41165, 12.41485, 12.41485, 12.41663333, 
          12.41663333, 12.41841667, 12.41841667), lat1 = c(45.1067, 45.10921667, 45.11218333, 45.11218333, 45.11303333, 
          45.11303333, 45.11313333, 45.11313333, 45.11348333, 45.11348333), boat1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
          1L, 1L, 1L), .Label = "RB", class = "factor"), lon2 = c(13.02718, 13.02585827, 13.02453654, 13.02173, 13.02321482,
          13.02052301, 13.02189309, 13.01931602, 13.02057136, 13.01810904), lat2 = c(44.98946, 44.99031749, 44.99117498, 44.98792,
          44.99203246, 44.98868065, 44.99288995, 44.98944129, 44.99374744, 44.99020194), boat2 = structure(c(1L, 1L, 1L, 2L,
          1L, 2L, 1L, 2L, 1L, 2L), .Label = c("IMPERO II", "MISTRAL"), class = "factor")), .Names = c("Record", "IDdatecode", 
          "lon1", "lat1", "boat1", "lon2", "lat2", "boat2"), row.names = c(NA, -10L), class = "data.frame")


# for EACH ROW in "silvia" calculate the distance between c("lon1", "lat1") and c("lon2", "lat2")
for (i in seq_len(nrow(silvia))) {

  silvia$distance[i] <- geosphere::distGeo(c(silvia[i, "lon1"], silvia[i,"lat1"]), 
                                c(silvia[i, "lon2"], silvia[i,"lat2"])) 

}


# here you see the first 5 entries of the df "silvia"
# the distances are given in metres
# the parameters a and f default to the WGS84 ellipsoid
head(silvia, n=5)
#>   Record    IDdatecode     lon1     lat1 boat1     lon2     lat2     boat2
#> 1      1 d201805081203 12.40203 45.10670    RB 13.02718 44.98946 IMPERO II
#> 2      2 d201805081204 12.40710 45.10922    RB 13.02586 44.99032 IMPERO II
#> 3      3 d201805081205 12.41165 45.11218    RB 13.02454 44.99117 IMPERO II
#> 4      4 d201805081205 12.41165 45.11218    RB 13.02173 44.98792   MISTRAL
#> 5      5 d201805081206 12.41485 45.11303    RB 13.02321 44.99203 IMPERO II
#>   distance
#> 1 50943.77
#> 2 50503.93
#> 3 50118.46
#> 4 50005.52
#> 5 49774.51

Created on 2022-01-16 by the reprex package (v2.0.1)

Answered 2022-01-16T15:46:26.537
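
A row-by-row pass is not strictly required. As far as I know, distGeo() also accepts two-column matrices of (longitude, latitude) and returns one distance per pair of rows, so the whole column can be filled in one call. The sketch below relies on that behaviour and uses a hypothetical stand-in `boats` built from the first four rows of the question's data. It also suggests why the call in the question produced NA: c("lon1", "lat1") is a character vector of column names rather than the coordinates themselves, which fits the "NAs introduced by coercion" warnings. The last step, taking the nearest fishing boat per IDdatecode, is only one possible reading of "one-to-many".

```r
library(geosphere)

# Hypothetical stand-in: first four rows of the merged table from the question
boats <- data.frame(
  IDdatecode = c("d201805081203", "d201805081204",
                 "d201805081205", "d201805081205"),
  lon1 = c(12.40203333, 12.4071, 12.41165, 12.41165),
  lat1 = c(45.1067, 45.10921667, 45.11218333, 45.11218333),
  lon2 = c(13.02718, 13.02585827, 13.02453654, 13.02173),
  lat2 = c(44.98946, 44.99031749, 44.99117498, 44.98792)
)

# Build numeric (lon, lat) matrices from the columns themselves,
# not from their names
p_mine  <- cbind(boats$lon1, boats$lat1)
p_other <- cbind(boats$lon2, boats$lat2)

# One distance (in metres) per row
boats$distance <- distGeo(p_mine, p_other)

# Assumed interpretation: shortest distance to any fishing boat per timestamp
nearest <- tapply(boats$distance, boats$IDdatecode, min)
nearest
```

With the full merged table, the same two lines that build the matrices and call distGeo() would replace the loop.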