我有一个可以为我提供不同巴士路线 ( ) 的data.table
位置 (origin
和) 之间的连接。destination
route_id
library(data.table)
library(magrittr)
# data for reproducible example
dt <- data.table( origin = c('A','B','C', 'F', 'G', 'H'),
destination = c('B','C','D', 'G', 'H', 'I'),
freq = c(2,2,2,10,10,10),
route_id = c(1,1,1,2,2,2), stringsAsFactors=FALSE )
# > dt
# origin destination freq route_id
# 1: A B 2 1
# 2: B C 2 1
# 3: C D 2 1
# 4: F G 10 2
# 5: G H 10 2
# 6: H I 10 2
出于我想要做的目的,如果有一个route_id
提供一个连接A-B
和一个连接的连接B-C
,那么我想在数据中添加一个A-C
相同的连接route_id
等等。
问题:到目前为止,我已经创建了一个简单的代码来完成这项工作,但是:
我的缓慢解决方案
# loop
# a) get a data subset corresponding to each route_id
# b) get all combinations of origin-destination pairs
# c) row bind the new pairs to original data
for (i in unique(dt$route_id)) {
temp <- dt[ route_id== i,]
subset_of_pairs <- expand.grid(temp$origin, temp$destination) %>% setDT()
setnames(subset_of_pairs, c("origin", "destination"))
dt <- rbind(dt, subset_of_pairs, fill=T)
}
# assign route_id and freq to new pairs
dt[, route_id := route_id[1L], by=origin]
dt[, freq := freq[1L], by=route_id]
# Keepe only different pairs that are unique
dt[, origin := as.character(origin) ][, destination := as.character(destination) ]
dt <- dt[ origin != destination, ][order(route_id, origin, destination)]
dt <- unique(dt)
期望的输出
origin destination freq route_id
1: A B 2 1
2: A C 2 1
3: A D 2 1
4: B C 2 1
5: B D 2 1
6: C D 2 1
7: F G 10 2
8: F H 10 2
9: F I 10 2
10: G H 10 2
11: G I 10 2
12: H I 10 2