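I have a vector of terms, each of which may be followed by zero or more qualifiers that start with "/". The first element should always be a term.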
mesh <- c("Animals", "/physiology" , "/metabolism*",
"Insects", "Arabidopsis", "/immunology" )
I want to join each qualifier to the term that precedes it and get a new vector:
Animals/physiology
Animals/metabolism*
Insects
Arabidopsis/immunology
Make a group identifier by using grepl to find the values that do not start with a "/", split on this group identifier, then paste0:
unlist(by(mesh, cumsum(grepl("^[^/]",mesh)), FUN=function(x) paste0(x[1], x[-1])))
# 11 12 2 3
# "Animals/physiology" "Animals/metabolism*" "Insects" "Arabidopsis/immunology"
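To make the grouping step easier to follow, here is a minimal sketch that spells out the intermediate values. The names is_term, grp, parts and joined are only illustrative, and split with lapply is used here in place of by; the idea is the same.

```r
mesh <- c("Animals", "/physiology", "/metabolism*",
          "Insects", "Arabidopsis", "/immunology")

# TRUE for terms, i.e. elements that do not start with "/"
is_term <- grepl("^[^/]", mesh)

# running count of terms seen so far, so every qualifier
# ends up in the same group as the term before it
grp <- cumsum(is_term)

# split into one character vector per group
parts <- split(mesh, grp)

# paste the term (first element) onto each of its qualifiers;
# a term with no qualifiers is returned unchanged
joined <- unlist(lapply(parts, function(x) paste0(x[1], x[-1])),
                 use.names = FALSE)
joined
```

Because paste0 treats a zero-length argument as an empty string, a term with no qualifiers (such as "Insects") comes back as it is.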
Another option is tapply:
unlist(tapply(mesh, cumsum(grepl("^[^/]", mesh)),
FUN = function(x) paste0(x[1], x[-1])), use.names=FALSE)
#[1] "Animals/physiology" "Animals/metabolism*" "Insects" "Arabidopsis/immunology"
This is the most elegant approach I could come up with:
mesh <- c("Animals", "/physiology" , "/metabolism*",
"Insects", "Arabidopsis", "/immunology" )
#gets "prefixes", assuming they all start with a letter:
pre <- grep(pattern = "^[[:alpha:]]", x = mesh)
#gives integer IDs for the prefix-suffix groupings
id <- rep(seq_along(pre), times = diff(c(pre, length(mesh) + 1)))
#function that pastes the first term in vector to any remaining ones
#will just return first term if there are no others
combine <- function(x) paste0(x[1], x[-1])
#groups mesh by id, then applies combine to each group
results <- tapply(mesh, INDEX = id, FUN = combine)
unlist(results)