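I'm trying to get `agrep` to match `seconds` to `second` and not to `millisecond`, but no value of `costs` seems to achieve this.

I'm particularly confused that no value for the cost of deletions/insertions seems to work. As I see it, `seconds` to `second` is one deletion, while `seconds` to `millisecond` is one deletion and 5 insertions.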
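As a sanity check on those edit counts, here is a minimal sketch using base R's `adist` (from `utils`), with all edit costs left at their default of 1. It only confirms the whole-string arithmetic; `adist` compares entire strings by default, which is not necessarily how `agrep` scores a candidate.

```r
# Whole-string edit distances from "seconds" to two of the candidate units.
# counts = TRUE also returns the number of insertions, deletions and
# substitutions used in the optimal transformation.
d <- adist("seconds", c("second", "millisecond"), counts = TRUE)

# Total distance to each candidate: 1 for "second", 6 for "millisecond".
totals <- d[1, ]
print(totals)

# Per-candidate breakdown (one row per candidate, one column per edit type).
breakdown <- attr(d, "counts")[1, , ]
print(breakdown)
```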

(Warning: this `lapply` may take a while... you get the same results with `length.out = 10` and `0:10`, and much faster.)

rng = c(seq(0, 1, length.out = 20), 0:100)
x = expand.grid(insertions = rng, substitutions = rng, deletions = rng)

units = c("millisecond", "second", "minute", "hour", "day",
          "week", "month", "quarter", "year")
x$match = lapply(seq_len(nrow(x)), function(ii)
  agrep('second', units, value = TRUE, costs = x[ii, ]))

x$match_which = sapply(x$match, paste, collapse = '|')

sort(table(x$match_which))
#             millisecond|second|minute|hour|week|month|year 
#                                                         57 
#     millisecond|second|minute|hour|week|month|quarter|year 
#                                                      13276 
#                                   millisecond|second|month 
#                                                      23316 
#                    millisecond|second|minute|month|quarter 
#                                                      37842 
#                          millisecond|second|minute|quarter 
#                                                     251480 
# millisecond|second|minute|hour|day|week|month|quarter|year 
#                                                     409865 
#                                         millisecond|second 
#                                                    1035725 

What am I missing here? Is there a way to accomplish my task (matching `seconds` to `second` without matching `millisecond`) with `agrep`?
