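Here is a rough way to do it in base R:
Split each entry of the Word column in dataset2 into single characters, keep only the vowels, match them against Letter in dataset1 to get the corresponding value, and sum those values.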
dataset2$Sum_of_vowel_value <- sapply(strsplit(as.character(dataset2$Word), ""),
  function(x) sum(dataset1$value[match(vowel[match(tolower(x), vowel)],
                                       dataset1$Letter)], na.rm = TRUE))
dataset2
#   Word Sum_of_vowel_value
#1  Wood                 30
#2 Table                  6
#3 Chair                 10
#4  Desk                  5
To understand this better, we can break the function down step by step.
First, we split Word into individual characters:
strsplit(as.character(dataset2$Word), "")
#[[1]]
#[1] "W" "o" "o" "d"
#[[2]]
#[1] "T" "a" "b" "l" "e"
#[[3]]
#[1] "C" "h" "a" "i" "r"
#[[4]]
#[1] "D" "e" "s" "k"
The next step is to keep only the vowels. Consonants have no match in vowel, so they become NA:
sapply(strsplit(as.character(dataset2$Word), ""),
  function(x) vowel[match(tolower(x), vowel)])
#[[1]]
#[1] NA "o" "o" NA
#[[2]]
#[1] NA "a" NA NA "e"
#[[3]]
#[1] NA NA "a" "i" NA
#[[4]]
#[1] NA "e" NA NA
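The inner match() call is what does the filtering here. As a minimal sketch on a single word (the variable names x and idx are only for illustration), match() returns the position of each character in vowel, or NA when the character is not a vowel, and indexing vowel with those positions gives back the vowels with NA placeholders:

```r
vowel <- c("a", "e", "i", "o", "u")

# Characters of one word, as produced by strsplit()
x <- c("W", "o", "o", "d")

# Position of each lower-cased character in vowel; NA for consonants
idx <- match(tolower(x), vowel)
idx

# Indexing with NA yields NA, so consonants stay as NA placeholders
vowel[idx]
```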
Now, for these vowels, we look up the corresponding value in dataset1:
sapply(strsplit(as.character(dataset2$Word), ""),
  function(x) dataset1$value[match(vowel[match(tolower(x), vowel)],
                                   dataset1$Letter)])
#[[1]]
#[1] NA 15 15 NA
#[[2]]
#[1] NA 1 NA NA 5
#[[3]]
#[1] NA NA 1 9 NA
#[[4]]
#[1] NA 5 NA NA
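To see why the NA entries survive the lookup, here is a small sketch for one word (kept and vals are illustrative names, not part of the answer's code). An NA has no match in dataset1$Letter, so match() returns NA for it, and indexing value with NA gives NA again:

```r
dataset1 <- data.frame(Letter = letters, value = 1:26)

# Vowels kept from "Table", with NA where the consonants were
kept <- c(NA, "a", NA, NA, "e")

# Row of each vowel in dataset1; NA stays NA
pos <- match(kept, dataset1$Letter)

# Values for the vowels, NA for everything else
vals <- dataset1$value[pos]
vals
```

These leftover NA entries are the reason the full call uses sum(..., na.rm = TRUE) rather than a plain sum().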
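Finally, we sum all of these values with sum(..., na.rm = TRUE), which is the full call shown at the top, to get the final output: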
#[1] 30 6 10 5
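If the nested match() calls are hard to read, the same steps can be sketched with a named lookup vector. This is only one possible rewrite, not the code above: sum_vowel_values and lookup are hypothetical names introduced here, and dataset2 is rebuilt as a plain character column for brevity. Consonants are dropped with %in% instead of being carried along as NA, so na.rm is not needed.

```r
# Data as in the answer (Word built as character here for brevity)
vowel <- c("a", "e", "i", "o", "u")
dataset1 <- data.frame(Letter = letters, value = 1:26)
dataset2 <- data.frame(Word = c("Wood", "Table", "Chair", "Desk"),
                       stringsAsFactors = FALSE)

# Named vector: names are the letters, elements are their values
lookup <- setNames(dataset1$value, as.character(dataset1$Letter))

# Hypothetical helper: sum the values of the vowels in a single word
sum_vowel_values <- function(word, lookup, vowels) {
  chars <- tolower(strsplit(as.character(word), "")[[1]])
  chars <- chars[chars %in% vowels]  # keep vowels only
  sum(lookup[chars])                 # a word without vowels sums to 0
}

dataset2$Sum_of_vowel_value <- vapply(dataset2$Word, sum_vowel_values,
                                      numeric(1), lookup = lookup,
                                      vowels = vowel, USE.NAMES = FALSE)
dataset2
```

vapply() is used instead of sapply() so that the result is guaranteed to be one number per word.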
data
vowel <- c('a', 'e', 'i', 'o', 'u')
dataset1 <- data.frame(Letter = letters, value = 1:26)
dataset2 <- structure(list(Word = structure(c(4L, 3L, 1L, 2L),
.Label = c("Chair", "Desk", "Table", "Wood"), class = "factor")),
row.names = c(NA, -4L), class = "data.frame")