I have a long list of strings, such as this reproducible example:
A <- list(c("Biology","Cell Biology","Art","Humanities, Multidisciplinary; Psychology, Experimental","Astronomy & Astrophysics; Physics, Particles & Fields","Economics; Mathematics, Interdisciplinary Applications; Social Sciences, Mathematical Methods","Geriatrics & Gerontology","Gerontology","Management","Operations Research & Management Science","Computer Science, Artificial Intelligence; Computer Science, Information Systems; Engineering, Electrical & Electronic","Economics; Mathematics, Interdisciplinary Applications; Social Sciences, Mathematical Methods; Statistics & Probability"))
So it looks like this:
> A
[[1]]
[1] "Biology"
[2] "Cell Biology"
[3] "Art"
[4] "Humanities, Multidisciplinary; Psychology, Experimental"
[5] "Astronomy & Astrophysics; Physics, Particles & Fields"
[6] "Economics; Mathematics, Interdisciplinary Applications; Social Sciences, Mathematical Methods"
[7] "Geriatrics & Gerontology"
[8] "Gerontology"
[9] "Management"
[10] "Operations Research & Management Science"
[11] "Computer Science, Artificial Intelligence; Computer Science, Information Systems; Engineering, Electrical & Electronic"
[12] "Economics; Mathematics, Interdisciplinary Applications; Social Sciences, Mathematical Methods; Statistics & Probability"
I want to recode these terms into broader fields and remove the duplicates within each string, to get this result:
[1] "Science"
[2] "Science"
[3] "Arts & Humanities"
[4] "Arts & Humanities; Social Sciences"
[5] "Science"
[6] "Social Sciences; Science"
[7] "Science"
[8] "Social Sciences"
[9] "Social Sciences"
[10] "Science"
[11] "Science"
[12] "Social Sciences; Science"
So far, I have only come up with this:
stringedit <- function(A) {
  A <- gsub("Biology", "Science", A)
  A <- gsub("Cell Biology", "Science", A)
  A <- gsub("Art", "Arts & Humanities", A)
  A <- gsub("Humanities, Multidisciplinary", "Arts & Humanities", A)
  A <- gsub("Psychology, Experimental", "Social Sciences", A)
  A <- gsub("Astronomy & Astrophysics", "Science", A)
  A <- gsub("Physics, Particles & Fields", "Science", A)
  A <- gsub("Economics", "Social Sciences", A)
  A <- gsub("Mathematics", "Science", A)
  A <- gsub("Mathematics, Applied", "Science", A)
  A <- gsub("Mathematics, Interdisciplinary Applications", "Science", A)
  A <- gsub("Social Sciences, Mathematical Methods", "Social Sciences", A)
  A <- gsub("Geriatrics & Gerontology", "Science", A)
  A <- gsub("Gerontology", "Social Sciences", A)
  A <- gsub("Management", "Social Sciences", A)
  A <- gsub("Operations Research & Management Science", "Science", A)
  A <- gsub("Computer Science, Artificial Intelligence", "Science", A)
  A <- gsub("Computer Science, Information Systems", "Science", A)
  A <- gsub("Engineering, Electrical & Electronic", "Science", A)
  A <- gsub("Statistics & Probability", "Science", A)
}
B <- lapply(A, stringedit)
But it does not work properly:
> B
[[1]]
[1] "Science"
[2] "Cell Science"
[3] "Arts & Humanities"
[4] "Arts & Humanities; Social Sciences"
[5] "Science; Science"
[6] "Social Sciences; Science, Interdisciplinary Applications; Social Sciences"
[7] "Science"
[8] "Social Sciences"
[9] "Social Sciences"
[10] "Operations Research & Social Sciences Science"
[11] "Computer Science, Arts & Humanitiesificial Intelligence; Science; Science"
[12] "Social Sciences; Science, Interdisciplinary Applications; Social Sciences; Science"
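The garbled entries seem to come from gsub() replacing any matching substring rather than a whole term, so a short pattern rewrites part of a longer term before the longer pattern is ever tried. A minimal check of that assumption, using three of the patterns above on single strings (the variable names r1, r2, r3 are only for illustration):

```r
# gsub() substitutes every substring that matches the pattern,
# so short patterns also hit the inside of longer terms.

# "Biology" matches inside "Cell Biology"; the later "Cell Biology"
# pattern then has nothing left to match.
r1 <- gsub("Biology", "Science", "Cell Biology")

# "Art" matches the first three letters of "Artificial".
r2 <- gsub("Art", "Arts & Humanities",
           "Computer Science, Artificial Intelligence")

# "Mathematics" matches the start of the longer category name.
r3 <- gsub("Mathematics", "Science",
           "Mathematics, Interdisciplinary Applications")

print(r1)  # "Cell Science"
print(r2)  # "Computer Science, Arts & Humanitiesificial Intelligence"
print(r3)  # "Science, Interdisciplinary Applications"
```

Each replacement is individually correct; the problem is that the patterns are not restricted to complete terms between the "; " separators.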
How can I get the correct output shown above?
Thank you very much for your consideration!
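For reference, here is a rough sketch of one direction that might work: split each string on "; ", look up every complete term in a named vector, and collapse the unique results. The names field_map and recode_fields are hypothetical, the mapping is copied from the gsub() calls above, and the sketch assumes terms are always separated by exactly "; ". Unmapped terms are kept unchanged, which is also an assumption.

```r
# Hypothetical lookup table: full category name -> broad field.
# The pairs are taken from the gsub() calls in stringedit().
field_map <- c(
  "Biology" = "Science",
  "Cell Biology" = "Science",
  "Art" = "Arts & Humanities",
  "Humanities, Multidisciplinary" = "Arts & Humanities",
  "Psychology, Experimental" = "Social Sciences",
  "Astronomy & Astrophysics" = "Science",
  "Physics, Particles & Fields" = "Science",
  "Economics" = "Social Sciences",
  "Mathematics" = "Science",
  "Mathematics, Applied" = "Science",
  "Mathematics, Interdisciplinary Applications" = "Science",
  "Social Sciences, Mathematical Methods" = "Social Sciences",
  "Geriatrics & Gerontology" = "Science",
  "Gerontology" = "Social Sciences",
  "Management" = "Social Sciences",
  "Operations Research & Management Science" = "Science",
  "Computer Science, Artificial Intelligence" = "Science",
  "Computer Science, Information Systems" = "Science",
  "Engineering, Electrical & Electronic" = "Science",
  "Statistics & Probability" = "Science"
)

# Recode each string term by term instead of by substring.
recode_fields <- function(x, map = field_map) {
  vapply(strsplit(x, "; ", fixed = TRUE), function(terms) {
    fields <- unname(map[terms])      # exact-name lookup, NA if unmapped
    missing <- is.na(fields)
    fields[missing] <- terms[missing] # keep unmapped terms as they are
    paste(unique(fields), collapse = "; ")  # drop duplicates, keep order
  }, character(1))
}

# A few entries from the example above.
x <- c(
  "Cell Biology",
  "Humanities, Multidisciplinary; Psychology, Experimental",
  "Geriatrics & Gerontology",
  "Computer Science, Artificial Intelligence; Computer Science, Information Systems; Engineering, Electrical & Electronic",
  "Economics; Mathematics, Interdisciplinary Applications; Social Sciences, Mathematical Methods; Statistics & Probability"
)
res <- recode_fields(x)
print(res)
```

With the list from the top of the post this would be called as B <- lapply(A, recode_fields). Because unique() keeps the first occurrence of each field, the order within an entry follows the order of the original terms.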