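I'm new to R programming. I have a set of character values stored in the variables x1 to x25. For example, x1 holds the values "v21", "v345", "v212", and x2 to x25 hold similarly varied character values, such as "v45", "v67", "v556", "v21", "v44". The vectors (x1 to x25) all have different lengths. They are the results of an analysis. I want to write a function that compares the character values in x1 to x25 and outputs the values that occur five or more times across x1 to x25. So, for example, I would like to see a result like this: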

"v21", "v67", "v556", "v45", "v44", "v212"

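if these are the values that occur across x1 to x25. I have been checking by eye and writing down the results, but that takes far too much of my time.

If this is possible (and I know it is), could someone help me, so that I can also learn from it?

Thanks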

2 Answers

3

First, an example setup:

x1 <- c("v21", "v67", "v556", "v45", "v44", "v212")
x2 <- c("v21", "v67", "v556", "v45", "v44", "v212")
x3 <- c("v21", "v67", "v556", "v45", "v44", "v212")
x4 <- c("v21", "v67", "v556", "v45", "v44", "v212")
x5 <- c("v22", "v61", "v56", "v3", "v4", "v20")
x6 <- c("v22", "v61", "v56", "v3", "v4", "v20")
x7 <- c("v22", "v61", "v56", "v3", "v4", "v20")
x8 <- c("v22", "v61", "v56", "v3", "v4", "v20")
x9 <- c("v22", "v61", "v56", "v3", "v4", "v20")
x10 <- c("v556")
x11 <- c("v12","v345","v55")
x12 <- c("v12","v345","v55")
x13 <- c("v12","v345","v55")
x14 <- c("v12","v345","v55")
x15 <- c("v1", "v51", "v43", "v43")
x16 <- c("v1", "v51", "v43", "v43")
x17 <- c("v1", "v51", "v43", "v43")
x18 <- c("v1", "v51", "v43", "v43")
x19 <- c("v200")
x20 <- c("v200")
x21 <- c("v200")
x22 <- c("v39","v556","v41")
x23 <- c("v39","v556","v41")
x24 <- c("v39","v556","v41")
x25 <- c("v39","v556","v41")

Storing 25 variables separately can make them hard to work with all at once. To gather them into a single object:

vars <- paste0("x",1:25)
corpus <- mget(vars)
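As a quick illustration of what `mget` returns, here is a minimal sketch on made-up data. The names `a1`, `a2`, and `grp` are hypothetical and only used for this demonstration; they are not part of the asker's data.

```r
# Two toy vectors of different lengths (hypothetical stand-ins for x1..x25)
a1 <- c("p", "q")
a2 <- "r"

# mget() looks the variables up by name and returns them as a named list
grp <- mget(paste0("a", 1:2))

names(grp)    # [1] "a1" "a2"
lengths(grp)  # number of elements in each vector: a1 = 2, a2 = 1
```

Because the list keeps the original variable names, it stays easy to trace a value back to the vector it came from.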

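corpus is then a list containing all of the data. To find what you want (every "v###" that occurs at least 5 times), build a table of counts and apply a boolean test to each element. Extracting the names of the values that pass the test gives you the "v###" strings.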

valTable <- table(unlist(corpus))
keepers <- names(valTable[valTable >= 5])
keepers
# [1] "v20"  "v22"  "v3"   "v4"   "v43"  "v556" "v56"  "v61" 
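Since the question asks for a function, the same table-based approach could be wrapped up roughly as follows. This is only a sketch: the function name `frequent_values`, its arguments, and the `demo` data are assumptions made for illustration, not something from the original post.

```r
# Hypothetical helper: return the values that occur at least `min_count`
# times across all vectors in `vec_list`. Repeats inside a single vector
# are all counted, exactly as in the table() approach.
frequent_values <- function(vec_list, min_count = 5) {
  counts <- table(unlist(vec_list, use.names = FALSE))
  names(counts[counts >= min_count])
}

# Small made-up data set, not the asker's real results
demo <- list(a = c("v1", "v2"),
             b = c("v1", "v3"),
             c = c("v1", "v2", "v2"))

frequent_values(demo, min_count = 3)
# [1] "v1" "v2"
```

With the example setup from this answer, `frequent_values(corpus)` should give the same result as `keepers`.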
answered 2013-07-15T19:48:41.127
1

Here is an answer that assumes your x vectors are in a list. If they are not, make one first:

# Runnable equivalent of list(x1, x2, ..., x25); the list also keeps the names
my.vars <- mget(paste0("x", 1:25))

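# All distinct values that appear anywhere in the list
corpus <- unique(unlist(my.vars))
# For each distinct value k, count how many of the vectors contain it.
# Note: a value repeated inside one vector is counted only once for that vector.
occurences <- sapply(X=corpus,
                     FUN=function (k) {
                       occurences <- sapply(my.vars, function (l) k %in% l)
                       sum(occurences)
                     })
names(occurences) <- corpus

# Keep the values found in at least 5 of the vectors
i.want <- occurences[occurences >= 5]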
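One thing worth being aware of: `%in%` checks whether a value is present in a vector, so this approach counts the number of vectors that contain each value, whereas `table(unlist(...))` counts every occurrence. The two can disagree when a value is repeated within one vector. A small sketch on made-up data (the name `demo` is hypothetical):

```r
# Toy list in which "v43" is repeated inside the first vector
demo <- list(c("v43", "v43"), c("v43", "v1"))

# Total number of occurrences across all vectors
total_count <- sum(unlist(demo) == "v43")
total_count   # [1] 3

# Number of vectors that contain the value at least once
vector_count <- sum(sapply(demo, function(v) "v43" %in% v))
vector_count  # [1] 2
```

Which of the two counts is the right one depends on whether repeats within a single result vector should count more than once.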
answered 2013-07-15T19:19:11.823