The dataset looks like this:

"1" 10 40 "r" "q" "0" "r" "r" "0" "r" "0" "0" "0" "0" "0" "t" "q" "0" "0" "s" "0" "r" 0 "0" 0 "0" "0" 0 0 0 "0"
"2" 10 173 "s" "s" "s" "0" "0" "s" "s" "0" "t" "t" "s" "t" "t" "r" "s" "0" "q" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"3" 10 2107 "t" "0" "0" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"4" 10 993 "s" "0" "q" "s" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"5" 10 1712 "t" "0" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "s" "0" "t" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"6" 776 1872 "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" 0 "r" 0 "0" "0" 0 0 0 "s"

The output should be:

"1" 10 40 "r" "q" "0" "r" "r" "0" "r" "0" "0" "0" "0" "0" "t" "q" "0" "0" "s" "0" "r" 0 "0" 0 "0" "0" 0 0 0 "0"
"2" 10 173 "s" "s" "s" "0" "0" "s" "s" "0" "t" "t" "s" "t" "t" "r" "s" "0" "q" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"4" 10 993 "s" "0" "q" "s" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"5" 10 1712 "t" "0" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "s" "0" "t" "0" 0 "0" 0 "0" "0" 0 0 0 "0"

The code I tried is:

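x <- read.table("sample.txt")
for (i in seq_len(nrow(x))) {
    # count the non-zero entries in columns 3 to 30 of row i
    count <- 0
    for (j in 3:30) {
        if (x[i, j] != 0) {
            count <- count + 1
        }
    }
    # mark rows with fewer than four non-zero entries for removal
    if (count < 4) {
        x[i, ] <- NA
    }
}
# drop the rows that were marked with NA
x <- x[complete.cases(x), ]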

Please suggest an approach that does not involve loops.

1 Answer

It looks like none of your rows has fewer than four non-zero entries when every column is counted:

For example, with your table stored in tab, this prints the number of non-zero entries in each row:

apply(tab, 1, function(x)sum(x!="0"))
 [1] 12 16  5  7  7  5
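
A vectorized way to get the same kind of count is rowSums() on a logical comparison, which avoids calling a function once per row. The sketch below uses a small made-up data frame (demo is not the asker's data) and also shows one thing worth checking: apply() converts a data frame to a character matrix first, and that conversion can pad numbers with spaces, so a numeric 0 may stop matching "0".

```r
# Hypothetical toy data, only for illustration (not the asker's table)
demo <- data.frame(n = c(0, 10), s = c("a", "0"))

# apply() works on as.matrix(demo); the numeric column is formatted to a
# common width there, so 0 becomes " 0" and no longer equals "0"
via_apply <- apply(demo, 1, function(r) sum(r != "0"))

# rowSums() on the data frame comparison checks each column in its own type
via_rowsums <- rowSums(demo != "0")

print(unname(via_apply))    # 2 1
print(unname(via_rowsums))  # 1 1
```

If the two counts disagree on real data, the padded numeric columns are a likely cause.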

For example, to drop all rows with five or fewer non-zero entries, you could do

tab[-which(apply(tab, 1, function(x)sum(x!="0"))<=5),]
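
One caveat with the -which(...) pattern: if no row meets the condition, which() returns an empty vector and the negative index then selects zero rows instead of all of them. Indexing with the logical condition for the rows to keep avoids that. A minimal sketch on made-up data (demo and n are illustrative names, not from the question):

```r
# Hypothetical toy table, only for illustration
demo <- data.frame(a = c("r", "0", "s"),
                   b = c("q", "0", "0"),
                   c = c("t", "s", "0"))

n <- rowSums(demo != "0")              # non-zero entries per row: 3 1 1

# No row has fewer than 1 non-zero entry, so which() is empty here
dropped_all <- demo[-which(n < 1), ]   # 0 rows: everything is lost

# Keeping rows by a logical condition behaves as intended
kept <- demo[n >= 1, ]                 # all 3 rows

print(nrow(dropped_all))  # 0
print(nrow(kept))         # 3
```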

However, I'm not sure whether the first column in your data is treated as a column of the data frame.
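
If the first field on each line is only a row label, one option is to read it as row names and count only the columns that the question's loop covers (3 to 30). The following is a sketch under that assumption, with the six sample rows passed through the text argument of read.table() instead of sample.txt; the threshold of four comes from the question's loop, and the names txt, nonzero and result are illustrative.

```r
# Sample rows from the question, inlined so the sketch is self-contained
txt <- c(
  '"1" 10 40 "r" "q" "0" "r" "r" "0" "r" "0" "0" "0" "0" "0" "t" "q" "0" "0" "s" "0" "r" 0 "0" 0 "0" "0" 0 0 0 "0"',
  '"2" 10 173 "s" "s" "s" "0" "0" "s" "s" "0" "t" "t" "s" "t" "t" "r" "s" "0" "q" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"',
  '"3" 10 2107 "t" "0" "0" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"',
  '"4" 10 993 "s" "0" "q" "s" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"',
  '"5" 10 1712 "t" "0" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "s" "0" "t" "0" 0 "0" 0 "0" "0" 0 0 0 "0"',
  '"6" 776 1872 "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" 0 "r" 0 "0" "0" 0 0 0 "s"'
)

# Assumption: the first field is a row label, not data
x <- read.table(text = txt, row.names = 1)

# Count the non-zero entries in columns 3 to 30 of every row at once
nonzero <- rowSums(x[, 3:30] != "0")
print(unname(nonzero))   # 9 13 2 4 4 2

# Keep rows with at least four non-zero entries
result <- x[nonzero >= 4, ]
print(rownames(result))  # "1" "2" "4" "5"
```

Under this reading, rows 3 and 6 are dropped, which matches the expected output in the question.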

Does this help?

answered 2013-10-23T14:30:39.137