I've received a lot of help here before, but I've just run into another problem and was wondering whether anyone has any insight.

In my previous post, I wrote that I have a dataset (it actually has about 50 rows); let's call it "Times":

> Times <- read.csv("Times.csv", stringsAsFactors = FALSE, header = TRUE)
> Times

Num     Start          End
1    00:09:41    00:25:025
2    00:11:21     00:41:32
3    00:34:39     00:58:01

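Then, to find the overlapping time intervals, it was suggested that I create a band matrix that compares all of the rows with each other.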

Overlap <- outer(Times$Start, Times$End, function(x, y) y > x)
Overlap[upper.tri(Overlap) | col(Overlap) == row(Overlap)] <- NA
Overlap

       [,1]   [,2]   [,3]           
[1,]     NA     NA     NA
[2,]   TRUE     NA     NA
[3,]  FALSE   TRUE     NA

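So at this point I know which rows overlap, but ideally I would like an output similar to my original data frame, excluding the rows that do not overlap with any other row.

Is there a way to omit the rows that contain no TRUE? And is it possible to convert the result back into a data frame?

Thank you for any help you can provide!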


2 Answers


To exclude the rows that do not overlap with any other row:

Times[rowSums(is.na(Overlap)) < ncol(Overlap),]
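
As a self-contained check, here is a minimal sketch that rebuilds the three example rows inline (an assumption, in place of reading Times.csv) and applies the filter. It also shows a variant that tests for at least one TRUE, so that a row holding only FALSE and NA would be dropped as well. That variant and the names `kept_not_all_na` and `kept_any_true` are my own illustration, not part of the answer.

```r
# Example rows copied verbatim from the question; the times are
# character strings and are compared as strings.
Times <- data.frame(
  Num   = 1:3,
  Start = c("00:09:41", "00:11:21", "00:34:39"),
  End   = c("00:25:025", "00:41:32", "00:58:01"),
  stringsAsFactors = FALSE
)

# Overlap[i, j] is TRUE when End[j] > Start[i]; keep only the
# cells strictly below the diagonal.
Overlap <- outer(Times$Start, Times$End, function(x, y) y > x)
Overlap[upper.tri(Overlap) | col(Overlap) == row(Overlap)] <- NA

# Filter from the answer: keep rows that are not entirely NA.
kept_not_all_na <- Times[rowSums(is.na(Overlap)) < ncol(Overlap), ]

# Variant (assumption): keep rows that contain at least one TRUE.
kept_any_true <- Times[rowSums(Overlap, na.rm = TRUE) > 0, ]
```

On this example both filters select the same rows; they would differ only for a row whose lower-triangle cells are all FALSE.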

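Edit

Since you are only interested in the lower part of the overlap matrix, which this step isolates:

 Overlap[upper.tri(Overlap) | col(Overlap) == row(Overlap)] <- NA

you can skip this step and use the lower part of the original overlap matrix to get this simple solution: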

Overlap <- outer(Times$Start, Times$End, function(x, y) y > x)
# keep rows with at least one TRUE strictly below the diagonal
Times[rowSums(Overlap & lower.tri(Overlap)) > 0, ]
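
One caveat worth checking: `lower.tri()` returns a logical mask of the positions below the diagonal, not the values stored there, so on its own it marks every row except the first. A minimal sketch of combining the mask with the comparison matrix, using the example rows rebuilt inline (an assumption, in place of reading Times.csv); `mask`, `hits` and `overlapping` are hypothetical names:

```r
# Example rows copied verbatim from the question.
Times <- data.frame(
  Num   = 1:3,
  Start = c("00:09:41", "00:11:21", "00:34:39"),
  End   = c("00:25:025", "00:41:32", "00:58:01"),
  stringsAsFactors = FALSE
)

# Full comparison matrix, without the NA masking step.
Overlap <- outer(Times$Start, Times$End, function(x, y) y > x)

# Position mask: TRUE strictly below the diagonal, whatever Overlap holds.
mask <- lower.tri(Overlap)

# TRUE only where a lower-triangle cell of Overlap is itself TRUE.
hits <- Overlap & mask

# Rows with at least one such cell, returned as a data frame.
overlapping <- Times[rowSums(hits) > 0, ]
```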
answered 2013-07-17T22:25:03.683

How about...

exc <- apply( Overlap , 1 , function(x) all( is.na(x) ) )

nonoverlap <- Times[ ! exc , ]

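Basically, we look at each row of the Overlap matrix and return TRUE if all of its values are NA. We then use that result to subset the data frame Times, excluding the rows that are all NA in Overlap.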
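
Note that because only the lower triangle of Overlap is kept, row 1 is all NA and is dropped even though it overlaps row 2. If the goal is to keep every interval that overlaps some other interval, one possible sketch compares the rows in both directions. This is my assumption about the goal rather than part of either answer; the fourth row is hypothetical data added so that something is filtered out, and `starts_before_end`, `both` and `overlaps_any` are hypothetical names.

```r
# First three rows copied verbatim from the question; row 4 is a
# hypothetical interval that overlaps nothing.
Times <- data.frame(
  Num   = 1:4,
  Start = c("00:09:41", "00:11:21", "00:34:39", "01:10:00"),
  End   = c("00:25:025", "00:41:32", "00:58:01", "01:20:00"),
  stringsAsFactors = FALSE
)

# starts_before_end[i, j] is TRUE when Start[i] < End[j].
starts_before_end <- outer(Times$Start, Times$End, function(x, y) x < y)

# Intervals i and j overlap when each one starts before the other ends.
both <- starts_before_end & t(starts_before_end)

# Every interval trivially overlaps itself, so ignore the diagonal.
diag(both) <- FALSE

# Keep the rows that overlap at least one other row.
overlaps_any <- rowSums(both) > 0
result <- Times[overlaps_any, ]
```

Here `result` is still a data frame with the original columns, and only the hypothetical fourth row is removed.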

answered 2013-07-17T22:25:16.757