r - How to create a logical vector that indicates whether the values in two columns are the same across categorical factors in R?

Question

I'm trying my best to articulate this, so here goes.

I have a table of gene information. However, I am going to be using a generic example for the sake of this question.

> library(data.table)
> test_dt <- data.table(c("b", "a", "a", "b"), c(1, 4, 1, 5), c(4, 6, 4, 8))
> colnames(test_dt) <- c("category", "start", "end")
> test_dt
   category start end
1:        b     1   4
2:        a     4   6
3:        a     1   4
4:        b     5   8

I want to append an additional column to this table that indicates whether start and end are the same across different category values (in my case as well as in this example, I am only dealing with two categories):

   category start end in_both
1:        b     1   4    TRUE
2:        a     4   6   FALSE
3:        a     1   4    TRUE
4:        b     5   8   FALSE

I know this seems painfully basic but there are holes in my R knowledge that periodically need to be filled and paved over. How would I go about doing this?

score 2 · Accepted Answer

One option could be:

test_dt[, in_both := uniqueN(category) == 2, by = c("start", "end")]

   category start end in_both
1:        b     1   4    TRUE
2:        a     4   6   FALSE
3:        a     1   4    TRUE
4:        b     5   8   FALSE
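To see why this works, it can help to look at the per-group count before comparing it to 2. The following is a minimal sketch, assuming data.table is installed; `n_cat` is a helper column introduced here for illustration and is not part of the answer above.

```r
library(data.table)

# Rebuild the example table from the question.
test_dt <- data.table(
  category = c("b", "a", "a", "b"),
  start    = c(1, 4, 1, 5),
  end      = c(4, 6, 4, 8)
)

# Count the distinct categories within each (start, end) group.
# n_cat is a hypothetical helper column used only for inspection.
test_dt[, n_cat := uniqueN(category), by = .(start, end)]

# A (start, end) pair is shared when both categories appear in its group.
test_dt[, in_both := n_cat == 2]

print(test_dt)
```

Note that comparing against 2 relies on the table containing exactly two categories, as stated in the question.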

score 1 · Answer

An option using all and %in%:

test_dt[, in_both := all(c('b', 'a') %in% category), .(start, end)]
test_dt
#   category start end in_both
#1:        b     1   4    TRUE
#2:        a     4   6   FALSE
#3:        a     1   4    TRUE
#4:        b     5   8   FALSE
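If the category labels are not known in advance, the same idea might be written without hard-coding 'a' and 'b'. This is only a sketch under that assumption; `all_cats` is a hypothetical variable name, not something from the original answer.

```r
library(data.table)

# Rebuild the example table from the question.
test_dt <- data.table(
  category = c("b", "a", "a", "b"),
  start    = c(1, 4, 1, 5),
  end      = c(4, 6, 4, 8)
)

# all_cats is a hypothetical variable holding every category seen in the table.
all_cats <- unique(test_dt$category)

# TRUE when every category occurs within the (start, end) group.
test_dt[, in_both := all(all_cats %in% category), by = .(start, end)]

print(test_dt)
```

With this form the check should keep working if more categories are added, at the cost of requiring a pair to appear in all of them.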
