
I have two data frames that look like this:

view
 id object date maxdate
  1      a    8       9
  1      b    8       9
  2      a    8       9
  3      b    7       8

purchase
 id date object purchased
  1    9      a         1
  2    8      a         1
  3    8      b         1

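One table records when a product was viewed; the other records whether and when that product was purchased. A purchase can happen within 24 hours of the view. I want to merge them on the columns id, date and object, or on id, maxdate = date and object. What is the best way to implement that OR condition in a full_join (dplyr)? The output I am looking for and the code for the data frames are below.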

 id object date maxdate   purchased
  1      a    8       9   1
  1      b    8       9   NA
  2      a    8       9   1
  3      b    7       8   1

id=c(1,1,2,3)
object=c("a","b","a","b")
date=c(8,8,8,7)
maxdate=c(9,9,9,8)
view=data.frame(id,object,date,maxdate)
id=c(1,2,3)
date=c(9,8,8)
object=c("a","a","b")
purchased=c(1,1,1)
purchase=data.frame(id,date,object,purchased)

So far I have tried something like this, but on a large dataset it is very inefficient and the result is confusing to clean up:

a=merge(view,purchase, by="id")
a$ind=ifelse(a$object.x==a$object.y & (a$date.x==a$date.y | a$maxdate==a$date.y),1,NA)

1 Answer


Are you trying to do something like this?

a=merge(view[,-4],purchase, by=c("id", "object"))
names(a) = c("id", "object", "date.viewed", "date.purchased", "purchased")

> a
  id object date.viewed date.purchased purchased
1  1      a           8              9         1
2  2      a           8              8         1
3  3      b           7              8         1
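
Note that the merge above is an inner join on id and object only, so the (1, b) view is dropped and the dates are not compared. If the NA row from the desired output is needed, one possible sketch (not necessarily the best approach) is to left-join purchase onto view twice, once on date and once on maxdate, and keep whichever match exists. This assumes dplyr is installed; the suffixes `_same_day` and `_next_day` and the name `result` are just illustrative names, not anything from the question.

```r
library(dplyr)

# Data frames from the question
view <- data.frame(
  id      = c(1, 1, 2, 3),
  object  = c("a", "b", "a", "b"),
  date    = c(8, 8, 8, 7),
  maxdate = c(9, 9, 9, 8)
)
purchase <- data.frame(
  id        = c(1, 2, 3),
  date      = c(9, 8, 8),
  object    = c("a", "a", "b"),
  purchased = c(1, 1, 1)
)

result <- view %>%
  # First pass: purchase made on the same date as the view
  left_join(purchase, by = c("id", "object", "date")) %>%
  # Second pass: purchase made on maxdate (purchase$date == view$maxdate)
  left_join(purchase, by = c("id", "object", "maxdate" = "date"),
            suffix = c("_same_day", "_next_day")) %>%
  # Keep the first non-missing match; stays NA when neither pass matched
  mutate(purchased = coalesce(purchased_same_day, purchased_next_day)) %>%
  select(id, object, date, maxdate, purchased)

print(result)
```

Each join is an equality join, so this avoids building the full id-by-id cross product that `merge(view, purchase, by="id")` creates. It assumes a view matches at most one purchase per pass; with duplicates the joins would multiply rows.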
Answered 2020-07-28T16:53:04.730