Here is an example of what I am trying to do on a different dataset, but it still does not work correctly.

   PORT      STATUS     VESSEL          DWT   IMP/EXP  QTY (Mts)
1  KANDLA    SAILED     CAPTAIN HAMADA  7938  EXP      4500
2  KAKINADA  EXPECTED   CELON BREEZE          IMP      30000
3  KAKINADA  BERTH      CELON BREEZE          IMP      3000
4  KAKINADA  SAILED     CELON BREEZE          IMP      30000
5  KANDLA    ANCHORAGE  CAPTAIN HAMADA        EXP      4500
6  KAKINADA  BERTH      CELON BREEZE          IMP      30000

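I want to compare the (PORT, VESSEL, IMP/EXP) values of one row with those of another row and, if they match, delete one of the two rows. If IMP/EXP in the row is "IMP", the row to delete is chosen by this priority order of STATUS: sailed > berth > anchorage > expected. A row with STATUS = sailed has the highest priority, so the second row is deleted because it matches the fourth row on quantity, port and vessel. The same logic applies to the other statuses:

  1) if one row has STATUS = sailed and the other has berth, the berth row is deleted
  2) if one row has STATUS = sailed and the other has expected, the expected row is deleted
  3) if one row has STATUS = berth and the other has anchorage, the anchorage row is deleted
  4) if one row has STATUS = expected and the other has STATUS = sailed, the expected row is deleted

and so on. Rows must match on QTY, PORT and VESSEL before a row is deleted according to these rules.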

For rows where IMP/EXP is "EXP", the rows must match on the same columns (quantity, port and vessel), but STATUS uses a different priority order:

     priority: sailed > anchorage > expected > berth

The output should be:

   PORT      STATUS     VESSEL          DWT   IMP/EXP  QTY (Mts)
1  KANDLA    SAILED     CAPTAIN HAMADA  7938  EXP      4500
3  KAKINADA  BERTH      CELON BREEZE          IMP      3000
4  KAKINADA  SAILED     CELON BREEZE          IMP      30000

The 2nd, 5th and 6th rows are deleted; this is the output I want.

1 Answer

First, you need to read the data into R as a data.frame. The data.frame test should look like this:

>test

#      PORT    STATUS         VESSEL  DWT IMPEXP   QTY
#1   KANDLA    SAILED CAPTAIN HAMADA 7938    EXP  4500
#2 KAKINADA  EXPECTED   CELON BREEZE   NA    IMP 30000
#3 KAKINADA     BERTH   CELON BREEZE   NA    IMP  3000
#4 KAKINADA    SAILED   CELON BREEZE   NA    IMP 30000
#5   KANDLA ANCHORAGE CAPTAIN HAMADA   NA    EXP  4500
#6 KAKINADA     BERTH   CELON BREEZE   NA    IMP 30000
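The answer does not show how test was created. One possible way to rebuild it by hand is sketched below. It assumes the column names IMPEXP and QTY from the printout above (standing in for IMP/EXP and QTY (Mts) in the question) and uses NA wherever the question leaves DWT blank.

```r
# Rebuild the six example rows; DWT is NA where the question leaves it blank.
test <- data.frame(
  PORT   = c("KANDLA", "KAKINADA", "KAKINADA", "KAKINADA", "KANDLA", "KAKINADA"),
  STATUS = c("SAILED", "EXPECTED", "BERTH", "SAILED", "ANCHORAGE", "BERTH"),
  VESSEL = c("CAPTAIN HAMADA", "CELON BREEZE", "CELON BREEZE",
             "CELON BREEZE", "CAPTAIN HAMADA", "CELON BREEZE"),
  DWT    = c(7938, NA, NA, NA, NA, NA),
  IMPEXP = c("EXP", "IMP", "IMP", "IMP", "EXP", "IMP"),
  QTY    = c(4500, 30000, 3000, 30000, 4500, 30000),
  stringsAsFactors = FALSE  # keep STATUS and IMPEXP as plain character columns
)

# Inspect the column types before grouping.
str(test)
```

Keeping the text columns as character (rather than factor) is a choice made here for clarity; the grouping step converts STATUS to an ordered factor itself.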

Using the ddply function from the plyr package, you should be able to get the desired output with the help of the following function.

library(plyr)

# For each PORT/VESSEL/IMPEXP/QTY group, keep only the row whose STATUS ranks highest.
# Levels are listed from lowest to highest priority.
ddply(test, .variables = c("PORT", "VESSEL", "IMPEXP", "QTY"),
  function(grp) {
    if (grp$IMPEXP[1] == "IMP") {
      lv <- c("EXPECTED", "ANCHORAGE", "BERTH", "SAILED")
    } else {
      lv <- c("BERTH", "EXPECTED", "ANCHORAGE", "SAILED")
    }
    grp$STATUS <- factor(grp$STATUS, levels = lv, ordered = TRUE)
    grp[which.max(as.integer(grp$STATUS)), ]
  }
)

#      PORT STATUS         VESSEL  DWT IMPEXP   QTY
#1 KAKINADA  BERTH   CELON BREEZE   NA    IMP  3000
#2 KAKINADA SAILED   CELON BREEZE   NA    IMP 30000
#3   KANDLA SAILED CAPTAIN HAMADA 7938    EXP  4500
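If plyr is not available, the same selection can probably be done in base R. The sketch below is an alternative to the ddply call, not part of the original answer: it ranks each STATUS according to the priority order for its IMPEXP value, sorts so that the highest-ranked row of each group comes first, and drops the later duplicates. The helper name keep_top_status and the vectors imp_order and exp_order are hypothetical names introduced for illustration.

```r
# Example data, assuming the same column names as the printout of test.
test <- data.frame(
  PORT   = c("KANDLA", "KAKINADA", "KAKINADA", "KAKINADA", "KANDLA", "KAKINADA"),
  STATUS = c("SAILED", "EXPECTED", "BERTH", "SAILED", "ANCHORAGE", "BERTH"),
  VESSEL = c("CAPTAIN HAMADA", "CELON BREEZE", "CELON BREEZE",
             "CELON BREEZE", "CAPTAIN HAMADA", "CELON BREEZE"),
  DWT    = c(7938, NA, NA, NA, NA, NA),
  IMPEXP = c("EXP", "IMP", "IMP", "IMP", "EXP", "IMP"),
  QTY    = c(4500, 30000, 3000, 30000, 4500, 30000),
  stringsAsFactors = FALSE
)

# Priority orders from the question, listed from lowest to highest.
imp_order <- c("EXPECTED", "ANCHORAGE", "BERTH", "SAILED")
exp_order <- c("BERTH", "EXPECTED", "ANCHORAGE", "SAILED")

# Hypothetical helper: keep the highest-priority row per PORT/VESSEL/IMPEXP/QTY group.
keep_top_status <- function(df) {
  status <- as.character(df$STATUS)
  is_imp <- as.character(df$IMPEXP) == "IMP"
  # Position of each status in the ordering that applies to its row.
  rank <- ifelse(is_imp, match(status, imp_order), match(status, exp_order))
  # Sort by group key, then by descending rank, so the best row leads each group.
  ord <- order(df$PORT, df$VESSEL, df$IMPEXP, df$QTY, -rank)
  sorted <- df[ord, ]
  # duplicated() flags every row after the first within a group key.
  sorted[!duplicated(sorted[c("PORT", "VESSEL", "IMPEXP", "QTY")]), ]
}

result <- keep_top_status(test)
print(result)
```

On the example data this keeps original rows 3, 4 and 1, which are the rows the question asks for. Unlike ddply, the original row names are preserved, which makes it easy to see which rows were dropped.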
answered 2017-07-21T09:16:29.200