I have the data set as below:
EMP_NAME  MGRNAME  STATUS     MODIFIED_DATE
---------------------------------------------------
Amy       John     ACTIVE     01/15/2012 00:00:00
Amy       Ken      INSERVICE  06/08/2000 00:00:00
Amy       Tom      INACTIVE   04/02/2010 00:00:00
Ron       David    ACTIVE     01/15/2008 00:00:00
Keith     Jack     INACTIVE   08/10/2005 00:00:00
Keith     Cat      INACTIVE   04/30/2008 00:00:00
Keith     Ken      INACTIVE   02/04/2010 00:00:00
Mary      Stephen  INACTIVE   10/18/2010 00:00:00
Now I need to identify the duplicate rows based on the conditions below:
- If an Emp has only one Mgr tagged, we should not consider that Emp at all. Ex: Ron, Mary.
- If an Emp has been tagged to multiple Managers, we need to check whether he is tagged to any of the managers as ACTIVE. If so, we should not consider the ACTIVE record, and should fetch the records whose status is <> ACTIVE for that Emp. Ex: For Amy, we should exclude the record with ACTIVE status and fetch the records with INSERVICE and INACTIVE.
- If an Emp has been tagged to multiple Managers but is in INACTIVE status with all of them, then leave out the max(MODIFIED_DATE) record and fetch the remaining records. Ex: For Keith, as all of the Mgr records have status INACTIVE, fetch the records whose MODIFIED_DATE is 08/10/2005 00:00:00 and 04/30/2008 00:00:00.
The final output should look like this:
EMP_NAME  MGRNAME  STATUS     MODIFIED_DATE
------------------------------------------------------
Amy       Ken      INSERVICE  06/08/2000 00:00:00
Amy       Tom      INACTIVE   04/02/2010 00:00:00
Keith     Jack     INACTIVE   08/10/2005 00:00:00
Keith     Cat      INACTIVE   04/30/2008 00:00:00
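
To make the conditions concrete, here is a minimal Python sketch of the logic I am after. It is not the SQL query itself, only an illustration of the rules. The names `ROWS`, `parse` and `find_duplicates` are made up for this example, and it assumes each row for an Emp is a different Mgr. A case the conditions above do not cover (multiple Managers, none ACTIVE, but not all INACTIVE) simply returns nothing here, which is an assumption on my part.

```python
from collections import defaultdict
from datetime import datetime

# Sample data from the question: (EMP_NAME, MGRNAME, STATUS, MODIFIED_DATE)
ROWS = [
    ("Amy", "John", "ACTIVE", "01/15/2012 00:00:00"),
    ("Amy", "Ken", "INSERVICE", "06/08/2000 00:00:00"),
    ("Amy", "Tom", "INACTIVE", "04/02/2010 00:00:00"),
    ("Ron", "David", "ACTIVE", "01/15/2008 00:00:00"),
    ("Keith", "Jack", "INACTIVE", "08/10/2005 00:00:00"),
    ("Keith", "Cat", "INACTIVE", "04/30/2008 00:00:00"),
    ("Keith", "Ken", "INACTIVE", "02/04/2010 00:00:00"),
    ("Mary", "Stephen", "INACTIVE", "10/18/2010 00:00:00"),
]


def parse(ts):
    """Parse a MM/DD/YYYY HH:MM:SS string so dates compare chronologically."""
    return datetime.strptime(ts, "%m/%d/%Y %H:%M:%S")


def find_duplicates(rows):
    """Return the rows to fetch according to the three conditions."""
    # Group the rows by employee, keeping the input order.
    by_emp = defaultdict(list)
    for row in rows:
        by_emp[row[0]].append(row)

    result = []
    for group in by_emp.values():
        # Condition 1: only one Mgr tagged, so skip this Emp.
        if len(group) < 2:
            continue
        if any(r[2] == "ACTIVE" for r in group):
            # Condition 2: an ACTIVE record exists, fetch the non-ACTIVE ones.
            result.extend(r for r in group if r[2] != "ACTIVE")
        elif all(r[2] == "INACTIVE" for r in group):
            # Condition 3: all INACTIVE, leave out the latest record.
            # If two records tie on the max date, only the first is left out.
            latest = max(group, key=lambda r: parse(r[3]))
            result.extend(r for r in group if r is not latest)
        # Any other combination is not covered by the conditions (assumption).
    return result


for row in find_duplicates(ROWS):
    print(*row)
```

Run against the sample data, this returns the two Amy rows (Ken, Tom) and the two Keith rows (Jack, Cat), matching the expected output above. In SQL I would expect the same idea to need a per-Emp count, a per-Emp check for an ACTIVE row, and the per-Emp max(MODIFIED_DATE).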