I have the data set as below:
EMP_NAME  MGRNAME  STATUS     MODIFIED_DATE
-------------------------------------------------
Amy       John     ACTIVE     01/15/2012 00:00:00
Amy       Ken      INSERVICE  06/08/2000 00:00:00
Amy       Tom      INACTIVE   04/02/2010 00:00:00
Ron       David    ACTIVE     01/15/2008 00:00:00
Keith     Jack     INACTIVE   08/10/2005 00:00:00
Keith     Cat      INACTIVE   04/30/2008 00:00:00
Keith     Ken      INACTIVE   02/04/2010 00:00:00
Mary      Stephen  INACTIVE   10/18/2010 00:00:00
Now I need to identify the duplicate rows based on the conditions below:
- If an Emp is tagged to only one Mgr, that record should not be considered. Ex: Ron, Mary
- If an Emp is tagged to multiple Managers and is ACTIVE with any of them, the ACTIVE record should not be considered. Fetch the records for that Emp whose status is <> ACTIVE. Ex: For Amy, exclude the record with ACTIVE status and fetch the records with INSERVICE and INACTIVE status.
- If an Emp is tagged to multiple Managers but is in INACTIVE status with all of them, leave the max(MODIFIED_DATE) record and fetch the remaining records. Ex: For Keith, as all of his Mgr records have INACTIVE status, fetch the records with MODIFIED_DATE 08/10/2005 00:00:00 and 04/30/2008 00:00:00.
The final output should look like below:
EMP_NAME  MGRNAME  STATUS     MODIFIED_DATE
-------------------------------------------------
Amy       Ken      INSERVICE  06/08/2000 00:00:00
Amy       Tom      INACTIVE   04/02/2010 00:00:00
Keith     Jack     INACTIVE   08/10/2005 00:00:00
Keith     Cat      INACTIVE   04/30/2008 00:00:00
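
To make the three conditions concrete, here is a minimal sketch of how they might be written as one query. It is only an illustration under several assumptions: the table name emp_mgr is hypothetical, the query is run through Python's sqlite3 module against an in-memory database rather than the real one, the dates are stored as ISO-8601 text so that MAX() compares them correctly, and an Emp with several Managers and no ACTIVE record is treated like the all-INACTIVE case.

```python
import sqlite3

# Sample rows from the question. Dates are rewritten as ISO-8601 text
# (an assumption) so that MAX() and ORDER BY sort them chronologically.
ROWS = [
    ("Amy", "John", "ACTIVE", "2012-01-15 00:00:00"),
    ("Amy", "Ken", "INSERVICE", "2000-06-08 00:00:00"),
    ("Amy", "Tom", "INACTIVE", "2010-04-02 00:00:00"),
    ("Ron", "David", "ACTIVE", "2008-01-15 00:00:00"),
    ("Keith", "Jack", "INACTIVE", "2005-08-10 00:00:00"),
    ("Keith", "Cat", "INACTIVE", "2008-04-30 00:00:00"),
    ("Keith", "Ken", "INACTIVE", "2010-02-04 00:00:00"),
    ("Mary", "Stephen", "INACTIVE", "2010-10-18 00:00:00"),
]

# emp_mgr is a hypothetical table name. The subquery computes, per Emp,
# the number of Mgr rows, the number of ACTIVE rows and the latest date.
QUERY = """
SELECT e.emp_name, e.mgrname, e.status, e.modified_date
FROM emp_mgr e
JOIN (
    SELECT emp_name,
           COUNT(*) AS mgr_cnt,
           SUM(CASE WHEN status = 'ACTIVE' THEN 1 ELSE 0 END) AS active_cnt,
           MAX(modified_date) AS max_date
    FROM emp_mgr
    GROUP BY emp_name
) s ON s.emp_name = e.emp_name
WHERE s.mgr_cnt > 1                                      -- condition 1
  AND (
        (s.active_cnt > 0 AND e.status <> 'ACTIVE')      -- condition 2
     OR (s.active_cnt = 0 AND e.modified_date <> s.max_date)  -- condition 3
  )
ORDER BY e.emp_name, e.modified_date
"""


def find_duplicates(rows):
    """Load rows into an in-memory table and return the duplicate rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE emp_mgr ("
            "emp_name TEXT, mgrname TEXT, status TEXT, modified_date TEXT)"
        )
        conn.executemany("INSERT INTO emp_mgr VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


duplicates = find_duplicates(ROWS)
for row in duplicates:
    print(*row)
```

On the sample data this returns the two Amy rows (Ken, Tom) and the two older Keith rows (Jack, Cat). The same per-Emp aggregates could likely be expressed with analytic functions (COUNT, SUM and MAX with OVER (PARTITION BY emp_name)) on databases that support them; the join against a grouped subquery is used here only because it runs on older engines as well.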