Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
lawrance
Creator II
Creator II

Remove Duplicates

Hi all,

To remove duplicates in following tables.

For Example,

1) We have 10 records, the 5 and 6 records having same data except modified date column.

2) Here, we need to get the max modified date from 5 and 6 and remove the min modified records.

3) Suppose the 5th record is max we need to show 9 records from table except 6th record.

Sample Data:

MTR_IDNameModified Date timestamp
ABC01Lawrance5/27/2017 12:00:00 AM
ABC02Bose5/27/2017 1:00:00 AM
ABC03Prathap5/27/2017 2:00:00 AM
ABC04Vengadesh5/27/2017 3:00:00 AM
ABC05Ranjit5/27/2017 4:00:00 AM
ABC06Sathish5/27/2017 5:00:00 AM
ABC07Sai Prasad5/27/2017 6:00:00 AM
ABC07Sai Prasad5/27/2017 6:30:00 AM
ABC08Muthu5/27/2017 7:00:00 AM
ABC09Irshad5/27/2017 8:00:00 AM
ABC10Naveen5/27/2017 9:00:00 AM

Expected Output:

MTR_IDNameModified Date timestamp
ABC01Lawrance5/27/2017 12:00:00 AM
ABC02Bose5/27/2017 1:00:00 AM
ABC03Prathap5/27/2017 2:00:00 AM
ABC04Vengadesh5/27/2017 3:00:00 AM
ABC05Ranjit5/27/2017 4:00:00 AM
ABC06Sathish5/27/2017 5:00:00 AM
ABC07Sai Prasad5/27/2017 6:30:00 AM
ABC08Muthu5/27/2017 7:00:00 AM
ABC09Irshad5/27/2017 8:00:00 AM
ABC10Naveen5/27/2017 9:00:00 AM

Thanks,

Lawrance A

1 Solution

Accepted Solutions
Anil_Babu_Samineni

You can use, Source table as well like below

Load MTR_ID, Name, MAX([Modified Date timestamp]) AS [Modified Date timestamp] GROUP BY MTR_ID, Name;

LOAD MTR_ID, Name, [Modified Date timestamp] From <Source Path>;

Best Anil, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions". Please LIKE threads if the provided solution is helpful

View solution in original post

6 Replies
Nicole-Smith

Using max() returns the output that you want:

LOAD MTR_ID, Name, MAX([Modified Date timestamp]) AS [Modified Date timestamp]

INLINE [

    MTR_ID, Name, Modified Date timestamp

    ABC01, Lawrance, 5/27/2017 12:00:00 AM

    ABC02, Bose, 5/27/2017 1:00:00 AM

    ABC03, Prathap, 5/27/2017 2:00:00 AM

    ABC04, Vengadesh, 5/27/2017 3:00:00 AM

    ABC05, Ranjit, 5/27/2017 4:00:00 AM

    ABC06, Sathish, 5/27/2017 5:00:00 AM

    ABC07, Sai Prasad, 5/27/2017 6:00:00 AM

    ABC07, Sai Prasad, 5/27/2017 6:30:00 AM

    ABC08, Muthu, 5/27/2017 7:00:00 AM

    ABC09, Irshad, 5/27/2017 8:00:00 AM

    ABC10, Naveen, 5/27/2017 9:00:00 AM

]

GROUP BY MTR_ID, Name;

el_aprendiz111
Specialist
Specialist

Hi,

TMP:
LOAD *,  IF(Previous(MTR_ID & Name)<>(MTR_ID & Name),'',Previous([Modified Date timestamp])) AS PRM
;
LOAD *
INLINE [
MTR_ID, Name, Modified Date timestamp
ABC01, Lawrance, 5/27/2017 12:00:00 AM
ABC02, Bose, 5/27/2017 1:00:00 AM
ABC03, Prathap, 5/27/2017 2:00:00 AM
ABC04, Vengadesh, 5/27/2017 3:00:00 AM
ABC05, Ranjit, 5/27/2017 4:00:00 AM
ABC06, Sathish, 5/27/2017 5:00:00 AM
ABC07, Sai Prasad, 5/27/2017 6:00:00 AM
ABC07, Sai Prasad, 5/27/2017 6:30:00 AM
ABC08, Muthu, 5/27/2017 7:00:00 AM
ABC09, Irshad, 5/27/2017 8:00:00 AM
ABC10, Naveen, 5/27/2017 9:00:00 AM
]
;

[PARAM]:
LOAD Concat(CHR(39) & PRM & CHR(39),',') AS PRMt Resident TMP;
LET v_PRM = Peek('PRMt');

NoConcatenate
SUMARY:
LOAD * Resident TMP Where NOT wildMatch([Modified Date timestamp], $(v_PRM));

DROP Field PRM;
DROP Table PARAM,TMP;



EXIT Script;

Digvijay_Singh

May be in front end like this -

Capture.PNG

Kushal_Chawda

Data:

LOAD MTR_ID,

          Name,

          Timestamp([Modified Date timestamp],'DD/MM/YYYY hh:mm:ss') as TimeStamp

FROM Table;

inner join(Data)

LOAD MTR_ID,

          Name,

        Timestamp(max([Modified Date timestamp]),'DD/MM/YYYY hh:mm:ss') as TimeStamp

Resident Data

Group by

          MTR_ID,

          Name;


lawrance
Creator II
Creator II
Author

Thanks for your response.

You mentioned using Inline, but I have source in CSV file.

Could you please explain me how to achieve this.

Thanks,

Lawrance A

Anil_Babu_Samineni

You can use, Source table as well like below

Load MTR_ID, Name, MAX([Modified Date timestamp]) AS [Modified Date timestamp] GROUP BY MTR_ID, Name;

LOAD MTR_ID, Name, [Modified Date timestamp] From <Source Path>;

Best Anil, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions". Please LIKE threads if the provided solution is helpful