Skip to main content
Announcements
Qlik Announces Qlik Talend Cloud and Qlik Answers: LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
qlikuser233
Contributor II
Contributor II

Join data duplicating records

I am joining two data sources, the issue is that I am brining in certain columns for each and they share 4 columns with the primary key being unique identifier. The columns that are missing from Tracker 2 are populating as 0's instead of merging into one line.

Tracker: 
LOAD  Client, 
     [Client ID], 
     [Team Cd], 
     [Policy/Contract Effective Date], 
     [Policy/Contract Expiration Date], 
     [Pol/Cont Num], 
     [Captive or Non-Captive],
     [Policy/Contract Effective Date Conversion], 
     Adjustment,
     [Client ID]&[Pol/Cont Num]&[Policy/Contract Effective Date] as [Unique Identifier], 
[3133]
     [3137]
 
Outer join (Tracker)
 
Tracker_2:
LOAD A as [Client ID], 
     B as [Client], 
   C as [Pol/Cont Num]
     D as [Policy/Contract Effective Date],
     I as [Unique Identifier], 
Z as [Timing - Waiting for Registration (New/RNW or Adjustment) - Premium - COU, PAU, ACR],

Attached is a photo for one record. It splits into 2 lines, the top line is coming from tracker 2 as it has 0's where it should be filling in the information from Tracker. I have crossed out the unique identifier number but they are identical so I cannot figure out why this is creating a line for each source rather than merging them when the primary key for both sources is identical. 

Labels (1)
12 Replies
qlikuser233
Contributor II
Contributor II
Author

I can see that in the table it is taking my unique ID and breaking it down into both sources so I have a line for each. The first row contains data for a column not in Tracker 2 and vice versa for Tracker. So why is it breaking the data sources out instead of merging them? @marcus_sommer 

marcus_sommer

Like @Or mentioned a join is performed against all key-fields which means that all field-values of them must be matching.

Further missing keys and/or any duplicates will have an impact on the resulting data-set and may adding respectively removing records (depending on the applied join-type). To get rid of it could be quite difficult especially if there is not enough knowledge and control in regard to the data-quality. Therefore I suggest to change the join-logic to a mapping which is often easier, more flexible and performant and without any risk to change the number of records:

Don't join - use Applymap instead - Qlik Community - 1467592

qlikuser233
Contributor II
Contributor II
Author

I have figured out the issue, there was a same column name from both sources one was a calculated field and the other a hard coded input that was different once I removed there was no issue. Thanks everyone