Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

How to avoid duplication of records before uploading to Transformation layer during monitoring mistakes of users

Hi,

I have an existing tool where it will searches the mistakes in flat files of users and display in the Mistake monitor qvw. Mistakes would be like Field not found, Bad zip file as below.

The above process takes place post extraction layer and before the Transformation layer. The input file of the transformation layer has some duplicate records based on the product description field.

It is acceptable for the product description to appear more than once if the SKU is reporting to a different Market Product Type or Sales Channel , but if the combination of Product Description, Sales Channel and Market Product Type is duplicated then we have trouble.

This scenario is OK as all records are unique:

Sales Channel

SKU

Manufacturer

Market Product Type

Hypermarkets

UB Long Grain 1kg BiB

Horlicks

Long Grain

Hypermarkets

UBLong Grain 1kg BiB

Horlicks

Boil In Bag

Rewe

UB Long Grain 1kg BiB

Horlicks

Long Grain

Rewe Hypermarkets

UB Long Grain 1kg BiB

Horlicks

Long Grain

Rewe Hypermarkets

UB Long Grain 1kg BiB

Horlicks

Boil in Bag

But if we have the below then we will get incorrect calculations in the next layer:

Sales Channel

SKU

Manufacturer

Market Product Type

Hypermarkets

UB Long Grain 1kg BiB

Horlicks

Long Grain

Hypermarkets

Uncle Bens Long Grain 1kg BiB

Horlicks

Boil In Bag

Rewe

Uncle Bens Long Grain 1kg BiB

Horlicks

Long Grain

Rewe Hypermarkets

Uncle Bens Long Grain 1kg BiB

Horlicks

Long Grain

Rewe Hypermarkets

Uncle Bens Long Grain 1kg BiB

Horlicks

Boil in Bag

Hypermarkets

UB Long Grain 1kg BiB

Horlicks

Long Grain

Is there any way to check for this duplication before the files are uploaded to next layer.


Please provide your suggestions how to remove the duplicates in the existing model.


Thanks in advance!!



Regards

Velan

Message was edited by: Velan Krishna

2 Replies
marcus_sommer

To avoid duplicated rows you could use a Load DISTINCT. Alternatively to show which rows are duplicates you could count them in a (additionally) load:

CountRows:

Load ID, if(count([Sales Channel]&[SKU]&[Market Product Type])=1, 'yes', 'no') as Duplicates Resident x group by ID;


- Marcus

Not applicable
Author

Thank you for your valuable time Marcus!! Let me try this and update.

-Velan