Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
s2016
Contributor III
Contributor III

Creating a unique ID field

Hi,

I have a fact table data in daily QVD's (e.g. Traffic_20170827.qvd). Each file has multiple "entity_name_raw" and each day there are some new entity_name_raw. I need to assign a unique ID to each entity_name_raw.

I am using the below script to do that. Although this script runs fine on most of the days but sometimes a different ID is assigned to a entity_name_raw each day the script runs.

Entity_Bigpond_TID:

LOAD

          Distinct

          entity_name_raw

FROM

[$(vL.QVDDestPath)Traffic_*.qvd]

(qvd);

Entity_temp:

Load

          entity_name_raw,

          text('E_BT' & num(AutoNumber(entity_name_raw, 'auto1'), '000000000'))  as entity_id      

Resident Entity_Bigpond_TID;

Can anyone suggest a better way to achieve this?

Regards,

Shashank

1 Solution

Accepted Solutions
sunny_talwar

I have created a sample for you...

Day 1

Table:

LOAD Entity

FROM

FirstFile.xlsx

(ooxml, embedded labels, table is Sheet1);

UniqueID:

LOAD Entity,

If(Entity = Previous(Entity), Peek('RowNo'), RangeSum(Peek('RowNo'), 1)) as RowNo

Resident Table

Order By Entity;

STORE UniqueID into UniqueID.qvd (qvd);

You run your main fact table and create a new table UniqueID where you create the ID (I have used the name RowNo)

Day 2 and Onwards (In the sample I have did Day2 and Day3... but the script is the same)

UniqueID:

LOAD Entity as EntityCheck,

    RowNo

FROM UniqueID.qvd (qvd);

MaxID:

LOAD Max(RowNo) as Max

Resident UniqueID;

LET vMaxID = Peek('Max');

TRACE $(vMaxID);

DROP Table MaxID;

Table:

LOAD Entity

FROM

SecondFile.xlsx

(ooxml, embedded labels, table is Sheet1);

Temp:

NoConcatenate

LOAD Entity as EntityCheck,

If(Entity = Previous(Entity), Peek('RowNo'), RangeSum(Alt(Peek('RowNo'), $(vMaxID)), 1)) as RowNo

Resident Table

Where not Exists(EntityCheck, Entity)

Order By Entity;

FinalUniqueID:

LOAD EntityCheck as Entity,

RowNo

Resident UniqueID;

DROP Table UniqueID;

Concatenate(FinalUniqueID)

LOAD EntityCheck as Entity,

RowNo

Resident Temp;

STORE FinalUniqueID into UniqueID.qvd (qvd);

RENAME Table FinalUniqueID to UniqueID;

DROP Table Temp;

So, basically Day 1 is needed one time to initiate the qvd building process. Once created, you will always need to run Day 2 and onward script...

In the sample just run the script for one day at a time and comment the other days and start with Day 1.

Best,

Sunny

View solution in original post

15 Replies
sunny_talwar

Although this script runs fine on most of the days but sometimes a different ID is assigned to a entity_name_raw each day the script runs.

I am not sure I understand this? A different ID is assigned to a entity_name_raw each day? This happens only sometimes? Can you elaborate of may be show some examples?

An alternative might be this, but not sure if this will address the above issue or not

Entity_Bigpond_TID:

LOAD

          Distinct

          entity_name_raw

FROM

[$(vL.QVDDestPath)Traffic_*.qvd]

(qvd);

Entity_temp:

LOAD *,

          text('E_BT' & num(RowNo, '000000000'))  as entity_id;

LOAD

          entity_name_raw,

         If(entity_name_raw = Previous(entity_name_raw), Alt(Peek('RowNo'), 1), RangeSum(Peek('RowNo'), 1)) as RowNo

Resident Entity_Bigpond_TID

Order By entity_name_raw;

s2016
Contributor III
Contributor III
Author

Hi Sunny,

Look at the screenshot below. Until 20178/20, E_CM000000001 was assigned to a particular Entity and it changed later.  

Entity.JPG

sunny_talwar

Did you try the approach I provided above?

jonathandienst
Partner - Champion III
Partner - Champion III

Do you mean that a different ID is assigned by Autonumber? Autonumber() starts from scratch at each load and it numbers on the load order sequence. If your source data changes the load order, then the autonumber will change as well.

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein
s2016
Contributor III
Contributor III
Author

That's true Jonathan. My assumption was that when I am loading the data from QVD's using Traffic_*.qvd method the new entities would automatically be appended in the last.

Correct me if I am wrong, but my understanding is that all the distinct entity_name_raw from the first QVD would be loaded in memory first and then all the new entity_name_raw from other QVD's (which are not present in the first QVD) would be appended at the end.


Can you help me understand how load using *.qvd works. 

s2016
Contributor III
Contributor III
Author

Hi Sunny,

I reloaded all my data today using your code and it seems to be working fine at the moment.

I will keep you posted.

Thanks a lot for the help. Appreciate it!!

marcus_malinow
Partner - Specialist III
Partner - Specialist III

Hi Shashank,

in order to avoid collisions like this try using AutoNumberHash, instead of AutoNumber.

The key difference here is that AutoNumberHash is deterministic, and thus should avoid assigning the same result to more than one row.

https://community.qlik.com/message/683831

Marcus

s2016
Contributor III
Contributor III
Author

Reading through this does make sense. Thanks Marcus.

I think this should be the correct solution.


At the moment, Sunny's code is working fine for me but I will test the AutoNumberHash() funtion as well. 

s2016
Contributor III
Contributor III
Author

Hi Guys,

Both the approaches failed today (Sunny's and Marcus') because both of them are dependent on the Load Order of the entity_name_raw  from the QVD files. The entity_name_raw  must be in a certain fixed order with new ones added at the end for both the methods to work correctly. 

To my surprise, QV doesn't keep the load order from the original files and loads them in random order sometimes (which is strange).

Can someone explain how the ordering works while loading the data and why does QV shuffle the load order?