MoeyE
Partner - Creator II

Parallel load with ADLS target

Hi,

I'm wondering about a certain scenario.

For example, suppose a large table needs to be loaded to an ADLS target using parallel load with 5 parallel segments.

Is it possible for, say, segment 4 to load before segment 3, resulting in an incorrect sequence in ADLS?

Also, are there any best practices for loading very large static tables to ADLS? My only idea so far is parallel load. Thank you.

Regards,

Mohammed

6 Replies
john_wang
Support

Hello Mohammed, @MoeyE ,
Thanks for reaching out to the Qlik Community!

I don't think I fully understand the concern behind "resulting in incorrect sequence in ADLS". Could you explain it in more detail?

Thanks,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!
john_wang
Support

Hello @MoeyE ,

I'm guessing the concern is the order of the data rows, for example ordered by the PK or by given column(s), so that the data records in the ADLS file(s) are in ascending or descending PK order.

Because the partitions are triggered to load data without any defined priority, any partition may start before another. If you want to control the order, I'd propose one of the following:

1- Define a VIEW for each partition in the source database, then assign a load priority to each of these views (see the screen copy below).

2- Use multiple tasks, so that you can control the data range of each initial load manually.

However, please note that with either approach the data records are spread across multiple ADLS files; the records are in order within each file.
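
As a rough illustration of option 1, the per-partition views could be generated from a list of key boundaries. This is only a sketch in Python that builds SQL text; the table name `orders`, the key column `order_id`, the boundaries, and the `_p<n>` view naming are all hypothetical, and the exact `CREATE VIEW` syntax should be checked against your source database.

```python
from typing import List, Tuple


def partition_view_statements(
    table: str, key: str, boundaries: List[int]
) -> List[Tuple[str, str]]:
    """Return (view_name, CREATE VIEW statement) pairs, one per key range.

    Boundaries [b0, b1, ..., bn] produce n half-open ranges
    [b0, b1), [b1, b2), ... so that no row falls into two views.
    """
    statements = []
    pairs = zip(boundaries, boundaries[1:])
    for index, (low, high) in enumerate(pairs, start=1):
        view_name = f"{table}_p{index}"  # hypothetical naming convention
        sql = (
            f"CREATE VIEW {view_name} AS "
            f"SELECT * FROM {table} WHERE {key} >= {low} AND {key} < {high}"
        )
        statements.append((view_name, sql))
    return statements


# Hypothetical table, key column, and boundaries, for illustration only.
views = partition_view_statements("orders", "order_id", [0, 1000, 2000, 3000])
for name, sql in views:
    print(name, "->", sql)
```

Each generated view can then be added to the task as its own source and given a load priority, so that the ranges are loaded in the intended order.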

[Screenshot: john_wang_0-1714118728684.png, load priority setting]

Hope this helps.

John.

MoeyE
Partner - Creator II
Author

Hi John,

Thank you. Yes, the goal is for all data to be loaded to the target in the same order as it exists on the source. So, just to confirm: is it possible for partition 4's data to appear in the target before partition 3's data, even if there is a primary key? Thanks.

Regards,

Mohammed

john_wang
Support

Hello Mohammed, @MoeyE ,

At present, there is no way to guarantee the order in which partitions are loaded during the initial load. For example, suppose the partition IDs and the number of rows in each are as follows:

P1: 1

P2: 1

P3: 1000000

P4: 10

P5...

In this scenario (let's say 1 row takes 1 second to load to the target, just as a thought experiment :)), the P4 data will reach the target before the P3 data rows, even if the table has a PK.
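
To make the timing argument concrete, here is a minimal sketch in Python. It is not Qlik Replicate code; it only assumes that all partitions start at the same time and load at the same constant rate, and the function name `completion_order` is made up for illustration.

```python
from typing import Dict, List


def completion_order(
    row_counts: Dict[str, int], seconds_per_row: float = 1.0
) -> List[str]:
    """Return partition IDs sorted by when each one would finish loading.

    Assumption: every partition starts at time zero and loads at the same
    constant rate, so the finish time is proportional to the row count.
    """
    finish_time = {pid: rows * seconds_per_row for pid, rows in row_counts.items()}
    # Ties are broken by partition ID so the result is deterministic.
    return sorted(finish_time, key=lambda pid: (finish_time[pid], pid))


# Row counts from the example above (P5 is omitted: its size is not given).
row_counts = {"P1": 1, "P2": 1, "P3": 1_000_000, "P4": 10}
order = completion_order(row_counts)
print(order)  # P4 is listed before P3
```

Under these assumptions the small partition P4 finishes long before P3, regardless of the partition numbering or the presence of a PK.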

For now, I'd suggest loading each partition to a separate file using multiple tasks, and then consuming the files in file order once all partitions have finished loading.
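
The "consume the files in file order" step could look roughly like this in Python. The file naming scheme `partition_<n>.csv` and the function names are assumptions for illustration, not something Replicate produces by default. The point is to sort by the numeric partition ID rather than alphabetically, so that partition 10 does not sort before partition 2.

```python
import re
from pathlib import Path
from typing import Iterable, Iterator, List

# Hypothetical naming scheme: partition_1.csv, partition_2.csv, ...
PARTITION_FILE = re.compile(r"partition_(\d+)\.csv$")


def partition_id(file_name: str) -> int:
    """Extract the numeric partition ID from a file name."""
    match = PARTITION_FILE.search(file_name)
    if match is None:
        raise ValueError(f"not a partition file: {file_name}")
    return int(match.group(1))


def in_partition_order(file_names: Iterable[str]) -> List[str]:
    """Sort file names numerically by partition ID."""
    return sorted(file_names, key=partition_id)


def read_rows_in_order(directory: Path) -> Iterator[str]:
    """Yield every line of every partition file, lowest partition ID first.

    Call this only after all partition loads have completed.
    """
    names = in_partition_order(p.name for p in directory.glob("partition_*.csv"))
    for name in names:
        with open(directory / name, encoding="utf-8") as handle:
            for line in handle:
                yield line.rstrip("\n")


ordered = in_partition_order(["partition_10.csv", "partition_2.csv", "partition_1.csv"])
print(ordered)
```

A downstream consumer would call `read_rows_in_order` on the folder where the files landed, which restores the source order across files as long as each file is internally ordered.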

Regards,
John.

MoeyE
Partner - Creator II
Author

Hi John,

Thanks for the explanations. I appreciate it.

Regards,

Mohammed

john_wang
Support

Glad to hear that, Mohammed @MoeyE! Please mark the comment with "Accept as Solution" if it worked for you.

Thanks for your great support,

John.
