danielrozental
Master II

Turning Unoptimized Loads into Optimized Loads

I wanted to share a couple of tests I ran while trying to make QVD loads as fast as possible, and hopefully get some feedback or other tips.

Sorry for the long post, but I believe it's worth reading through.

Test 1)

When concatenating two tables that don't have the same number of fields, the load will still be optimized if the second table has all the fields of the first one plus some extra fields. If done the other way around, the second load will not be optimized.

e.g. Both loads are optimized

Table:
LOAD               // Optimized
     A,
     B,
     C
FROM Table1.qvd (qvd);

CONCATENATE(Table) // Optimized
LOAD
     A,
     B,
     C,
     D,
     E
FROM Table2.qvd (qvd);

e.g. The 2nd load isn't optimized

Table:
LOAD               // Optimized
     A,
     B,
     C,
     D,
     E
FROM Table2.qvd (qvd);

CONCATENATE(Table) // Not optimized
LOAD
     A,
     B,
     C
FROM Table1.qvd (qvd);
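
One way to check which mode each load actually ran in is to read the script execution progress window (or the document log), where a QVD load that ran optimized is marked with "(qvd optimized)". The sketch below reuses Table1.qvd and Table2.qvd from the example above; the TRACE messages are only illustrative labels I added, not part of the original test.

```qlik
// Sketch only: TRACE writes a label to the progress window and the script log,
// which makes it easier to match each LOAD with its "(qvd optimized)" marker.
TRACE Loading Table1.qvd - narrow table first;
Table:
LOAD A, B, C
FROM Table1.qvd (qvd);

TRACE Concatenating Table2.qvd - same fields plus D and E;
CONCATENATE(Table)
LOAD A, B, C, D, E
FROM Table2.qvd (qvd);
```

If the marker is missing after the second load, swapping the order of the two loads as in the first example is the first thing to try.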

Test 2)

The second table has some fields in common with the first but is missing others, and each table has 50 million rows. The 2nd load will not be optimized. However, loading the second table in optimized mode, adding the missing fields, storing it and loading it again optimized is faster than concatenating the tables directly.

e.g. The 2nd load isn't optimized (in this example it took about 1 minute to load; the PC is a Core i5, x64, 4 GB RAM, running Windows 7).

R00:
LOAD ShipperID,       // Optimized
     OrderDate,
     CustomerID,
     UnitPrice,
     sales,
     COS
FROM
R00_1.QVD
(qvd);

Concatenate(R00)      // Not Optimized
LOAD ShipperID,
     CustomerID,
     Discount,
     ProductID,
     Quantity,
     UnitPrice
FROM
R00_2.QVD
(qvd);

If you load the 2nd table without concatenating it, add the missing fields, store it, and then load it again to concatenate it, that final read is optimized and the whole script runs faster (in my example it took 50% of the time).

R00:
LOAD ShipperID,                // Optimized
     OrderDate,
     CustomerID,
     UnitPrice,
     sales,
     COS
FROM
R00_1.QVD
(qvd);

R000_Aux:
LOAD ShipperID,                // Optimized
     CustomerID,
     Discount,
     ProductID,
     Quantity,
     UnitPrice
FROM
R00_2.QVD
(qvd);

concatenate(R000_Aux)          // Not Optimized. 0 records are added
LOAD null() as ShipperID,
     null() as OrderDate,
     null() as CustomerID,
     null() as UnitPrice,
     null() as sales,
     null() as COS
autogenerate(0);

store R000_Aux into R000_Aux.QVD;

drop table R000_Aux;

concatenate(R00)               // This load will now be optimized!
LOAD ShipperID,
     OrderDate,
     CustomerID,
     UnitPrice,
     sales,
     Discount,
     ProductID,
     Quantity,
     COS
FROM
R000_Aux.QVD
(qvd);
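
To compare the two variants on your own data, a rough timing wrapper along these lines could be placed around each one. This is only a sketch: vStart and vElapsed are hypothetical variable names, and the result will depend heavily on disk speed, since the second variant writes R000_Aux.QVD before reading it back.

```qlik
// Sketch only: vStart and vElapsed are hypothetical variable names.
LET vStart = Now(1);   // timestamp taken when this statement executes

// ... place the variant being measured here, e.g. the direct
//     Concatenate(R00) from R00_2.QVD, or the store-and-reload
//     version that goes through R000_Aux.QVD ...

// Format the elapsed time as hh:mm:ss and write it to the progress window / log.
LET vElapsed = Interval(Now(1) - vStart, 'hh:mm:ss');
TRACE Elapsed time: $(vElapsed);
```

Running each variant in a separate reload keeps file-system caching from one run from flattering the other.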

3 Replies
stevedark
Partner Ambassador/MVP

Hi there,

Just wanted to add here that I have recently blogged about Optimised QVD Loads, giving details of the scenarios in which QVD loads will be optimised and also why it is critical that your loads are Optimised:

http://bit.ly/YnAMqT

- Steve

http://www.quickintelligence.co.uk/

tseebach
Luminary Alumni

Hi Daniel,

That is an interesting concept. You do, however, need to take into account that storing a QVD to disk can be very slow, since many servers are on a SAN or use older disk-based hard drives.

danielrozental
Master II
Author

Hi Torben, thanks for answering my 4-year-old post.

Loading and storing a QVD might not be the best way to approach this, but you could add dummy fields to some QVDs at creation time just to improve performance on concatenations done later on. I reckon this might not apply in every case and should probably not be taken as a general solution, but we've been using this sort of hack very successfully over the last few years to improve load performance on large volumes. You could also have servers with small SSD drives.
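
As a rough illustration of the dummy-field idea, the QVD could be written with the extra fields already in place, so that a later concatenation onto R00 reads it optimized without the intermediate store. This is a sketch under assumptions: the inline rows are made-up sample data standing in for whatever source actually feeds R00_2.QVD, and the field list is the one from Test 2.

```qlik
// Sketch only: the INLINE rows below are made-up sample data standing in
// for the real source of R00_2.QVD.
R00_2:
LOAD ShipperID,
     CustomerID,
     Discount,
     ProductID,
     Quantity,
     UnitPrice,
     Null() as OrderDate,   // dummy fields, present only so the QVD
     Null() as sales,       // contains every field of R00
     Null() as COS;
LOAD * INLINE [
ShipperID, CustomerID, Discount, ProductID, Quantity, UnitPrice
1, 101, 0, 11, 12, 14
2, 102, 0.05, 42, 10, 9.8
];

STORE R00_2 INTO R00_2.QVD (qvd);
DROP TABLE R00_2;
```

The one-off cost of the extra fields is paid when the QVD is created, rather than on every reload that concatenates it.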

I just tested this with the QV12 Beta and it still stands. In Test 2, the 2nd option takes 30% less time to run than the 1st option.

Honestly, I would have expected QlikView to pick up this sort of optimization automatically, without resorting to saving any files.