Anonymous
Not applicable

Loading data from QVD

I have a scenario of fetching one year of data from a DB with incremental loading. Which of the approaches below will be faster in terms of load time and performance?

Extracting the data and then:

1. Storing a separate QVD for each date in the year (365 QVDs in total) and loading them into Qlik

2. Storing month-wise QVDs (12 QVDs in total) and loading them into Qlik

3. Maintaining a single QVD for the entire year.

Please guide me!

Thanks in advance.

1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

I would opt for:

3. Maintaining a single QVD for the entire year.

Option 1 would probably be quicker, though not significantly so, and option 2 to a lesser extent - but any gain would normally be outweighed by the additional complexity of scripting and maintenance.

If data volumes are large, say 10+ million rows a day, i.e. 3.65+ billion rows a year, then I would consider option 1 or 2, but only if option 3 had been tried first and its performance was dreadful.

The bottleneck would probably be the latency of retrieving the data from the database, and I would not be surprised if that accounted for 99+% of the load time.


Note:  The numbers above are guesstimates as I know nothing about the details of your scenario.
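As a rough illustration of option 3 - the table, field and file names and the vLastExecTime variable below are only placeholders, and an existing database connection (LIB CONNECT TO ...) is assumed - an incremental append into a single yearly QVD could look something like this:

// Pull only the rows changed since the last reload (placeholder names)
Sales:
SQL SELECT OrderID, Amount, LastModified
FROM SourceSales
WHERE LastModified >= '$(vLastExecTime)';

// Add back the history already stored in the yearly QVD,
// skipping any keys that were just re-read from the database
Concatenate (Sales)
LOAD OrderID, Amount, LastModified
FROM [lib://Data/Sales_2018.qvd] (qvd)
WHERE NOT Exists(OrderID);

// Overwrite the yearly QVD with the merged result
STORE Sales INTO [lib://Data/Sales_2018.qvd] (qvd);

On the very first run the yearly QVD does not exist yet, so the Concatenate step would have to be skipped.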



5 Replies
YoussefBelloum
Champion

Hi,

For me the second option is the best, because:

In the case of recovering specific historical data, monthly QVDs will be the easiest to maintain; with daily QVDs you will need to loop over them to do a refresh, and with a yearly QVD you will have to unnecessarily refresh hundreds of thousands or millions of lines.
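For example (file and table names below are purely illustrative), refreshing one month from daily QVDs means looping over up to 31 files:

// Reload January 2018 from daily QVDs (placeholder names, all files assumed to exist)
FOR i = 0 TO 30
    LET vDate = Date(MakeDate(2018, 1, 1) + $(i), 'YYYYMMDD');
    Sales:
    LOAD * FROM [lib://Data/Sales_$(vDate).qvd] (qvd);    // identical field lists auto-concatenate
NEXT i

// With a monthly QVD the same refresh would be a single statement, e.g.
// Sales: LOAD * FROM [lib://Data/Sales_201801.qvd] (qvd);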

tomasz_tru
Specialist

The first option will be the slowest one because of disk I/O. What kind of incremental load is it? Can the data change in any historical data point?

Tomasz

Ricardo_Gerhard
Employee

Dear Jeshwin,

If your database is fast enough at retrieving the information and it is only updated daily, I guess a full load will be easier.

But if you need to capture which field values have changed, a daily incremental load will be required.
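For example, the full-load variant is just a nightly replace of the QVD - the table, field and file names here are placeholders and an existing database connection is assumed; the incremental variant with a modification timestamp is sketched under the accepted answer above:

// Re-read the whole table and overwrite the QVD each night
Orders:
SQL SELECT OrderID, Status, LastModified
FROM SourceOrders;

STORE Orders INTO [lib://Data/Orders.qvd] (qvd);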

Enjoy

Ricardo Gerhard
OEM Solution Architect
LATAM
ToniKautto
Employee

None of the options is right or wrong. To me, the two important aspects of QVD granularity are the QVD file size and how the QVD is ingested.

If the apps consuming the QVD file(s) end up loading them unoptimized due to data transformations, then there might be room for altering the granularity. Keep in mind that data can still be filtered in an optimized fashion, for example by using the Exists() function.
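For instance (field and file names are only placeholders), a large yearly QVD can be cut down to one month while the load stays optimized, because the only WHERE clause is a single-field Exists():

// Build the set of wanted dates first; the format must match the OrderDate values stored in the QVD
DateFilter:
LOAD Date(MakeDate(2018, 3, 1) + IterNo() - 1) AS OrderDate
AutoGenerate 1
While MakeDate(2018, 3, 1) + IterNo() - 1 <= MakeDate(2018, 3, 31);

// Optimized QVD load: only rows whose OrderDate is in the set above are kept
Sales:
LOAD * FROM [lib://Data/Sales_2018.qvd] (qvd)
WHERE Exists(OrderDate);

DROP TABLE DateFilter;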

Very large QVD files (10+ GB) often end up being filtered during load, which naturally means they could have been pre-filtered by slicing the data into multiple files instead.
If you actually require a very large volume of data (10+ GB) in the same analysis, there are often better ways to slice the data during user access, for example with ODAG or session apps. Then again, these techniques also enable using smaller QVDs, as they filter the data as it is made available to the business user.