Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
Amit_B
Creator
Creator

Daily Average Calculation

Hi all,

I have a problem with calculate a daily average.

A model contains a daily table that shows the status of each product, day and store. It is a table that contains 50M rows and the number increases every day. I need to build measures that show the daily average of products that are in certain statuses (there are also sub-status). I give you a simple example.

At first I calculated it in another table in the script:
Summary:
Load

        1 as TypeCode,
        Date,
        Count(ProductID) as DailyCount,
        ...
From [table]
Where StatusID = 1
Group By Date;

Concatenate

Load
        2 as TypeCode,
        Date,
        Count(ProductID) as DailyCount,
        ...
From [table]
Where StatusID = 7 or StatusID = 8
Group By Date;

In the UI sheet I used the measure Avg({<TypeCode={[relevant code]}>}DailyCount) and I get a value that seem correct.
The problem - table like that takes a lot of time and resources (50M rows each time).

So I tried to calculate the measure without create this summary table.
Avg(Aggr(Count(ProductID),Date))
In this way I get lower and incorrect values, even though the calculation should be the same.

What am I missing here?
I think one of the reasons is that there are dates until the end of the year and not until today.

Thanks for the helpers.

2 Replies
stevedark
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi @Amit_B 

Avoiding using the Avg function is often a good idea, as you can't be entirely sure what it is doing. If you do the divide yourself you can see both the total value and the thing it is being divided by to check you get what you want.

You can just load from the table once and set the TypeCode, like this:

LOAD
   if(StatusID = 1, 1, if(StatusID = 7 or StatusID = 8, 2, 3)) as TypeCode,

The aggregation by Date is also not required here.

With the count of ProductID, you will find adding a value in a table, rather than counting field values will be more efficient. If ProductID is a foreign key and there will always be an entry in the dimension table adding a counter to the Product table will speed things up:

LOAD
   1 as ProductCount,
   ProductID,

Your total number of products over all dates will be:

sum(aggr(sum(ProductCount), Date))

And that can be divided by the total number of dates where there is one or more products:

count(DISTINCT {<ProductCount*={1}>}Date)

Or, if you want to divide by the number of days from the first to the last date where there is a product:

(max({<ProductCount*={1}>}Date) - min({<ProductCount*={1}>}Date))+1

You may require a TOTAL inside the min and the max, if you don't want the dimension to be taken into account.

By breaking it up you can check both of the numbers before you do one over the other.

Hope that something in that is useful to you.

Cheers,

Steve

Amit_B
Creator
Creator
Author

Thanks for the detailed answer! I will check it.