Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Aggregation problem

Hi all,

I have data as QUARTER, MONTH, RESID. For each RESID I need to count its occurrences in the data set but only if it occurs more than, say, 5 times. I have figured out how to accomplish this when counting all occurrences of RESID with the above mentioned condition by using the following expression:

=sum(aggr(nodistinct if (count(RESID)>5, 1, 0), MONTH, RESID))

However, if I remove the nodistinct keyword (or change it to distinct) to count distinct RESIDs instead of all occurrences I get very odd results as many RESID that do occur more than five times don't get any value on RESID level and thus sum up as zero.

Edit: I know it should not be zero as I also have another expression in the pivot showing the number of occurrences ( =count(RESID) ).

This might be something rather basic as I feel I haven't really grasped some of concepts of QV that behave differently from more traditional products...

Thanks in advance,

Esa

1 Solution

Accepted Solutions
Not applicable
Author

hi,

I tried another solution. It is attached.

Best regards,

http://quickdevtips.blogspot.com

View solution in original post

15 Replies
Not applicable
Author

Hi Esa,

A simple example would illustrate what's the difference between distinct and nodistinct aggr. Say if we have a table

Table1:

key, value

1,a

1,b

2,c

2,d

3,a

aggr(distinct count(value),key) would produce the following output

key, distinct count(value)

1,2

2,2

3,1

aggr(nodistinct count(value),key) would produce the following output

key, nodistinct count(value)

1,2

1,2

2,2

2,2

3,1

If you do not specify distinct or nodistinct keyword, distinct would be assumed. That's why when you remove your nodistinct keyword, the sum would be different as a different table is produced by aggr function. Hope it helps.

Regards,

Xue Bin

hic
Former Employee
Former Employee

I assume that count(RESID)>5 should be defined globally and not just within each grouping defined by the group by fields? If so, I suggest you put the if function outside the aggregation:

if(count(RESID)>5, <aggregation>, null())

Secondly, you have it in a pivot table. If you have Month and RESID as dimensions, then I do not understand why you use the Aggr()-function. It would work straight off with

if(count(total RESID)>5, count(RESID), null())

HIC

Not applicable
Author

I do want my counting to be defined on group level, so that the user can see the number of RESIDs on the desired level. The purpose is to follow the activity of resources on yearly, quarterly and monthly level, and to see the number of resources that have more than minimal amount of activity.

I have RESID as a dimension merely for the purpose to be able to verify my results, in the final version I will drop it off and just have the (time) dimensions plus the number of individual resources that have had more than minimal amount of acitivty within the time period the grouping.

Trying

=if(count(total RESID)>5, count(RESID), null())

produces exactly the same results as having just

=count(RESID)

but without "total"

=if(count(RESID)>5, count(RESID), null())

it does implement what I want, but still it does not solve the requirement of counting only distinct RESIDs fulfilling the condition. That is why I ended up putting "1" instead of "count(RESID)" in the THEN part of the IF clause, and that's probably why I ended up messing around with the aggregation (to get all the 1s added up).

I'll keep on trying to wrap my brain around this. I might have already solved this in Cognos (which I have more experience on) and I guess the biggest obstacle I have is my failure at forgetting the Cognos way of doing things...

Br,

Esa

hic
Former Employee
Former Employee

Then you should try =if(count(RESID)>5, count(distinct RESID), null())

HIC

Not applicable
Author

Henric Cronström wrote:

Then you should try =if(count(RESID)>5, count(distinct RESID), null())

HIC

Unfortunately that results the same numbers as with having just

=count(distinct RESID)

thus disabling the condition count>5 from the aggregation. That's exactly what I had before I received the additional requirement of weeding out those resources that have minimal activity.

So basically my problem is how I get that additional condition to be applied to all aggregation.

I'll try to attach an annotated pic with screen copies from the report. In those,

"Assigned resources"=if(count(RESID)>5, count(distinct RESID), null())

and

"Assignments"=count(RESID)

http://onelove.doesntexist.org/~etikka/extra/sample_results1.GIF

In the upper part of the pic you see that for the month "huhti" none of the resources have over 5 assignments (thus count(RESID)<=5 meaning do not count those).

In the lower part where the months have been collapsed "Assigned resources" results 3 for month "huhti" although it should be zero/null. In fact, the results are quite odd since the resource with the ID starting "f9a..." should not be counted at all even on the quarter level as there are not enough assignments (occurrences in the data) to qualify that.

Thanks for your time anyway.

Br,

Esa

Not applicable
Author

Thanks Xue Bin for the clarification on nodistinct and distinct.

Not applicable
Author

I guess that from the screenshot it becomes clear that my issue is actually caused by having the expression evaluated always on the grouped numbers, not the data on the detail level or even on the lowest level in the dimension hierarchy in the pivot.

Could someone please tell me if the thing I'm trying to do is somehow possible to accomplish directly in QV or is it implicitly "forbidden" to do such things with this tool?

Browsing the docs it seems that grouping data is possible in load scripts so should I there calculate the values for some base interval and then just use that in the pivot?

Br,

Esa

Message was edited by: Esa Tikka after reading the manual...

Not applicable
Author

Hi,

In the pivot tabIe (which is seen in your screenshots above), could you try the following expression for 'Assigned resources'?

=sum(

     if(aggr(count(ResID)>5, Year,Quarter,Month,ResID)=-1,1,0)

)

Regards,

Not applicable
Author

Ibaydin,

On ResID level that works ok, but on Quarter and Year levels it gives way too high values, which I think is because the same ResID is counted many times. I don't quite know how the aggregation works in QV but I would guess that the same ResID occurs under multiple Months thus affecting Quarter level, and the same thing happens also between Quarter and Year.

But thanks for providing something that works at least with limited level of grouping. I just need to put the different groupings to different pivots - that's a bit ugly but at least we now get figures that look sane. I need to do some double-checking, of course, but surely that looks promising!

Br,

Esa