Set analysis, in stead of If statement

rated by 0 users
Not Answered This post has 0 verified answers | 11 Replies | 5 Followers

Not Ranked
Points 35

Hi,

Please look at attached application. There's two tables loaded with no common field name, but I want the fields "CustomerId" and "cid" to be associated by a set analysis statement.

In the attached application, I have made a table chart with an expression with an If statement, and I want to make an expression with a set statement that calculates the same numbers in that table chart.

How do I write that set statement? Is it possible?

Thanks in advance!

Björn

  • | Post Points: 19

All Replies

Top 200 Contributor
Points 452

Hi,

I guess that if you use customer in set statement than you can't use customer in dimension. Because when you use set analysis the same expression is applied to all rows.

The second table is made with set statement. So you see Eddie Van Halen has the sum of all four customers, Robert Palmer also has the same sum and so on.

Milda

  • | Post Points: 7
Top 75 Contributor
Points 655

You can rename cid as CustomerId.  Then the tables will link.  Then you only need to sum(amount) in your table.

Using QV 9.0 SR5.

  • | Post Points: 7
Not Ranked
Points 35

Thank you jsomsen, but I don't want to do that. I want to make a set statement that calculates the if statement in my example.

  • | Post Points: 1
Not Ranked
Points 35

Thank you Milda, but that doesn't help me. I need an expression with a set statement that does "the same" as the If-statement-expression in my example. If it's possible...
Björn

  • | Post Points: 7
Top 25 Contributor
Points 4,158

Hello Björn,

Set Analysis allows you to indicate one field with values of other field. This is very slow though and performance is severely affected on more than certain number of records. Anyway, it looks like

Sum({< Field1 = Field2 >} Value)
. Depending on the amount of records involved, I'd keep using IF. Hope this helps.

Currently using QlikView 9.00 SR4/SR5

  • | Post Points: 1
Top 10 Contributor
Points 7,247

Bjorn,

I don't think it's possible using Set syntax. As John reminded me the other day,
http://community.qlikview.com/forums/t/25409.aspx
a set is analyzed once for the entire chart, not for each row.

You can simplfy the island approach by moving the IF into a hidden dimension like:
=if(CustomerId = cid, cid)

Then your expression columns do not not require the IF and may be written simply like:
=sum(amount)

See attached example.

-Rob

  • | Post Points: 13
Top 10 Contributor
Points 13,021

Rob Wunderlich:
You can simplfy the island approach by moving the IF into a hidden dimension like:
=if(CustomerId = cid, cid)

Then your expression columns do not not require the IF and may be written simply like:
=sum(amount)

I hadn't even thought to simplify like that.  Good idea.

But it made me curious - what is the relative performance of doing the IF in the dimension vs. doing it in the expression?  I honestly didn't expect a difference of significance with one dimension and one expression, figuring that QlikView would have to do the same work regardless.  So these results (50000 rows, 1000 distinct values on the matching columns) surprised me:

                        v8.5   v9.0 SR2
if() in dimension   30547 ms   27719 ms
if() in expression   3407 ms    3375 ms

Using an if() in the expression seems SIGNIFICANTLY faster than doing the exact same thing in the dimension.  I'm not sure why that would be.  I would have THOUGHT that with one dimension and one column, the performance would be the same.  Then I would expect that each additional expression would be "free" with the dimension approach, but not with the expression approach.  Testing that part...

                        v8.5   v9.0 SR2   3 expressions v8.5
if() in dimension   30547 ms   27719 ms   30641 ms
if() in expression   3407 ms    3375 ms    3406 ms

... it turns out that I'm completely wrong again.  Apparently, either way, almost all of the time is wrapped up in matching up my fields, and almost no time is actually spent doing the calculations.  That makes me think that QlikView is clever enough to realize that the conditions in my expressions are the same, and only do the looping match one time instead of three times.  I could be off base, though.

In any case, while I might start with the dimension approach, I'd suggest being willing to switch to the more complicated expression approach if performance seems to be a problem.  And even with the expression approach, doing an if() like this is simply going to be slow, even on some fairly small data sets.  It makes sense to me here, though.  In a sense, this was processing 50000 rows * 1000 distinct values for 50,000,000 combinations to plow through.

  • | Post Points: 7
Top 75 Contributor
Points 655

Hello all.  I am not sure why this has been bothering me, but I am trying to see if I am missing something.  Why would you not want to associate CustomerId and Cid in your script so that these tables were then linked? Then all you would need is a simple sum() in the chart.   It seems that placing that If ( ) statement would just add unneeded overhead to the analytic. 

This curious mind thanks you in advance,

JS

Using QV 9.0 SR5.

  • | Post Points: 7
Not Ranked
Points 35

Hi, thanks for your answer. It's due to work with an app with several tables with huge amounts of records. I'm trying out how to relate the tables in the QV layout without actually relating them in the load script. Just to see if it's possible. Else I have to come up with a DW solution before I load anything into QV.
The performance of the QV app is bad when having these huge tables related, and the load gets really heavy when I use mappings and joins to reduce the amount of tables, even when I do the most of the data preparement in SQL script.

Regards,

Björn

  • | Post Points: 1
Not Ranked
Points 35

John, Rob & everybody else,
thank you very much for your inputs.

I think I now know the answer to my question, that set can't replace if statements, not in this way anyhow.

Again, thank you all. I will now use this forum more in my future work with QlikView.

Regards,
Björn.

  • | Post Points: 7
Top 10 Contributor
Points 13,021

Well, the performance is likely to be much worse relating fields in your charts with if() than simply relating them in your data model.  I think your answer is what you already stated, "Else I have to come up with a DW solution before I load anything into QlikView".  Mind you, you don't need a full data warehouse before you can start on a QlikView application.  But I do think you want to solve this problem in your data model, not in your chart.

I'm not sure if this is applicable, but I'll outline our basic technique in our shop.  Our actual data loads from the databases are done in separate QVWs that create QVDs.  The user applications simply load from these QVDs.  When loading from a QVD, if performance is an issue, you need to make sure you're getting an "optimized load".  For an optimized load, your where expression can have no more than a single exists() and no other conditions, and you can't have any expressions in your list of fields (though you can rename them).  A typical approach for me might be:

[My Table]:
LOAD * INLINE [
Some Field
Value 1
Value 2
Value 3
];
INNER JOIN ([My Table])
LOAD
 "Key Field"
,"Some Field"
,"Some Other Field"
,"Field 3" as "Cost"
,"Field 4" as "Weight"
,"Field 5" as "Revenue"
FROM My_Table.qvd (QVD)
WHERE exists("Some Field")
;
INNER JOIN ([My Table])
LOAD * INLINE [
Some Other Field
A
B
C
];
LEFT JOIN ([My Table])
LOAD
 "Key Field"
,if("Weight">5,'Y','N') as "Heavy?"
,"Revenue" - "Cost" as "Margin"
RESIDENT [My Table]
;

It's a lot more trouble than how we might normally write this, but it can be worth it when load performance is a problem.  It may also be worth adding common "calculated fields" to the QVD to save the last step.  Also, if you need the same data in more than one user application, you don't want to have to maintain these calculations in more than one place.

  • | Post Points: 1
Page 1 of 1 (12 items) | RSS
Share
Feedback Form