danielgargiulo
Partner - Creator

QlikView Experts: Control Chart Challenge

Control Chart Challenge

If you are a QlikView whizz, it would be great to see if you can solve this particularly interesting challenge and how you go about it. Good Luck!!!

Control Charts are used extensively in health care and the creation of these including highlighting rules has been nicely outlined by Erica in her blog (http://qlikfit.blogspot.co.uk/).

The interesting challenge we are currently facing is recalculating the average when there are 8 consecutive points above (or below) the average. Once this rule has been met, the average from the 9th point onwards needs to be recalculated so that it is based only on the data points going forward. This then needs to be taken further to check again for another 8 points above or below the new average, and so on.

We have spent countless hours trying to solve this and got nowhere!!!!!!!!!!!!

Please find attached:

  • ‘Data Average Changes.xlsx’: Data and an image illustrating the step up in average we need to create
  • ‘Control Chart Template - Jumping Average.qvw’ : Example QlikView application with the data and  a control chart that could be used as a starting point.

I look forward to seeing your solutions.

Dan

31 Replies
danielgargiulo
Partner - Creator
Author

Hi Paul,

I can confirm that the Excel spreadsheet is correct and also works for decreasing values. I am very impressed with your Excel knowledge. This is definitely another step closer, which is excellent.

Now, how to achieve this in QV is the next interesting challenge.

Thanks

Dan

danielgargiulo
Partner - Creator
Author

Hi Erica,

Thanks again for having another crack at this. However, the average should be jumping up on 1/09/11 (assuming 9 consecutive points).

I am pretty sure in your example it is jumping up later because 'R2CUMA_abvCT' and 'R2CUMA_revCT' are linked to 'R2CUMA' and 'R2CUM', which are both linked to the overall 'Average'. We really need to be testing each data point against a cumulative average; a good example is Paul's Excel document. I have reattached this with the original data for comparison purposes.

I think what we need is:

- a cumulative average field. Then:

- R2CUMA and R2CUM to be changed so that at each data point they check the previous 9 data points against the current 'CumAvg' and come up with the number that meet the criteria. Your other logic should then flow from there. My understanding of how the above and below logic works is far less than yours, but hopefully what I THINK we need to do makes sense.

From what I can tell the other functionality seems to be working.

thanks again,

dan

flipside
Partner - Specialist II


Hmm, I'm not entirely sure what you need. My script changes the ActiveAvg when it finds x consecutive values above (or x consecutive values below) its value. Once ActiveAvg is reset it then compares this new value with the average reset from the next row of data (so it will be that row's data value divided by 1) and so on until it finds another run of x consecutive values above (or below) its new value. In your example your data looks like it has an increasing trend so the average is continually increasing. Try putting in a sequence of some lower values mid way to see if it drags the average down.

What I would suggest however is that you build up a script starting as follows so you can be sure the logic is as you require it ...

1.     Ensure your data has a unique field in it. I have created the field UID. This is so we can guarantee that the FieldValue returns the rows without skipping any values.

2.     Build up a table called RunningAvgs row by row in a loop starting with totals only. Your first script might look like this ...

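// Running total of the data values, accumulated row by row
let runningSum = 0;

for r = 1 to NoOfRows('Data1')

     // UID is assumed to be ':'-delimited with the data value in its third part
     let runningSum = (runningSum + SubField(FieldValue('UID',$(r)),':',3));

     // One row is appended per iteration; the identical field names
     // auto-concatenate each load into the RunningAvgs table
     RunningAvgs:
     Load
          $(r) as ID,
          $(runningSum) as RunningSum
     autogenerate 1;

next r;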

You can compare the calculated values with an Excel spreadsheet to see if the logic is as you require, and moreover, that the rows are being called in the correct sequence.

3.     Add in logic to calculate the average. At this point you could just divide the RunningSum by r, but because you want to reset the average later on, you'll need to use a second counter, seq. You can't reset r because it would break the loop sequence and/or cause an infinite loop. Remember to increase the seq counter within the loop.

4.     Once you have your RunningAvg, you can start comparing it to the Seed average values or Data values to determine whether it is above or below (1, -1), and then on the next loop check if there is a sequence of above/below values and increase the sequence counter.

It's easier to understand when you've built up the logic from the ground.
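
Steps 3 and 4 can be checked outside QlikView before they go into the load script. The Python below is only a sketch of that idea, not flipside's actual script: `running_averages` and `reset_rows` are hypothetical names, and the reset rows are passed in by hand here rather than detected from a run of consecutive values. It shows why a separate `seq` counter is needed: the loop index `r` keeps counting rows, while `seq` counts only the rows since the last reset.

```python
def running_averages(values, reset_rows=()):
    """Return (row, running average, side) for each value.

    reset_rows is a hypothetical parameter: 1-based rows after which the
    running sum and seq counter start again from zero.
    side is 1 when the value is above its running average, -1 when below,
    and 0 when equal.
    """
    resets = set(reset_rows)
    running_sum = 0.0
    seq = 0  # rows in the current section; independent of the loop index r
    rows = []
    for r, value in enumerate(values, start=1):
        running_sum += value
        seq += 1
        avg = running_sum / seq
        side = (value > avg) - (value < avg)
        rows.append((r, avg, side))
        if r in resets:
            # Start a new section on the next row
            running_sum = 0.0
            seq = 0
    return rows


rows = running_averages([2, 4, 6, 10, 20], reset_rows=(3,))
for row in rows:
    print(row)
```

After the reset at row 3, row 4 reports an average equal to its own value (the value divided by 1), which is the behaviour described above.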

Hope this helps

flipside

Christian_Lauritzen
Partner - Creator II

Daniel,

This is an interesting problem. There are a great many clever approaches above. I could not help but give it a try, so here is another variant. I used two counters, one that keeps track of consecutive points above average and one for points below. Then the Average formula checks these counters and updates the average when it is time.

The values are consistent with my manual check in Excel, but the average adjustment points differ from the picture in Excel. Maybe I misunderstood the logic. Anyhow, here is the solution.

Expression for Counter to track points above average (CounterH):

=if(Above(if(Data>Average,1,0))=1,mod(Above(CounterH),8)+1,0)

Expression for Counter to track points below average (CounterL):

=if(Above(if(Data<Average,1,0))=1,mod(Above(CounterL),8)+1,0)

The mod function ensures that the counter is reset after 9 consecutive points. The Counters need to be set as invisible and also put on the right axis.
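
To see what the mod-based counter does row by row, here is a small Python sketch of the same recurrence. It is an illustration under a simplifying assumption: the above/below flags are supplied up front, whereas in the chart the Average expression itself depends on the counters. `run_counter` and `wrap` are hypothetical names. As in the chart expression, each row looks at the previous row's flag and counter, so the first row is always 0.

```python
def run_counter(flags, wrap=8):
    """Mimic if(Above(flag)=1, mod(Above(Counter), wrap) + 1, 0).

    flags[i] is True when point i is on the tracked side of the average.
    """
    counters = []
    prev_flag = False  # no row above the first row, so the test fails there
    prev_count = 0
    for flag in flags:
        if prev_flag:
            count = (prev_count % wrap) + 1  # wraps from 8 back to 1
        else:
            count = 0
        counters.append(count)
        prev_flag, prev_count = flag, count
    return counters


print(run_counter([True] * 10))
print(run_counter([True, True, False, True]))
```

With ten points in a row on the tracked side, the counter climbs from 0 to 8 and then wraps back to 1 on the tenth row.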

Average starts with an average of all data, and a change kicks in when any of the counters reach 8 (0-8 = 9 points) and creates an average of points below.

=if(RowNo()<=8, Avg(Total Data),
if(CounterH<8 AND CounterL<8, Above(Average,1),rangesum(Below(sum(Data),0,NoOfRows()-RowNo()))/ (NoOfRows()-RowNo()) ))  

Screenshot.png

Email: christian.lauritzen@b3.se
danielgargiulo
Partner - Creator
Author

Hi Christian,

Thanks for replying to the problem, it is great to see yet another approach. I am looking forward to seeing which one will finally crack it.

I think the reason it is different to the picture in Excel is that we should be using a cumulative average to compare to each data point right from the start. I think currently it is only doing this after there have been 8 points above the TOTAL average.

Please note that we are also looking for 9 points above or below the average.

I look forward to hopefully seeing if a revised version is possible.

Thanks


Dan

Christian_Lauritzen
Partner - Creator II

Just Googled Control Chart.  http://www.wikihow.com/Create-a-Control-Chart

The graph is out-of-control if any of the following are true:

  • Any point falls beyond the red zone (above or below the 3-sigma line).
  • 8 consecutive points fall on one side of the centerline.
  • 2 of 3 consecutive points fall within zone A.
  • 4 of 5 consecutive points fall within zone A and/or zone B.
  • 15 consecutive points are within Zone C.
  • 8 consecutive points not in zone C.

Zone A is +/- 1 std deviation, Zone B is 2 and Zone C is 3 std dev.
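
As a rough illustration of the first rule in the list (a point beyond the 3-sigma line), here is a short Python sketch. It assumes the centerline is the plain mean and sigma is the sample standard deviation of the same points; control chart practice may estimate sigma differently, so treat both as assumptions. `beyond_three_sigma` and `readings` are made-up names and data.

```python
from statistics import mean, stdev


def beyond_three_sigma(values):
    """Return 1-based row numbers of points outside mean +/- 3 sigma."""
    center = mean(values)
    sigma = stdev(values)  # assumption: sample standard deviation
    upper = center + 3 * sigma
    lower = center - 3 * sigma
    return [row for row, v in enumerate(values, start=1)
            if v > upper or v < lower]


# Made-up readings: 19 ordinary points followed by one extreme value
readings = [10, 11, 9] * 6 + [10, 30]
flagged = beyond_three_sigma(readings)
print(flagged)
```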

Do you wish all these rules to apply when re-adjusting the mean?

danielgargiulo
Partner - Creator
Author

Hi Christian,

At this stage we are only focused on the rule of the 9 consecutive points above the cumulative average as the trigger to re-adjust the mean. Once this has been triggered, the cumulative average would begin again with the search for another 9 points above or below the mean. Does this make sense? If you see Paul's Excel example above, this provides a good overview of the desired result.

Thanks again,

Dan

Christian_Lauritzen
Partner - Creator II

Daniel,

I gave this some more thought (honestly, it bugged me a lot!) and I see this challenge is a bit trickier than it seemed to be at first glance. It has a built-in trap. I'll explain how.

The rule:

Generate an average for a section. The section ends when 9 consecutive points are above or below the section average. The average should be based on all points in the section, excluding the 9th point. The next section begins at the 9th consecutive point.

Problem:

The rule is a sort of recursive logic, where you have to iterate the size of each section until the end of data or until 9 consecutive points. With this rule, there are data sets that cannot be resolved because of the logic loops that occur. With your specific data set you do not encounter the loop situation. You do however with the set below.

Example:

The data set below triggers the logic loop. As the first section is expanded to find the 9 points in a row that are below or above average, no such pattern is found until you include point 21 - a very high value.

Suddenly, the first 9 points in the section are below the cumulative average. The section should then end at point 8. You need to recalculate the average up until point 8. However, now you don't find any 8 consecutive points below or above average anymore. Not until the section once again is expanded until point 21. And so it goes…

You have a logical loop that never ends.

Since there are control diagrams with changing averages out there, they should have a different set of rules. 

Sorry to not be able to help you more.

/Christian

Data Set:

Table.png

danielgargiulo
Partner - Creator
Author

Hi Christian,

Thanks for taking another look at this and apologies for the delay in responding, I needed a couple of days not looking at this to keep my sanity!!!

I have taken your data set and put it into Paul's Excel solution (see attached), and it seems to calculate as expected in Excel. Note the 21st point does not actually trigger the change, as the data point (100000000) is still above the average even though all others are below. It does get triggered on line 30 though, when (100000000) is no longer part of the 9 consecutive points.

You noted "The average should be based on all points in the section, excluding the 9th point." Please note the average actually includes the 9th consecutive point.
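
The forward-only reading of the rule can be sketched in Python to show why it does not loop. This is an illustration of the logic as described in this thread, not a copy of Paul's spreadsheet. At each row the cumulative average of the current section is updated, the last 9 points are compared against that current average, and a trigger restarts the section. Two assumptions are mine: the triggering point stays in the old section's average, and the new section begins on the following row. The data is made up (Christian's table is not reproduced here), with only the 100000000 spike at point 21 taken from the discussion.

```python
def stepped_cumulative_average(values, run_length=9):
    """Single forward pass; returns (averages, trigger_rows).

    averages[i] is the cumulative average of the current section at row i.
    trigger_rows holds the 1-based rows where the last run_length points
    were all above, or all below, the current cumulative average.
    """
    averages = []
    trigger_rows = []
    start = 0    # index of the first row in the current section
    total = 0.0  # running sum of the current section
    for i, value in enumerate(values):
        total += value
        avg = total / (i - start + 1)
        averages.append(avg)
        if i - start + 1 < run_length:
            continue  # section too short to hold a full run
        window = values[i - run_length + 1 : i + 1]
        if all(v > avg for v in window) or all(v < avg for v in window):
            trigger_rows.append(i + 1)
            start = i + 1  # assumption: new section starts on the next row
            total = 0.0
    return averages, trigger_rows


# Hypothetical data: 20 ordinary points, a huge value at point 21,
# then ordinary points again
data = [10, 12] * 10 + [100_000_000] + [10, 12] * 4 + [10] + [10, 12]
averages, trigger_rows = stepped_cumulative_average(data)
print(trigger_rows)  # [30]
print(averages[30], averages[31])
```

Point 21 does not trigger because it sits above the average it has just inflated while the other 8 points in the window sit below. Row 30 is the first row whose 9-point window no longer contains the spike. Because each row is visited once and earlier sections are never revisited, the pass always terminates.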


Based on the above do you think this is something you could take another look at? I feel like your initial approach was close and I think my notes here hopefully mean recursive logic is no longer an issue. 

Thanks

Dan

danielgargiulo
Partner - Creator
Author

POSSIBLE SOLUTION

Hi All,

I think we may have a solution, please see QVW attached. This basically takes the logic of Paul's spreadsheet and puts it into a QlikView table/chart. If you have a chance it would be great if people could try and pick some holes in the solution to check it holds up.

I am also intrigued to see if people have any suggestions for:

1. Is there a better way to count the number of points that are above or below the CumAvg instead of all the Pos. Line1 and Neg Line 1? If so we may be able to allow users to choose the number of consecutive points above or below the average with a variable.

2. How to add further rules as per Erica's original Control Chart.

  a) Highlight the points that make up values that are above/below the average to trigger the step change.

  b) Highlight series of points that are increasing and decreasing.

My fingers are crossed this is going to hold up under cross examination!!!! Thank you all again for soooooo much help. This would have got nowhere without so many ideas bouncing around. I had all but given up hope that this was possible.

Erica, if this does end up being a solution then please feel free to write it up in your blog.

Thanks again,

Dan