Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
richardouellett
Creator
Creator

Permutations Math

I have created a field called 'Sequence'  a 7 character (Upper Case) text field that is made up from the sequence (ascending order) of 7 timestamps that are captured  for each record transaction.  The timestamps are:

O = Ordered product

G = List product

L =  Label Product

C = Collect

A = Arrive Product

P = Prepare Product

R = Report on Product

Depending on the sequence order they are performed a typical Sequence field could look like OGLCAPR or OLGCAPR.  My analysis thus far is displayed below:

I now want to perform this time analysis above based on distinct combinations of 2 , i.e. Permutations without repetition (n=7, r=2)

List has 42 maximum entries.

OL OG OC OA OP OR LO LG LC LA LP LR GO GL GC GA GP GR CO CL CG CA CP CR AO AL AG AC AP AR PO PL PG PC PA PR RO RL RG RC RA RP

If I take the same table above and apply the MID() function to the Sequence filed I would get the results below.

Challenge/Question?

I now want to perform an analysis similar to the above on unique timestamp permutations for example 'OG' occurred 646 + 23 + 5 =  474 times.and PR occurred 905 times. (In the above example I want to analyze the performance of OG regardless of it being performed in step 1 or step 2.)  The new analysis would consists of unique permutations, e.g., OG, PR, LO with corresponding Count, % and throughput statistics.  Hope this all makes sense.  How can this be accomplished in QlikView?

1 Solution

Accepted Solutions
swuehl
MVP
MVP

If I've understood correctly, I think your interval calculations can be simplified if you perform them same place you calculate your sequence:

DATA:

LOAD

  recno() as ID,

  Timestamp(today() + RAND()) as O,

  Timestamp(today() + RAND()) as G,

  Timestamp(today() + RAND()) as L,

  Timestamp(today() + RAND()) as C,

  Timestamp(today() + RAND()) as A,

  Timestamp(today() + RAND()) as P,

  Timestamp(today() + RAND()) as R

AutoGenerate 20;

X:

CrossTable (Field, Time) LOAD * RESIDENT DATA;

LEFT JOIN (DATA)

LOAD

  ID,

  Concat(Field, '',Time) as Sequence,

  (min(Time,7)-min(Time,1))*24 as i1_7,

  (min(Time,2)-min(Time,1))*24 as i1_2,

  (min(Time,3)-min(Time,2))*24 as i2_3,

  (min(Time,4)-min(Time,3))*24 as i3_4,

  (min(Time,5)-min(Time,4))*24 as i4_5,

  (min(Time,6)-min(Time,5))*24 as i5_6,

  (min(Time,7)-min(Time,6))*24 as i6_7  

Resident X

Group BY ID;

drop table X;

LOAD Sequence, mid(Sequence,IterNo(),2) as Permut, 1 as Count

RESIDENT DATA

WHILE iterno() <= 6;

I also added the link table to your permutations (and not shown here, an INLINE table of all permutations, see attached QVW).

Does this allow to perform the analysis you want to do?

View solution in original post

19 Replies
sunny_talwar

Wow this is some high end stuff you are doing here (at least seems like that). To tell you the truth I don't understand the question well, but would love to know what you have already done (see it in a qvw) and then may be try to understand what you are looking to do and may be propose a solution ultimately (there is no certainty that I will be able to provide a solution, but I would still love to give it a try )

If we are in agreement and if you can share the qvw of what you have done, we can get the ball rolling.

Best,

Sunny

swuehl
MVP
MVP

Maybe like

=COUNT({<Sequence = {"*OG*"}>} Sequence)

resp.

=COUNT({<Sequence = {"*PR*"}>} Sequence)


or

COUNT( IF(Sequence LIKE '*OG*', Sequence))

richardouellett
Creator
Creator
Author

Thank you for this suggestion. This is a large database with many millions of records.  The goal is to find a way to determine which of the 42 possible permutations had activity for a give query and then do the analysis.  In the above example I believe there were only 18 of 42 permutations that had activity.

swuehl
MVP
MVP

You do already have a field with your permutations, right? Let's assume it's called PERMUT.

You can create a chart with PERMUT as dimension and an expression:

=COUNT(IF (Sequence LIKE '*' & PERMUT & '*', Sequence))

How does this perform in your setting? There are some alternatives for the match in the front end, you could also consider creating a link table between Sequence field and its permutations in the script.

richardouellett
Creator
Creator
Author

The only field I have to work with is the 7 character Sequence field.  Each successive 2 letter combination from this field provides the permutation equivalent, e.g., Mid(Sequence,1,2).  I tried to show this in the last image above.

richardouellett
Creator
Creator
Author

Hit return too fast.  I like your second suggestion, can you provide a bit more detail?

swuehl
MVP
MVP

Richard Ouellette wrote:

The only field I have to work with is the 7 character Sequence field.  Each successive 2 letter combination from this field provides the permutation equivalent, e.g., Mid(Sequence,1,2).  I tried to show this in the last image above.

Well, I think you also have a list of permuations:

List has 42 maximum entries.

OL OG OC OA OP OR LO LG LC LA LP LR GO GL GC GA GP GR CO CL CG CA CP CR AO AL AG AC AP AR PO PL PG PC PA PR RO RL RG RC RA RP

Just create an INLINE LOAD for the list and you're set.

And you liked my second suggestion, which one?

edit:

This is how you can create your link table:

LOAD Sequence, mid(Sequence,iterno(),2) as Permut

INLINE [

Sequence

OGLCAPR

OLGCAPR

]

while IterNo() <= 6;

Instead of the INLINE, you can do a RESIDENT LOAD of your Sequence table.

richardouellett
Creator
Creator
Author

Swuehl

Below is some sample code that you graciously assisted me with last week.  As you can see I am calculating the time intervals as load time due to the complexity of the calculations and application impact.  I would like to now perform the time durations based on the various time stamp permutations based on your above recommendation but am at a bit of a loss with how to proceed.  Can you assist?  Also my interval calculation is long and complex... is there an easier way to accomplish this? 

  DATA: 

    LOAD 

      recno() as ID, 

      Timestamp(today() + RAND()) as O, 

      Timestamp(today() + RAND()) as G, 

      Timestamp(today() + RAND()) as L, 

      Timestamp(today() + RAND()) as C, 

      Timestamp(today() + RAND()) as A, 

      Timestamp(today() + RAND()) as P,

      Timestamp(today() + RAND()) as R 

    AutoGenerate 20; 

     

    X: 

    CrossTable (Field, Time) LOAD * RESIDENT DATA; 

     

    LEFT JOIN (DATA) 

    LOAD 

      ID, 

      Concat(Field, '',Time) as Sequence

    Resident X 

    Group BY ID; 

     

    drop table X; 

    Drop fields ID;

Store DATA into Data.qvd (qvd);

Drop Table DATA;

DATA:

Load O,

G,

L,

C,

A,

P,

R,

Sequence,

Interval((If( MID(Sequence,7,1)='O',O, If( MID(Sequence,7,1)='L',L, If( MID(Sequence,7,1)='G',G, If( MID(Sequence,7,1)='C',C,

     If( MID(Sequence,7,1)='A',A, If( MID(Sequence,7,1)='P',P,R))))))) - (If( MID(Sequence,1,1)='O',O, If( MID(Sequence,1,1)='L',L,

     If( MID(Sequence,1,1)='G',G, If( MID(Sequence,1,1)='C',C, If( MID(Sequence,1,1)='A',A, If( MID(Sequence,1,1)='P',P,R)))))))

     , 'hh:mm')*24 as i1_7,

     Interval((If( MID(Sequence,2,1)='O',O, If( MID(Sequence,2,1)='L',L, If( MID(Sequence,2,1)='G',G, If( MID(Sequence,2,1)='C',C,

     If( MID(Sequence,2,1)='A',A, If( MID(Sequence,2,1)='P',P,R))))))) - (If( MID(Sequence,1,1)='O',O, If( MID(Sequence,1,1)='L',L,

     If( MID(Sequence,1,1)='G',G,  If( MID(Sequence,1,1)='C',C, If( MID(Sequence,1,1)='A',A, If( MID(Sequence,1,1)='P',P,R)))))))

     , 'hh:mm')*24 as i1_2,

     Interval((If( MID(Sequence,3,1)='O',O, If( MID(Sequence,3,1)='L',L, If( MID(Sequence,3,1)='G',G, If( MID(Sequence,3,1)='C',C,

     If( MID(Sequence,3,1)='A',A, If( MID(Sequence,3,1)='P',P,R))))))) - (If( MID(Sequence,2,1)='O',O, If( MID(Sequence,2,1)='L',L,

     If( MID(Sequence,2,1)='G',G,  If( MID(Sequence,2,1)='C',C, If( MID(Sequence,2,1)='A',A, If( MID(Sequence,2,1)='P',P,R)))))))

     , 'hh:mm')*24 as i2_3,

      Interval((If( MID(Sequence,4,1)='O',O, If( MID(Sequence,4,1)='L',L, If( MID(Sequence,4,1)='G',G, If( MID(Sequence,4,1)='C',C,

     If( MID(Sequence,4,1)='A',A, If( MID(Sequence,4,1)='P',P,R))))))) - (If( MID(Sequence,3,1)='O',O, If( MID(Sequence,3,1)='L',L,

     If( MID(Sequence,3,1)='G',G,  If( MID(Sequence,3,1)='C',C, If( MID(Sequence,3,1)='A',A, If( MID(Sequence,3,1)='P',P,R)))))))

     , 'hh:mm')*24 as i3_4,

     Interval((If( MID(Sequence,5,1)='O',O, If( MID(Sequence,5,1)='L',L, If( MID(Sequence,5,1)='G',G, If( MID(Sequence,5,1)='C',C,

     If( MID(Sequence,5,1)='A',A, If( MID(Sequence,5,1)='P',P,R))))))) - (If( MID(Sequence,4,1)='O',O, If( MID(Sequence,4,1)='L',L,

     If( MID(Sequence,4,1)='G',G,  If( MID(Sequence,4,1)='C',C, If( MID(Sequence,4,1)='A',A, If( MID(Sequence,4,1)='P',P,R)))))))

     , 'hh:mm')*24 as i4_5,

     Interval((If( MID(Sequence,6,1)='O',O, If( MID(Sequence,6,1)='L',L, If( MID(Sequence,6,1)='G',G, If( MID(Sequence,6,1)='C',C,

     If( MID(Sequence,6,1)='A',A, If( MID(Sequence,6,1)='P',P,R))))))) - (If( MID(Sequence,5,1)='O',O, If( MID(Sequence,5,1)='L',L,

     If( MID(Sequence,5,1)='G',G,  If( MID(Sequence,5,1)='C',C, If( MID(Sequence,5,1)='A',A, If( MID(Sequence,5,1)='P',P,R)))))))

     , 'hh:mm')*24 as i5_6,

     Interval((If( MID(Sequence,7,1)='O',O, If( MID(Sequence,7,1)='L',L, If( MID(Sequence,7,1)='G',G, If( MID(Sequence,7,1)='C',C,

     If( MID(Sequence,7,1)='A',A, If( MID(Sequence,7,1)='P',P,R))))))) - (If( MID(Sequence,6,1)='O',O, If( MID(Sequence,6,1)='L',L,

     If( MID(Sequence,6,1)='G',G,  If( MID(Sequence,6,1)='C',C, If( MID(Sequence,6,1)='A',A, If( MID(Sequence,6,1)='P',P,R)))))))

     , 'hh:mm')*24 as i6_7

FROM

Data.qvd (QVD);

swuehl
MVP
MVP

If I've understood correctly, I think your interval calculations can be simplified if you perform them same place you calculate your sequence:

DATA:

LOAD

  recno() as ID,

  Timestamp(today() + RAND()) as O,

  Timestamp(today() + RAND()) as G,

  Timestamp(today() + RAND()) as L,

  Timestamp(today() + RAND()) as C,

  Timestamp(today() + RAND()) as A,

  Timestamp(today() + RAND()) as P,

  Timestamp(today() + RAND()) as R

AutoGenerate 20;

X:

CrossTable (Field, Time) LOAD * RESIDENT DATA;

LEFT JOIN (DATA)

LOAD

  ID,

  Concat(Field, '',Time) as Sequence,

  (min(Time,7)-min(Time,1))*24 as i1_7,

  (min(Time,2)-min(Time,1))*24 as i1_2,

  (min(Time,3)-min(Time,2))*24 as i2_3,

  (min(Time,4)-min(Time,3))*24 as i3_4,

  (min(Time,5)-min(Time,4))*24 as i4_5,

  (min(Time,6)-min(Time,5))*24 as i5_6,

  (min(Time,7)-min(Time,6))*24 as i6_7  

Resident X

Group BY ID;

drop table X;

LOAD Sequence, mid(Sequence,IterNo(),2) as Permut, 1 as Count

RESIDENT DATA

WHILE iterno() <= 6;

I also added the link table to your permutations (and not shown here, an INLINE table of all permutations, see attached QVW).

Does this allow to perform the analysis you want to do?