Loop For ... NEXT to compare values not returning ... - Qlik Community

Delphines · ‎2023-10-13

Hi everyone,

I have 2 fields I wish to compare to attribute to FIELD1 a certaine value from FIELD2. I used a For ... Next to do this comparison, but I dont have the expected results, and I don't understand why. Here is my code:

liste_range:
	LOAD RANGE2 Resident temp Order By RANGE2;
	DROP Table temp;
	
NoConcatenate // comparison table
ref_range:
	LOAD REF_RANGE Resident temp2 Order by REF_RANGE;
	DROP Table temp2;
	
FOR j = 0 TO NoOfRows('liste_range') - 1
LET CurrentValue = Peek('RANGE2', j, 'liste_range'); 
LET MatchingValue = CurrentValue;
	FOR k = 0 TO NoOfRows('ref_range') - 1
	LET ReferenceValue = Peek('REF_RANGE', k, 'REF_range');  			 
        IF ReferenceValue <> CurrentValue and WildMatch(ReferenceValue, '*' & CurrentValue & '*') THEN 
	       LET MatchingValue = ReferenceValue; 
	       EXIT FOR;
	    ENDIF
	 NEXT k
	 MappingTable:
	 LOAD
	        '$(CurrentValue)' as RANGE2,
	        '$(MatchingValue)' as RESULT
	 AUTOGENERATE 1;
NEXT j

here are some data examples:

RANGE2

AOXI

AOXITIVE

AVAOXITIVE

OXITIVE

REF_RANGE

AOXITIVE

Output:

RANGE2	RESULT	EXPECTED
AOXI	AOXI	AOXITIVE (AOXI part of AOXITIVE)
AOXITIVE	AOXITIVE	AOXITIVE
AVAOXITIVE	AVAOXITIVE	AVAOXITIVE (not part of AOXITIVE)
OXITIVE	OXITIVE	AOXITIVE (OXITIVE part of AOXITIVE)

thanks for your suggestions!

marcus_sommer · ‎2023-10-13

Many of your variable-calls are applied without a $-sign expansion and may not always be treated as excepted especially the ones within the wildmatch() which might be also replaced by a LIKE comparing. Further I'm not sure if the AND linking of the conditions is the right one. Also that the second loop will be ended after the first match. Another question goes to the mapping-table creation in the loop which is often not working with an auto-concatenation approach. A look on the detailed results of the document-log should answer most these questions.

Beside this if your final data-sets are a bit larger like a few millions records against a few hundred thousands records such loop-approach would be very slow and may take many hours.

I'm not sure if I had comprehend your data and aim correct but I would tend to another approach and loading RANGE2 as (sorted) mapping table and using mapsubstring() against REF_RANGE and inserting any counter or index or maybe just <<<TRUE>>> which is afterwards extracted or counted/evaluated. It's a bit reverse to your approach but may be working, too.

StarinieriG · ‎2023-10-13

Hello,

there's an error in a variable

LET ReferenceValue = Peek('REF_RANGE', k, 'REF_range');

It's 'ref_range', not 'REF_range'

I tried with this correction and it seems to work

Delphines · ‎2023-10-13

hi Starinieri,

my mistake in copying the code, it is ref_range indeed.

Delphines · ‎2023-10-13

hi Marcus,

not sure I get it all 🙂 I'm self-learning according to what I need to do, so I might not know everything that one would suppose by me making a loop.

@marcus_sommer wrote:

Many of your variable-calls are applied without a $-sign expansion and may not always be treated as excepted

how od you suggest I add the expansion?

@marcus_sommer wrote:

the wildmatch() which might be also replaced by a LIKE comparing.

so If CurrentValue like '*ReferenceValue*' Then let MatchingValue=ReferenceValue ?

@marcus_sommer wrote:

I'm not sure if the AND linking of the conditions is the right one.

I don't want to compare if they are identical, but maybe I'm wrong?

@marcus_sommer wrote:

Also that the second loop will be ended after the first match.

Correct. I don't know how to manage if have more than 1 match, meaning, which match might be the right one? I need to end with 1 unique associated value. if you have an idea on how to manage several matches, don't hesitate to share!

@marcus_sommer wrote:

Another question goes to the mapping-table creation in the loop which is often not working with an auto-concatenation approach.

so actually this loop is part of a 3rd one: I have another field, let say FIELD1. for each value of FIELD1, there are several RANGE2 values linked. in order to alleviate the operation (and have also better accuracy), I do the loop I indicated per each value of FIELD1. the code begins with:

ORIGINE:
LOAD FIELD1, WORKING_FIELDS FROM DATA_SOURCE;

FINAL:
LOAD null() AS RESULT
AutoGenerate(0);

for i=1 to fieldvaluecount('FIELD1')
TRACE la valeur de i est $(i);
let a = FieldValue('FIELD1',$(i));

MY_WORKING_TABLE:
LOAD * RESIDENT ORIGIN WHERE FIELD1 = '$(a)';

then I do some work to create different temp tables, then after the above code, I have:

concatenate(FINAL):
LOAD * RESIDENT MY_WORKING_TABLE;
LEFT JOIN 
LOAD RANGE2, RESULT RESIDENT MappingTable;
drop Table MY_WORKING_TABLE;
NEXT i

I stored each step to see where it could go wrong, and the mapping table seems to increment OK.

@marcus_sommer wrote:

Beside this if your final data-sets are a bit larger like a few millions records against a few hundred thousands records such loop-approach would be very slow and may take many hours.

it's more a few thousands, and I'm doing my test on ~25k records. for these 25k is not really long, do you think it will be hours for the few thousands?

@marcus_sommer wrote:

I'm not sure if I had comprehend your data and aim correct but I would tend to another approach and loading RANGE2 as (sorted) mapping table and using mapsubstring() against REF_RANGE and inserting any counter or index or maybe just <<<TRUE>>> which is afterwards extracted or counted/evaluated. It's a bit reverse to your approach but may be working, too.

there is no link currently between RANGE2 and REF_RANGE, that's what I trying to create. as far as I know, there has to be an existing link between the 2 fields to apply a mapsubstring, has it not? and the mapsubstring will look at exact match, won't it? not if one is included in the other?

in the meantime, I put my 2 fields RANGE2 and REF_RANGE into 1 table and run the loop on this table, and it works BUT only when i = 1.

Does it make more sense to you?

thanks again for your help!

Delphines · ‎2023-10-13

ok so I think this bit:

WHERE FIELD1 = '$(a)';

is not working properly, but I don't understand why.

I have this:

MY_WORKING_TABLE:
LOAD * RESIDENT ORIGIN WHERE FIELD1 = '$(a)';

which is working but then I have some temporary tables, and one of these temporary tables is looking like this:

temp2:
LOAD FIELD1, REF_RANGE FROM 2_DATA_SOURCE WHERE FIELD1 = '$(a)';

and here that doesn't work. when i = 1, it's taking only FIELD1 = value1, but when i = 2 it's taking FIELD1 = value 1 and value 2, and so on. what's wrong here?

thanks!

marcus_sommer · ‎2023-10-16

Personally I suggest to make the code as clear and explicit as possible and this means to apply always a $-sign expansion to show that a variable is called. In this way they will be highlighted which simplifies the reading and couldn't be mistaken with fields/tables or routines. Further you will need to ensure in beforehand if the content are numbers or strings which minimizes later any troubleshooting.

Instead of the function wildmatch() a comparing might be done with:

'$(var)' LIKE '*my Value*'

The hint with AND is a more generic one because it's sometimes not always easy to define the wanted logically dependencies between data with AND and/or OR respectively wit NOT the reverse check.

I did understand your approach as check from all values against all values and not as a far more simple single TRUE/FALSE match and therefore an ending after the first match wouldn't be sufficient.

Mapsubstring() replaced each found substring with the specified mapping-value. In opposite to the applymap() which only takes the first match could be multiple substrings replaced. If no direct replacement is intended like in scenarios in which certain language specific special chars are replaced else matches should be identified and/or there are overlapping values/matches you could insert additionally chars and/or counter/index values like hinted with the above:

<<<TRUE>>>

which enables you to count afterwards the number of <<< with something like substringcount() and/or textbetween() is taken to extract the values or a replace() to remove them again. It depends on the detailed requirement which approaches are most suitable - but nearly everything will be possible.

Mapping is a very fast solution and even mapsubstring() and multiple afterwards check/replace/cleaning/filter logic in n resident loads are usually very performant. Nested outside-loops on tables and grabbing there table-values per peek() and doing n other things with it must be slow compared against a classical load because the overhead will increase significantly - jumping from iteration to iteration and assigning each time n variables and initializing each time a load-statement - each one is an extra measurement which is communicated between Qlik and the OS and will have a certain delaying of n milliseconds. By smaller data-sets is often not noticeable but millions of n milliseconds may end in several hours.

In regard to the described possibilities of the mapsubstring() I suggest you creates a small dummy data-set with a few inline-records to comprehend the logic and to play with various methods to insert matching/counter values and how they could be extracted again.

Delphines · ‎2023-10-16

hi Marcus,

thanks for your answer. I used ApplyMap and MapSubstring on another project: my understanding, again, was that I need to know beforehand what I would like to change, which is not the case here. The whole purpose of my loop is precisely to reduce the number of different values to create a basis to then develop a mapping table (so I have less entries to manually reclassify basically). I'm not trying to replace some characters in my original field but create a new field that would be a sort of synthesis/reduction of the 1st one. so to go from here:

RANGE2

AOXI

AOXITIVE

AVAOXITIVE

OXITIVE

to here:

RESULT

AOXITIVE

AVAOXITIVE

AOXITIVE

and then instead of having to classify 4 elements, I just have 2. Knowing that if I do it a 2nd time, in my example above AOXITIVE being included into AVAOXITIVE, actually I will have just 1 entry to classify instead of 4.

my plan is then, once I have my reduced list and my manual reclassification, to indeed apply a Map to replace the original content.

Kind regards,

Delphine

cwolf · ‎2023-10-16

Greatest problem: You think you're looping through sorted tables.

A load ... resident ... order by ... only applies to the load! There is no re-sorting in the memory! If you want to achieve this, the fields must be given a new name.

You don't need to loop through tables to iterate through field values. A loop over the field values is sufficient.

Variables with text as content should always be accessed with '$(...)'.

Try following:

liste_range:
LOAD distinct RANGE2 as RANGE2_sorted Resident temp Order By RANGE2;

DROP Table temp;

ref_range:
LOAD distinct REF_RANGE as REF_RANGE_sorted Resident temp2 Order by REF_RANGE;

DROP Table temp2;
	
for each CurrentValue in FieldValueList('RANGE2_sorted')
	set MatchingValue = '$(CurrentValue)';
	for each ReferenceValue in FieldValueList('REF_RANGE_sorted')
        if '$(ReferenceValue)' <> '$(CurrentValue)' and WildMatch('$(ReferenceValue)', '*$(CurrentValue)*') then 
	       set MatchingValue = '$(ReferenceValue)'; 
	       exit for;
	    end if
	next
	MappingTable:
	LOAD
	'$(CurrentValue)' as RANGE2,
	'$(MatchingValue)' as RESULT
	AUTOGENERATE 1;
next

marcus_sommer · ‎2023-10-17

My suggestion is aimed to create a table-result - maybe already within the target-table or in a separate table which might then be used for an association and/or any mapping/join/filter approaches - nothing is excluded else everything is possible.

An alternatively to the above might be also just to join both fields as different fields to get a Cartesian result and then applying a wildmatch() of both against each other.

Further depending on the data you may not only get the wanted matches else wrong ones by too short string-parts and/or unwanted overlapping of different values. This means you may further need more or less manually efforts to control the results and adjusting them and/or filling the missing ones. If this is the case and the resulting matching-rate rather low it might be more practicably to make the entire job manually - even if this sounds stupid and would be surely an ugly task. On the other side you may consider to increase the technically stuff by replacing the above mentioned methods and/or adding an approach like: LevenshteinDist - script and chart function | Qlik Cloud Help to it.

Loop For ... NEXT to compare values not returning expected results

Application Development

Scripting