Crichter141
Contributor III

Qlik Cloud: Group By: Out of Memory Issues

I've searched the forums and haven't found anything that I think applies, so here goes.

I have a rather large table: 59,000,000 rows. I've narrowed it down as far as I can, but it is still large (35,000,000 rows) and growing. The code works fine on-prem, but in the Cloud the load script below fails with an out-of-memory error.

Any suggestions on how to do a better GROUP BY? I thought I was close with a concatenated field key, but it's still breaking.

 

Data Set:
NOCONCATENATE
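// one output row per distinct combination of the eleven GROUP BY fields below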
LOAD
"ACCOUNT",
"DATE",
"COMPANY",
"NUMBER",
"DOCUMENT NUMBER",
"TYPE CODE",
"GROUP",
"CURRENCY",
"NARRATIVE",
"SOURCE",
"FLAG",
SUM("AMOUNT") AS "AMOUNT 1",
SUM("AMOUNT USD") AS "AMOUNT 2"
RESIDENT TEMP
GROUP BY
"ACCOUNT",
"DATE",
"COMPANY",
"NUMBER",
"DOCUMENT NUMBER",
"TYPE CODE",
"GROUP",
"CURRENCY",
"NARRATIVE",
"SOURCE",
"FLAG"
;
DROP TABLE TEMP;


5 Replies
Anil_Babu_Samineni (Accepted Solution)

@Crichter141 It sounds like you have already loaded the raw data into the "TEMP" table, so can't you apply the Sum() and Group By in the "TEMP" load itself rather than again in "Data Set"?
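
A minimal sketch of this suggestion, assuming TEMP is loaded from a QVD and that FLAG already exists in (or is derived before) that source; the lib:// path and file name are placeholders:

TEMP:
LOAD
    "ACCOUNT", "DATE", "COMPANY", "NUMBER", "DOCUMENT NUMBER",
    "TYPE CODE", "GROUP", "CURRENCY", "NARRATIVE", "SOURCE", "FLAG",
    SUM("AMOUNT") AS "AMOUNT 1",
    SUM("AMOUNT USD") AS "AMOUNT 2"
FROM [lib://DataFiles/Source.qvd] (qvd)   // placeholder path
GROUP BY
    "ACCOUNT", "DATE", "COMPANY", "NUMBER", "DOCUMENT NUMBER",
    "TYPE CODE", "GROUP", "CURRENCY", "NARRATIVE", "SOURCE", "FLAG";

This way only the aggregated result is ever held as a table, instead of a 35M-row resident copy plus the aggregate being built at the same time.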

Best, Anil. When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions"). Please LIKE threads if the provided solution is helpful.
Crichter141
Contributor III
Author

@Anil_Babu_Samineni 

I did try it in the Temp table. I believe the original developer did it this way because he put a rather long IF statement with a Match expression for one of the fields in that Temp table, and he probably didn't want to complicate the GROUP BY with it. It was something like this (I had to cut out the names - proprietary info) but you get the idea:

// FLAG = 1 when SOURCE is one of the listed vendors, the type code is not
// in the exclusion list, and the account is not the masked one; else 0
If(
    Match(Trim(SOURCE),
        'VENDOR1',  'VENDOR2',  'VENDOR3',  'VENDOR4',  'VENDOR5',
        'VENDOR6',  'VENDOR7',  'VENDOR8',  'VENDOR9',  'VENDOR10',
        'VENDOR11', 'VENDOR12', 'VENDOR13', 'VENDOR14', 'VENDOR15',
        'VENDOR16', 'VENDOR17', 'VENDOR18', 'VENDOR19', 'VENDOR20',
        'VENDOR21', 'VENDOR22', 'VENDOR23', 'VENDOR24', 'VENDOR25') <> 0
    AND Match(Trim([TYPE CODE]),
        'TYPE CODE 1', 'TYPE CODE 2', 'TYPE CODE 3',
        'TYPE CODE 4', 'TYPE CODE 5') = 0
    AND ACCOUNT <> '####',
    1,
    0
) AS [FLAG]

Anil_Babu_Samineni

@Crichter141 Will you be able to do all the transformation in the backend (i.e., in your DB) and load only the final table or view? With data this huge, doing the aggregation in the backend is recommended.
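
A rough sketch of pushing the aggregation to the database via a SQL passthrough; the connection name 'MyDatabase' and the table SOURCE_TABLE are placeholders, and the column names assume DB-side equivalents of the Qlik fields (quoting of reserved words like "DATE" and "GROUP" depends on the database):

LIB CONNECT TO 'MyDatabase';  // placeholder connection name

[Data Set]:
SQL SELECT
    ACCOUNT, "DATE", COMPANY, "NUMBER", DOCUMENT_NUMBER, TYPE_CODE,
    "GROUP", CURRENCY, NARRATIVE, SOURCE, FLAG,
    SUM(AMOUNT) AS AMOUNT_1,
    SUM(AMOUNT_USD) AS AMOUNT_2
FROM SOURCE_TABLE
GROUP BY
    ACCOUNT, "DATE", COMPANY, "NUMBER", DOCUMENT_NUMBER, TYPE_CODE,
    "GROUP", CURRENCY, NARRATIVE, SOURCE, FLAG;

The database then does the heavy grouping and only the aggregated rows travel to Qlik Cloud.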

Best, Anil. When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions"). Please LIKE threads if the provided solution is helpful.
marcus_sommer

Probably the simplest approach to reduce the RAM consumption would be to run the GROUP BY load not from a resident table but from a QVD - meaning: store TEMP as a QVD, drop the table, and then do the grouped load from the QVD file.
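
A minimal sketch of that approach, assuming a DataFiles space is available (the lib:// path is a placeholder):

STORE TEMP INTO [lib://DataFiles/TEMP.qvd] (qvd);
DROP TABLE TEMP;   // free the 35M-row resident copy before aggregating

[Data Set]:
LOAD
    "ACCOUNT", "DATE", "COMPANY", "NUMBER", "DOCUMENT NUMBER",
    "TYPE CODE", "GROUP", "CURRENCY", "NARRATIVE", "SOURCE", "FLAG",
    SUM("AMOUNT") AS "AMOUNT 1",
    SUM("AMOUNT USD") AS "AMOUNT 2"
FROM [lib://DataFiles/TEMP.qvd] (qvd)
GROUP BY
    "ACCOUNT", "DATE", "COMPANY", "NUMBER", "DOCUMENT NUMBER",
    "TYPE CODE", "GROUP", "CURRENCY", "NARRATIVE", "SOURCE", "FLAG";

The GROUP BY makes this an unoptimized QVD load, but RAM now only has to hold the aggregate being built, not the resident source table as well.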

Further, you may try to reduce the GROUP BY to the essential fields - which might be just ACCOUNT, or ACCOUNT & '|' & DATE as a key - and then add the other information later via mappings, or keep the aggregation table as a dimension table, or ...
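
A sketch of that reduction, where the choice of ACCOUNT and DATE as the key is just an example:

AGG:
LOAD
    "ACCOUNT" & '|' & "DATE" AS "KEY",
    SUM("AMOUNT") AS "AMOUNT 1",
    SUM("AMOUNT USD") AS "AMOUNT 2"
RESIDENT TEMP
GROUP BY "ACCOUNT", "DATE";

// the remaining attributes could then be mapped back onto the key
// (while TEMP is still loaded), e.g. with ApplyMap('COMPANY_MAP', "KEY"):
COMPANY_MAP:
MAPPING LOAD
    "ACCOUNT" & '|' & "DATE" AS "KEY",
    "COMPANY"
RESIDENT TEMP;

Grouping on two fields instead of eleven should substantially shrink the working set the engine needs for the aggregation.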

Beside this, you may skip the aggregation completely and keep all records within the data model - that doesn't necessarily have a significant impact on the final RAM consumption or the UI performance.

Crichter141
Contributor III
Author

@Anil_Babu_Samineni Actually, your first post got me thinking. I was able to do what you suggested with the GROUP BY in the Temp table. Thank you so much!