Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have a sample data set attached in the spreadsheet. The two sheets have "Input" and "Expected Output" data.
The Input data has 48 records each (one for each 'ts') for 3 different CG_Name's (BEL, BGS and TW).
Output:
For each 'ts' value, I want to create a new CG_Name = 'NAT' and calculate throughput_in, volume_in, SIOs etc., as the difference of value between the 3 CG_Name's as below.
BGS throughput_in - BEL throughput_in - TW throughput_in.
(- means subtraction).
I would really appreciate if you can help me with the approach / write the code for this.
Please note that the original data source is a database and this calculation needs to be done in Qlikview backend.
Thanks in advance,
Shashank
How about:
Data:
LOAD CG_Name,
est_date,
week_start,
ts,
throughput_in,
throughput_out,
volume_in,
volume_out,
SIOs
FROM
SampleData.xlsx
(ooxml, embedded labels, table is Input);
NATData:
LOAD
'NAT' as CG_Name,
ts,
sum(if(CG_Name='BGS',throughput_in, -throughput_in)) as throughput_in,
sum(if(CG_Name='BGS',throughput_out, -throughput_out)) as throughput_out,
sum(if(CG_Name='BGS',volume_in, -volume_in)) as volume_in,
sum(if(CG_Name='BGS',volume_out, -volume_out)) as volume_out,
sum(if(CG_Name='BGS',SIOs, -throughput_out)) as SIOs
Resident Data
Group By ts
;
// Grab the other Dimension values if desired
JOIN (NATData)
LOAD
ts,
est_date,
week_start
Resident Data;
DROP TABLE Data;
-Rob
Dear,
Kindly find the attached Application.
Maybe some experts will help you in a better method than me.
I would denormalize the database:
Data:
LOAD
ts, est_date, throughput_in AS BEL_throughput_in, .... FROM .... WHERE CG_Name = 'BEL';
JOIN (Data) LOAD
ts, est_date, throughput_in AS BGS_throughput_in, .... FROM .... WHERE CG_Name = 'BGS';
....
Doing so, you would have all values in one line in the database, where it will be easy to calculate with.
Alternatively you may load the data sorted by ts and CG_Name and then do the calculation on the last record (e.g. TW):
LOAD
*,
IF (CG_Name = 'TW', PREVIOUS(throughput_in) - PREVIOUS(PREVIOUS(throughput_in)) - throughput_in) AS NAT_throughput,
.....;
LOAD * FROM ..... ORDER BY ts, CG_Name
edit: apostrophs removed (PREVIOUS does not require. PEEK might be used, but then aprostrophs are needed)
Hi Peter,
Thank you for your response. The first approach is simple and easy.
However, I have to do a similar calculation for multiple cases and I might end up writing a very long piece of code to get the desired result.
If you can explain the second method in detail, that would be very helpful.
Thanks,
Shashank
How about:
Data:
LOAD CG_Name,
est_date,
week_start,
ts,
throughput_in,
throughput_out,
volume_in,
volume_out,
SIOs
FROM
SampleData.xlsx
(ooxml, embedded labels, table is Input);
NATData:
LOAD
'NAT' as CG_Name,
ts,
sum(if(CG_Name='BGS',throughput_in, -throughput_in)) as throughput_in,
sum(if(CG_Name='BGS',throughput_out, -throughput_out)) as throughput_out,
sum(if(CG_Name='BGS',volume_in, -volume_in)) as volume_in,
sum(if(CG_Name='BGS',volume_out, -volume_out)) as volume_out,
sum(if(CG_Name='BGS',SIOs, -throughput_out)) as SIOs
Resident Data
Group By ts
;
// Grab the other Dimension values if desired
JOIN (NATData)
LOAD
ts,
est_date,
week_start
Resident Data;
DROP TABLE Data;
-Rob
Thank you everyone.
All the answers are correct, but I have marked Rob's answer as the Correct Answer because of ease of writing the code and also having to avoid Joins.
As a refinement, you could do this in a single step I did it in stages because I was validating as I went.
NATData:
LOAD
'NAT' as CG_Name,
ts,
sum(if(CG_Name='BGS',throughput_in, -throughput_in)) as throughput_in,
sum(if(CG_Name='BGS',throughput_out, -throughput_out)) as throughput_out,
sum(if(CG_Name='BGS',volume_in, -volume_in)) as volume_in,
sum(if(CG_Name='BGS',volume_out, -volume_out)) as volume_out,
sum(if(CG_Name='BGS',SIOs, -throughput_out)) as SIOs
FROM
SampleData.xlsx
(ooxml, embedded labels, table is Input)
Group By ts, est_date, week_start
;
-Rob