Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
kfir1987
Contributor II
Contributor II

Recalculation of utilization per line

Hi All,

I am trying  to calculate utilization per machine over the time (the time frame is daily), so I actually I want to see utilization for each machine over the days.


My raw data coming in the below structure:

Machine SNEnd TimeStart TimeActual time

I am facing issue when I try to calculate utilization (the utilization is not a column, I calculated it as Master Item)  , in some machines the start date and the end date is not in the same time ( it means the machine worked over 24 H) ! So as a result I can see that in some cases the utilization value is more than 100% ( it impossible). I`m trying to create a view that show each S\N and the machine utilization over the last 30 days. In cases that machine worked more than 1 day, I want to show the relative percentage on the first day and the last day and in between days to show 100%. 

for example:

the raw data is:

Machine SNEnd TimeStart TimeActual time (min)
111125/02/2018  12:38:3623/02/2018  22:43:49
2266.5

The machine worked for 1.5 days, so my expectation:

on the 23/02/18 - the machine worked for 1:16:11 H (24:00:00-22:43:49) so the utilization for this day is  5% (1:16:11/24:00:00)

on the 24/02/18 - the machine worked for 24 H so the utilization should be 100%

on the 25/02/18 - the machine worked for 12:38:36 H  so the utilization for this day is  53% (12:38:11/24:00:00)

Is it possible to do it ?

Thanks for the help.

6 Replies
Or
MVP
MVP

Possible? Yes. You can loop through any row where the start date and end date are not the same, and split those into multiple rows (two, usually, but could be more if it covers more than two days). You can get a general idea of the loop involved in this thread: How to split a single row value into multiple r... | Qlik Community

Recommended? No. We deal with the same issue across multiple data sources and it was determined that the correct way to handle this is to have the data source split the lines to begin with.

kfir1987
Contributor II
Contributor II
Author

why this approach is not recommended ?

Or
MVP
MVP

In my experience, splitting lines with Qlik loops is terrible for performance. If you decide to go with that approach (and your information volume isn't trivial), make sure you set up an incremental reload so you don't have to handle large data volumes. Best bet is to set up a quick loop and see for yourself whether or not the performance is adequate for your needs.

Two other approaches you might be able to use here, if needed:

1) An arbitrary number of identical loads each handling a different multiple-day case (read the same data each time - the first time, for lines that start and end on the same day. The second, for lines that start and end on different days and take the start day. The third, different start/end date and take the second day. The fourth, take the third day, and so on). Your formula would check either vs the end date (if it matches the day you are currently reading) or midnight, as appropriate. I think this would offer better performance than looping for most cases.

2) Pre-streamline the information in your database using e.g. SSIS or using a cursor.

kfir1987
Contributor II
Contributor II
Author

Do you know how to do it on the editor ?

marcus_sommer

It might depend on the way the loop is performed if you get a rather slow performance or not. Quite slow are approaches which use an outside-loop like:

for each ...

     load ...

next

but an inside-loop like:

load subfield(Field, Delimiter) as Field1 from Source;

     or

load date(Date + iterno() - 1) as Date from Source while Date + iterno() - 1 <= EndDate;

is usually very fast even by millions of records whereby an incremental approach might be further helpful to minimize the run-times.

- Marcus

Or
MVP
MVP

Absolutely correct, of course. In this case, though, I was not able to figure out a way to use inside loops and still get the desired outcome. That said, I last attempted this several years ago and even then it's highly possible that I missed a working solution at the time. Unfortunately, this means I can't help the OP with code - my solution was to address the source data (we now record machine work time once per hour, regardless of whether the machine is working or not, which means I just have to sum up the hours as desired; this also means we aren't limited to full-day calculations, which the loop solution here would likely be).