Hi guys,
I followed a technique described in this Qlik Community thread, where @johnw explains how to create a generic calendar:
https://community.qlik.com/t5/QlikView-Scripting/Link-to-generic-calendar/td-p/230210
The key is to create a "linkage" table like this:
MasterLinkage:
LOAD
event_code,
'Event' as DataType,
event_start_date as Date_A,
event_link as link_id
RESIDENT event;
CONCATENATE (MasterLinkage)
LOAD
link_data_internal_id,
'Traffic' as DataType,
link_date as Date_A,
traffic_link as link_id
RESIDENT traffic;
This way I can use Date_A to filter on event_start_date and link_date at the same time, which is awesome.
However, notice that I had to create that link_data_internal_id unique identifier, which I would otherwise not have. Since I started using it, my QVDs and my app have grown enormously. I'm assuming it's because of the field's cardinality: I have millions of traffic records, so storing a unique identifier for every one of them does not seem like a good idea.
What can I do? How could I keep the "linkage" to the generic calendar without using a unique id for the traffic records?
Thanks much!
Juan
I've replaced the unique id "link_data_internal_id" with three fields:
Date(Floor("start_date")) AS date_key,
Time(Round(Frac(Time("start_date",'hh:mm')),1/96),'hh:mm') as time_key,
item_id as link_key,
These three fields together identify each record uniquely (1/96 of a day is 15 minutes, so time_key buckets each record into a quarter-hour slot). And since I avoid the cardinality issue, my QVDs are now 80% smaller than they used to be!
However, the three fields are creating a synthetic key that is slowing down the load script... I'm going to test whether the new load times are acceptable.
Even though everything works as desired, I'm a bit concerned about the synthetic key because it's definitely affecting the reload duration.
The tables involved are:
[event]:
LOAD
[event_id],
[code] as event_code,
[description] as event_description,
[event_type_id],
[event_status_id],
[event_impact_id],
[event_location],
[city_id] as event_city_id,
[link_id] as event_link,
[facility_way_id] as event_roadway_id,
[start_timestamp] as event_start_timestamp,
Date(Floor(start_timestamp)) AS event_start_date,
Floor(Hour(start_timestamp)) As event_start_hour,
[close_timestamp] as event_close_timestamp,
event_duration;
[traffic]:
LOAD
Date(Floor("start_date")) AS date_key,
Time(Round(Frac(Time("start_date",'hh:mm')),1/96),'hh:mm') as time_key,
item_id as link_key,
Date(Floor("start_date")) AS link_date,
Time(Round(Frac(Time("start_date",'hh:mm')),1/96),'hh:mm') as "link_time_15min",
Floor(Hour(Time(Round(Frac(Time("start_date",'hh:mm')),1/96),'hh:mm'))) as link_hour,
"item_id" as traffic_link,
Round("speed_mph") as "link_speed_mph";
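As an aside, I noticed the same rounding expressions are computed twice under different names. A preceding load could compute each value once and then alias it; roughly like this (just a sketch of the same logic, not yet tested against my data):

[traffic]:
LOAD
    date_key,
    time_key,
    link_key,
    date_key as link_date,
    time_key as "link_time_15min",
    Hour(time_key) as link_hour,
    link_key as traffic_link,
    Round("speed_mph") as "link_speed_mph";
LOAD
    Date(Floor("start_date")) as date_key,
    Time(Round(Frac("start_date"), 1/96), 'hh:mm') as time_key,
    "item_id" as link_key,
    "speed_mph"
// same FROM / RESIDENT source as in the original load
;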
The idea is that when using this calendar, the records from both [event] and [traffic] get filtered:
Calendar_A:
Load
TempDate AS Date_A,
week(TempDate) As Week_A,
Year(TempDate) As Year_A,
Month(TempDate) As Month_A,
Year(TempDate)&'-'& num(Month(TempDate),'00') as YearMonth_A,
Day(TempDate) As Day_A,
Week(weekstart(TempDate)) & '-' & WeekYear(TempDate) as WeekYear_A,
WeekDay(TempDate) as WeekDay_A
Resident TempCalendar
Order By TempDate ASC;
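(TempCalendar itself is generated the usual way; the date range below is just a placeholder for my real min/max dates:)

LET vMinDate = Num(MakeDate(2020, 1, 1));   // placeholder range
LET vMaxDate = Num(Today());

TempCalendar:
LOAD
    Date($(vMinDate) + IterNo() - 1) as TempDate
AUTOGENERATE 1
WHILE $(vMinDate) + IterNo() - 1 <= $(vMaxDate);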
To link both tables to the calendar, I've created the tables below:
MasterLinkage:
LOAD
event_code,
'Event' as DataType,
event_start_date as Date_A,
event_link as link_id
RESIDENT event
;
CONCATENATE (MasterLinkage)
LOAD
date_key,
time_key,
link_key,
'Traffic' as DataType,
link_date as Date_A,
traffic_link as link_id
RESIDENT traffic;
It works, but as I said, the synthetic key across date_key, time_key and link_key makes the reload slow. What would you say is the best approach to this kind of challenge?
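For what it's worth, one option I'm considering is collapsing the three fields into a single integer with AutoNumber, so the link is a single low-footprint field and no synthetic key is created. A sketch (traffic_record_key is just a name I made up, and note AutoNumber values are only stable within one reload, so this belongs in the final app script rather than in the QVD generator):

// in the traffic load, replace the three key fields with one:
AutoNumber(
    Date(Floor("start_date")) & '|' &
    Time(Round(Frac("start_date"), 1/96), 'hh:mm') & '|' &
    "item_id",
    'traffic_key') as traffic_record_key,

// and link on that single field:
CONCATENATE (MasterLinkage)
LOAD
    traffic_record_key,
    'Traffic' as DataType,
    link_date as Date_A,
    traffic_link as link_id
RESIDENT traffic;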
Thanks