RyeGuy
Contributor

Can Compose be a viable solution for a large Enterprise DW?

Hello,

I am a week into developing a POC for using Compose. The goal is to demonstrate that Compose could replace our existing SSIS and stored procedure methodology and simplify things. A few concerns have come up, and I am hoping for some guidance.

The extent of what we would eventually want to replace is the ingestion of several hundred tables from Qlik Replicate and then processing to build the Enterprise DW. This includes tables from multiple source systems. 

Concerns

1. Model development - Because my warehouse includes multiple source systems, common tables like Address, Customer, and State may exist in all of the sources. Given the number of tables I am dealing with, it is very difficult to look at the Manage Model UI and understand where a table originated. I realize that lineage is available, but only if you select one table at a time and use the drop-down. (Unfortunately, the lineage view only displays 23 characters.) It seems like the UI could allow sorting/filtering by Source DB and Schema. Is my only option to rename all Entities like Address_SourceA, Address_SourceB?

2. Warehouse Tasks - It appears that I would have to create a separate set of Tasks for CDC vs. non-CDC, which means 2x the maintenance. With hundreds of mappings plus the possibility of Pre/Post and Table ETLs, I just don't see how this can be managed efficiently. Even if you use the "Enable Only" button, you still see all of the Logical Entities. There does not seem to be a concise way to see that Task X includes mappings y and z and the Pre/Post loading ETL. The Task Statements are too granular to give an overview. There is also no 'version history' for a Task, so if you or someone else makes a change, you have no idea what happened. I get that there is Version Control at a higher level, but there is no visibility into changes.

3. Data Marts - The lineage is great, but it requires editing the Dimensional attribute and selecting "Show Lineage". There is no way to quickly look at a DM object and understand it.

4. Work Flow - I don't see how this could manage the complexities of loading a DW with dependencies and precedence, especially if there are many Warehouse Tasks, as I believe we would need. There does not seem to be a way to disable a given table's flow or to restart/reposition if something went wrong in the flow.

Any insights/experience is greatly appreciated.

Thank you

Accepted Solutions
Brian_Jones
Employee

First, a question: are you also exploring Qlik Cloud Data Integration (QCDI) alongside Qlik Compose in your integration redesign POC? QCDI is a very modern, SaaS-driven approach to data integration pipelines that can be more flexible than Compose. If you haven’t considered it, I suggest you contact your Qlik representatives and ask them about it.

Next, a general comment: a warehouse automation solution like Compose requires a model-driven approach to take full advantage of the time-saving code generation in the product. The automation flows from the model, which is a bit different from an ETL and stored procedure approach.

  1. Model development - Because my warehouse includes multiple source systems, common tables like Address, Customer, and State may exist in all of the sources. Given the number of tables I am dealing with, it is very difficult to look at the Manage Model UI and understand where a table originated. I realize that lineage is available, but only if you select one table at a time and use the drop-down. (Unfortunately, the lineage view only displays 23 characters.) It seems like the UI could allow sorting/filtering by Source DB and Schema. Is my only option to rename all Entities like Address_SourceA, Address_SourceB?

The following Compose whitepaper describes the options for modeling data from multiple sources. When multiple sources overlap heavily, labeling the landed source entities is the best approach.

https://community.qlik.com/t5/Official-Support-Articles/Qlik-Compose-Data-Warehouse-Modeling-Best-Pr...
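As a rough illustration of the labeling idea, the sketch below works out which table names collide across sources and proposes suffixed entity names in the Address_SourceA style from the question. It is a standalone planning helper, not a Compose feature or API; the function name, source names, and table lists are all hypothetical.

```python
from collections import defaultdict


def label_overlapping_entities(tables_by_source):
    """Return {(source, table): entity_name}, suffixing names shared by several sources."""
    # Invert the inventory: for each table name, which sources contain it?
    sources_by_table = defaultdict(list)
    for source, tables in tables_by_source.items():
        for table in tables:
            sources_by_table[table].append(source)

    entity_names = {}
    for table, sources in sources_by_table.items():
        for source in sources:
            # Only names that collide across sources get a source suffix.
            if len(sources) > 1:
                entity_names[(source, table)] = f"{table}_{source}"
            else:
                entity_names[(source, table)] = table
    return entity_names


# Hypothetical inventory of landed tables per source system.
example = {
    "SourceA": ["Address", "Customer", "Invoice"],
    "SourceB": ["Address", "Customer", "Shipment"],
}
names = label_overlapping_entities(example)
```

Tables that exist in only one source keep their plain name, so the suffix itself signals that an entity has counterparts elsewhere.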

Other Compose whitepapers are labeled in Community with a ComposeWP tag. I suggest using the Classic Search under Support | Knowledge | Member Articles as in the link below to locate them.

https://community.qlik.com/t5/forums/searchpage?q=ComposeWP&noSynonym=false&collapse_discussion=true

 

  2. Warehouse Tasks - It appears that I would have to create a separate set of Tasks for CDC vs. non-CDC, which means 2x the maintenance. With hundreds of mappings plus the possibility of Pre/Post and Table ETLs, I just don't see how this can be managed efficiently. Even if you use the "Enable Only" button, you still see all of the Logical Entities. There does not seem to be a concise way to see that Task X includes mappings y and z and the Pre/Post loading ETL. The Task Statements are too granular to give an overview. There is also no 'version history' for a Task, so if you or someone else makes a change, you have no idea what happened. I get that there is Version Control at a higher level, but there is no visibility into changes.

Fair point about needing a separate full load and CDC task group in a Compose solution, but I’d also point out that the mappings are reading from different source tables (base tables vs. change tables) and generating different SQL. Compose provides a lot of automation in return for an extra configuration and management step. QCDI does eliminate the need for separate full load and CDC task groups.

Another area where QCDI offers improved manageability is the searching and filtering of objects. Manageability also benefits from QCDI deferring modeling to the transformation task stage, which allows multiple subject area models to exist in a project, rather than the single canonical model for all objects in a Compose project.

  3. Data Marts - The lineage is great, but it requires editing the Dimensional attribute and selecting "Show Lineage". There is no way to quickly look at a DM object and understand it.

I believe one has to generate and explore the Compose documentation set to view lineage at the data mart level.

QCDI also produces data marts, and lineage is available in the GUI at the mart level.

  4. Work Flow - I don't see how this could manage the complexities of loading a DW with dependencies and precedence, especially if there are many Warehouse Tasks, as I believe we would need. There does not seem to be a way to disable a given table's flow or to restart/reposition if something went wrong in the flow.

More complex task streams are frequently implemented using third-party scheduling tools. Each object you see in a Qlik Compose workflow can be executed as a command-line task, so such scheduling tools can invoke it directly.
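To make the external-scheduler idea concrete, here is a minimal sketch of a runner that executes command-line tasks in dependency order, stops at the first failure, and can be restarted from that point or told to skip a disabled task. It does not show the actual Compose command-line syntax: the task names and the placeholder commands (small Python one-liners) are assumptions standing in for whatever commands your scheduler would call.

```python
import subprocess
import sys
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(commands, depends_on, already_done=(), disabled=()):
    """Run commands in dependency order; return (completed, failed_task).

    commands:     {task_name: argv list}
    depends_on:   {task_name: set of task names that must finish first}
    already_done: tasks to skip on a restart because they succeeded earlier
    disabled:     tasks to skip entirely (downstream tasks still run)
    """
    graph = {name: set(depends_on.get(name, ())) for name in commands}
    completed = list(already_done)
    for task in TopologicalSorter(graph).static_order():
        if task in already_done or task in disabled:
            continue
        result = subprocess.run(commands[task])
        if result.returncode != 0:
            # Stop at the first failure so the caller can fix it and restart.
            return completed, task
        completed.append(task)
    return completed, None


# Placeholder commands: one that succeeds and one that fails.
ok = [sys.executable, "-c", "pass"]
fail = [sys.executable, "-c", "raise SystemExit(1)"]

# Hypothetical task names; every dependency must also be a key in commands.
commands = {"warehouse_task_a": ok, "warehouse_task_b": fail, "data_mart_task": ok}
depends_on = {
    "warehouse_task_b": {"warehouse_task_a"},
    "data_mart_task": {"warehouse_task_b"},
}

first_run = run_workflow(commands, depends_on)

# After fixing the failing task, restart without repeating finished work.
commands["warehouse_task_b"] = ok
second_run = run_workflow(commands, depends_on, already_done=first_run[0])
```

A dedicated scheduler adds what this sketch leaves out, such as parallel branches, retries, calendars, and alerting, but the restart and disable behavior follows the same pattern.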

3 Replies
RyeGuy
Contributor
Author

Thank you for your reply Brian.

We have not explored Qlik Cloud Data Integration (QCDI). Compose was available under our current licensing for Replicate, so we decided to explore it as an alternative.

Thank you for the link on the modeling best practices. The document does a good job of explaining the modeling concepts for Compose. I wish I had run into it a few days ago.

I believe your responses have answered my question. It does appear that Compose is not an optimal solution for an Enterprise DW. And perhaps we would be better served by looking at Qlik Cloud Data Integration.

Thanks again

Brian_Jones
Employee

You are most welcome. I am glad the responses and whitepaper links helped. While the definition of an optimal solution is subjective, Qlik Compose does have numerous customers managing data warehouses in production with hundreds of model entities and data mappings.

However, for customers looking at a new data warehouse build in 2024, QCDI is the data integration solution "of the future" and the one they should be evaluating first.

Cheers,