Re: Alert when table become unavaibale - Qlik Community

simonB2020 · ‎2023-11-13

We had a case recently when a source table became 'unavailable' to QR. (Full Load Task)
To be clear, through permissions on the source side, QR was unable to see table at all.
The result was that the table simply stopped replicating and went unnoticed for some time.

If one visually compares the "Patterns & Selected Tables" list, the table is listed.
On the "Full Table List" the table is not listed.
That's the only evidence you see.

So is there a way to detect differences between the required tables and the ones actually recognized by QR ?

Thanks

Steve_Nguyen · ‎2023-11-13

@simonB2020 currently i do not see anyway to see the :

If one visually compares the "Patterns & Selected Tables" list, the table is listed.
On the "Full Table List" the table is not listed.

this would be a feature request :

https://community.qlik.com/t5/Ideation/ct-p/qlik-product-insight

Help users find answers! Don't forget to mark a solution that worked for you! If already marked, give it a thumbs up!

Heinvandenheuvel · ‎2023-11-13

Indeed when a table is listed as included in the JSON but not there, or not properly accessibl, then Replicate will silently ignore it. There will be nothing in the log. The 'missing' table will only be listed in METADATA_MANAGE DEBUG logging mode, but only in the middle of a select statement with no indication whether it was returned or not.

Can you not simply use REPCTL GETTABLESSTATUS or the AEM API GetTableStatuses and see if all included tables are indeed present.

You may also want to look at the FL_LAST_TIME epoch to make sure you are not looking at stale info.

Below as silly example for a single remaining 'test' tables, when a second table was no longer there. That second table did show up in the list earlier.

Cheers, Hein.

>>> repctl -d \Replicate\data gettablesstatus Oracle_to_Oracle last_update_time=-1
command gettablesstatus response:
{
        "tables_status":        [{
                        "name": "TEST",
                        "owner":        "ATT_USER",
                        "metadata_id":  1,
                        "state":        "TABLE_COMPLETED",
                        "est_row_count":        8,
                        "fl_total_counter":     8,
                        "fl_total_volume":      3488,
                        "fl_start_time":        1699928349091950,
                        "fl_last_time": 1699928349341967,
                        "fl_total_time":        250017
                }],
        "counters":     {
                "tables_completed_counter":     1,
                "tables_loading_counter":       0,
                "tables_queued_counter":        0,
                "tables_canceled_counter":      0,
                "tables_error_counter": 0
        },
        "update_time":  1699928381087438
}
[gettablesstatus command] Succeeded

simonB2020 · ‎2023-11-14

The problem with 'after the fact' APIs and other checks, is that you need to compare the output with the Task definition JSON in order to identify the missing items.
So you are not building a process to extract that info from the task definition, then another process to get statuses, and comparing them - it's all extra work.
Besides if you have a slowly changing table, how do you know when it is 'stale' ?
Some table may have records updated daily, others only once in a blue moon - there is no definitive definition for 'stale' 😞

Heinvandenheuvel · ‎2023-11-14

>> Besides if you have a slowly changing table, how do you know when it is 'stale' ?

You talked about " (Full Load Task)" are you now referring to CDC? That's an whole other, more difficult, challenge.

>> The problem with 'after the fact'

Given the current design of Replicate, there cannot be in the moment detection. All tables are treated as optional - if they are there - that is they are returned from the presence query then they are included. Carefully study the task log output from a small (5 table) task with metamanager and source_unload logging set to trace and see for yourself how Replicate 'finds' tables with a single complex query.

Now Replicate could switch that out to a single query per included table-match-include-entry. And with that in place there could be a per-entry 'must be there' or a global 'at least one match per include flag' but it could not require a find per table by default as other user count on the soft approach. And what to do with multi-table matches? Make those have a minimum, maximum or exact match count?

Btw... for CDC Replicate will pick up 'missing tables' as they get created.

>> Some table may have records updated daily, others only once in a blue moon

They are either there, or not there, when a full-load task (as per original query is started.

>> So you are not building a process to extract that info from the task definition,

Yup, afraid so. All good, honest work. Personally I would see the JSON as something to be checked, not a driver. I would design me application with a CSV/Excel/DB table with a list of tables with various attributes - name, expected size, required flag, expected change rate. They I would have a daily/weekly job to read the requirement table and make sure the active JSON and Runtime handling is consistent with the design as encapsulated that driver/design table. I might have child table updated with the table plus sample time as FK and fill that with 'last-load-date-time', last row count, last change rate, and whatever else seems useful on a per-check history basis.

>> there is no definitive definition for 'stale'

Correct. All replicate output is instantly stale, some more so than others, most acceptably so, but what is 'acceptable'? To handle that is why they pay you the big bucks!

🙂

Hein.

simonB2020 · ‎2023-11-19

Ok so let's step bac and take a simple FL Task.
Developer adds 10 tables and schedules to run every day.
6 months later someone messes up permissions at the source and QR now only replicates 9 tables.
This is a big deal & someone needs to be alerted to the fact.
(Consumers of the target data are now ignorantly using stale data which could impact business decisions.)

I think what you are saying above is simply "QR does not work like that".
My response as a customer is simply that "I need to it to work like that" .
At the end of that day, I am requesting 10 tables and it can only find 9 - is it too much to ask for it to tell me that ?
"All tables are treated as optional"
Why on earth would anyone include a table in their 'required' list if they did not need it ?

TBH I have to justify spending the money we do on tools like QR - so now imagine that I have to tell by boss that I've got to spend yet more money for a developer to build an app to make sure that the app we are already spending money on is doing what we expect it to do ?

Dana_Baldwin · ‎2023-11-20

Hi @simonB2020

When the table became unavailable to the task, there should have been a warning or error captured in the task log, can you please confirm? If so, you can configure Replicate to notify you by email and/or write to the system log on the server when errors & warnings occur. Does this sound like it would meet your needs?

Notifications settings #Notifications settings | Qlik Replicate Help

Thanks,

Dana

Heinvandenheuvel · ‎2023-11-20

@Dana_Baldwin - "there should have been a warning"

I don't think so - seem my earlier reply. And yeah I tried before testing.

@simonB2020 - " "I need to it to work like that"

Then you will need to submit an improvement request. Write the request on the back of a $100,000 bill to improve you chances of actually getting it.

Seriously in the original (2013?) design, all table names were always wildcarded, adding a '*' to the name before searching. When that caused issues the "Use exact table name" option was added, and you had to add your own star if you wanted wildcard (as it should have been from the get-go imho).

It is not unreasonable to create an improvement request to add a 'must be present' as a global flag or a per-entry flag. I suspect it would need a significant code overhaul.

"6 months later someone messes up permissions at the source and QR now only replicates 9 tables."

Let's face it. That's a customer failure to provide adequate change control. Replicate should try to help where it can, but here that collided with a long standing implementation choice for Replicate. Tough luck!

It's really not all that hard to write a script to verify either from the reptask logs, or from the REPCTL 'api' to make sure it all worked as expected after a run with a ton more functionality than Replicate could ever provide. The Enterprise Manager API might be cleaner, but if that is not in place already it is likely too much of an investment.

An external script can check whether the task ran at all! It can check whether all desired tables where handled. It can verify whether they were handled in a reasonable timeframe. It can check whether the number rows processed was reasonable - between some MIN and MAX - All this through a simple 'driver' table with something like table-name, row min, row max, last-count, current-count, last-load-dt, current-load-dt. And you may well add 'responsible team', phone-number and whatever more to help you when there is trouble in paradise.

Just an independent opinion trying to help!

I used to work for Attunity but do not work for Qlik and don't care who is right, wrong or otherwise.

Cheers,
Hein.

srishti_datta · ‎2023-11-26

If the source table permissions got changed then Qlik Replicate will have no way to connect to the table but it would definitely put an error or warning in the logs. You should review your global error handling to handle such table issues, if suspend table was selected and notifications were setup for suspended tables then you would have got notified of such instance.

Heinvandenheuvel · ‎2023-11-28

@srishti_datta "it would definitely put an error or warning in the logs. "

Did you test that? I did, and there is nothing in the log nor could there be in the current design as I explained.

@simonB2020 wrote "QR was unable to see table at all."

In that case Replicate will NOT know about the table at all as it is not in the erturn dataset for the select owner,tablename query it starts out with.

Hein.

Alert when table become unavaibale

Configuration

Errors - Unexpected Behavior