slewis
Contributor III

Global Error Alert - Failed to connect & Connect timeout occurred

Several days ago Qlik Replicate (2022.11.0.394) began sending "Global Error Alert" emails for SQL Server source endpoints (Failed to connect) and Oracle source endpoints (Connect timeout occurred). The affected tasks include both CDC tasks that are already running and Full-Load only tasks that start on a schedule. The Full-Load only tasks DO start and DO replicate, and the CDC tasks DO continue to apply changes. However, the timing of the email alerts seems to coincide with the Full-Load schedules at the top and bottom of the hour.

Looking in both the Replicate Server Logs and the individual Task Logs, the errors reported in the emails are nowhere to be found. I've raised the logging levels on both the Server and individual Task logs to verbose for the following sections and still see nothing indicating a problem: COMMON, COMMUNICATION, SOURCE_CAPTURE, SOURCE_UNLOAD, and UTILITIES. I'm listing all of the sections together without distinguishing which are available only in Task logging or only in Server logging.

The Replicate server is installed on an AWS EC2 instance. The SQL Server and Oracle database servers are installed in a different AWS account, with networking established to allow traffic between the two accounts. To restate, the error emails say there is a connection problem, yet the tasks are running successfully.
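
One way to check the cross-account network path independently of Replicate is to time a plain TCP connection from the EC2 instance to each source listener around the top and bottom of the hour. The following is only a minimal sketch: the endpoint labels, host names, and ports passed to `probe_all` are hypothetical placeholders, and a successful TCP connect does not prove that a database login would succeed.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Attempt one TCP connection; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:  # covers timeouts, refusals, and DNS failures
        return False, time.monotonic() - start, str(exc)


def probe_all(endpoints, timeout=5.0):
    """Probe each (label, host, port) tuple and return printable result lines."""
    lines = []
    for label, host, port in endpoints:
        ok, elapsed, err = probe(host, port, timeout)
        status = "OK" if ok else "FAIL " + err
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        lines.append(f"{stamp} {label} {host}:{port} {elapsed:.2f}s {status}")
    return lines


# Example call (hypothetical hosts and ports, replace with the real listeners):
# for line in probe_all([("oracle", "oracle.example.internal", 1521),
#                        ("sqlserver", "mssql.example.internal", 1433)]):
#     print(line)
```

Running this from a scheduler every few minutes and comparing the FAIL timestamps with the alert emails would show whether the timeouts are visible outside Replicate as well.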

With all of this said, I'm looking for:

1. Ideas on how to troubleshoot why this Global Alert email is being sent?
2. What logging do I need to turn on to actually see the email being sent?
3. What information can the logs provide to help diagnose why the alert is being sent?

I understand that the source endpoints are having trouble establishing connections and the "problem" likely isn't inside of Qlik Replicate. But I need better information before I can take a corrective step forward.

3 Replies
OritA
Support

Hi, 

Regarding your questions:
1. Ideas on how to troubleshoot why this Global Alert email is being sent?

==> Email alerts are sent according to the notification rules configured on the Replicate server. First, we recommend reviewing those rules to confirm that they fit your environment's behaviour. Notification rules are defined under server --> notification.

2. What logging do I need to turn on to actually see the email being sent?

==> Whenever an email is sent, you will see it in repsrv.log; if it relates to a specific task, it can also appear in the task log.
3. What information can the logs provide to help diagnose why the alert is being sent?

==> In the log you will see the notification type and the time of the email, which can point you to the relevant rule and the cause of the notification. To get more details, you can also enable verbose logging on the UTILITIES component at the time the notification is being sent.
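
To narrow a log down to the period just before a notification, the lines can be filtered by timestamp. This is a rough sketch that assumes each log line carries a timestamp of the form `YYYY-MM-DDTHH:MM:SS`; that format is an assumption and should be checked against the actual repsrv.log before relying on it. The function name `lines_near` is illustrative only.

```python
import re
from datetime import datetime, timedelta

# Assumed timestamp shape; adjust the pattern if the real log differs.
TS = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def lines_near(lines, email_time, minutes=30):
    """Return lines stamped within `minutes` before `email_time` (inclusive)."""
    start = email_time - timedelta(minutes=minutes)
    hits = []
    for line in lines:
        match = TS.search(line)
        if not match:
            continue  # skip continuation lines that carry no timestamp
        stamp = datetime.strptime(match.group(0), "%Y-%m-%dT%H:%M:%S")
        if start <= stamp <= email_time:
            hits.append(line.rstrip("\n"))
    return hits
```

Passing an open log file as `lines` together with the arrival time of an alert email yields only the entries written shortly before that email, which keeps verbose output manageable.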

If you would like help troubleshooting this issue, please open a case in Salesforce and attach the task diagnostic package, the repsrv.log, and the time of the email notification so we can start troubleshooting its cause.

Thanks & regards,

Orit

slewis
Contributor III
Author

OritA,

Thank you for your suggestions. I attempted to use your advice to get more insights into my issue, but had no luck. I'll describe below what I did.

Firstly, I set both the Server and Task level UTILITIES logging to Verbose.
I then waited for the repsrv.log file to roll over.
 
For my test task, a scheduling system kicks off the full-load only task at the top of each hour.
 
Here's a high-level view of what happened over a two-hour period:
 
11:00 am    The task initiates
11:11 am    Task ends
11:49 am    2 [Global Alert Error] notifications arrive via email
 
No [Global Alert Error] found in the reptask_task_name.log file. 
No [Global Alert Error] found in the repsrv.log file
 
12:00 pm    The task initiates
12:11 pm    Task ends
12:20 pm    2 [Global Alert Error] notifications arrive via email
 
No [Global Alert Error] found in the reptask_task_name.log file.
No [Global Alert Error] found in the repsrv.log file
 
There's an obvious lag between when the [Global Alert Error] notifications are sent and when they actually land in my inbox. Fine, I can accept that.

However, I'm still not seeing any issues in the logs that I'm reviewing.
It seems to me that the error must be occurring in a different logging area.
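
If the error is being written somewhere other than the two logs already checked, searching every log file in the Replicate log directory for the strings quoted in the emails may reveal where. This is a sketch only: the directory to pass in is installation-specific and is left to the reader, and the `*.log` pattern is an assumption about how the files are named.

```python
from pathlib import Path

# Strings taken from the alert emails described in this thread.
NEEDLES = ("ORA-12170", "Connect timeout occurred", "Failed to connect")


def find_in_logs(log_dir, needles=NEEDLES):
    """Return (file_name, line_number, text) for every line containing a needle."""
    hits = []
    for path in sorted(Path(log_dir).rglob("*.log")):
        # errors="replace" avoids crashing on bytes that are not valid text
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if any(needle in line for needle in needles):
                    hits.append((path.name, number, line.strip()))
    return hits
```

A file name in the results that is neither repsrv.log nor the task log under review would indicate which component is actually raising the alert.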
 
Interestingly, in a different task that throws occasional errors, I can see the [Global Alert Error] notification in the Task log, but not for this error.
slewis
Contributor III
Author

New Details:

Over the weekend one of our source systems was down for maintenance for approximately 24 hours. During this time the Qlik Replicate jobs for that source system were shut down and were not restarted until after maintenance ended.

Even with the tasks shut down, the Replicate server continued to send the Alert emails.

Do you have an explanation why notifications would be sent for a task, by name, that isn't running?

Example:

Email 1: Log stream task
s_peoplesoft_to_t_peoplesoft_ls replication task encountered the following error: OCI error..
Email 2: Log stream task
s_peoplesoft_to_t_peoplesoft_ls replication task encountered the following error:
Creating Metadata Manager's utility components failed
Cannot create the source utility component
Failed while preparing stream component 's_peoplesoft'.
ORA-12170: TNS:Connect timeout occurred.

Reiterating, the task wasn't running when this notification was sent.