The Query Healer periodically examines the progress of running statements, creating a log entry for all statements exceeding a defined time period.
The following Administration Worker flags are required to configure the Query Healer:
The Is Healer On flag enables and disables the Query Healer.
The Max Statement Inactivity Seconds worker level flag defines the threshold for creating a log recording a slow statement. The log includes information about the log memory, CPU and GPU. The default setting is five hours.
The Healer Detection Frequency Seconds worker level flag defines how often the healer examines the progress of running statements. The default setting is one hour.
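Because both flags are expressed in seconds, it can help to convert the stated defaults and to see how the two settings interact. The following is a minimal Python sketch of the behavior described above, not SQream's implementation. The names `find_slow_statements` and `last_progress` are hypothetical and exist only for this illustration.

```python
from datetime import datetime, timedelta

# Defaults stated above, converted to the seconds the flags expect.
MAX_STATEMENT_INACTIVITY_SECONDS = int(timedelta(hours=5).total_seconds())
HEALER_DETECTION_FREQUENCY_SECONDS = int(timedelta(hours=1).total_seconds())


def find_slow_statements(last_progress, now,
                         threshold_seconds=MAX_STATEMENT_INACTIVITY_SECONDS):
    """Return IDs of statements whose last progress is older than the threshold.

    last_progress maps a statement ID to the time it last made progress
    (a hypothetical structure used only for this illustration).
    """
    return sorted(
        statement_id
        for statement_id, progressed_at in last_progress.items()
        if (now - progressed_at).total_seconds() > threshold_seconds
    )


now = datetime(2024, 1, 1, 12, 0, 0)
last_progress = {
    72: now - timedelta(hours=6),     # inactive beyond the five-hour threshold
    73: now - timedelta(minutes=30),  # still progressing
}
print(MAX_STATEMENT_INACTIVITY_SECONDS)            # 18000
print(HEALER_DETECTION_FREQUENCY_SECONDS)          # 3600
print(find_slow_statements(last_progress, now))    # [72]
```

Under this reading, with the default values a stalled statement might be logged as late as roughly six hours after it stops progressing: five hours for the threshold, plus up to one hour until the next check.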
The following is an example of a log record for a query stuck in the query detection phase for more than five hours:
|INFO|0x00007f9a497fe700:Healer|192.168.4.65|5001|-1|master|sqream|-1|sqream|0|"[ERROR]|cpp/SqrmRT/healer.cpp:140 |"Stuck query found. Statement ID: 72, Last chunk producer updated: 1.
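When many records need to be scanned, the worker address and statement ID can be pulled out of a record like the one above programmatically. The following Python sketch assumes that the record contains an IPv4 address immediately followed by the port, both delimited by `|`, and the `Stuck query found. Statement ID:` text, as in the example. The function name `parse_stuck_query` is hypothetical, and other log layouts may need a different pattern.

```python
import re

LOG_LINE = (
    '|INFO|0x00007f9a497fe700:Healer|192.168.4.65|5001|-1|master|sqream|-1|sqream|0|'
    '"[ERROR]|cpp/SqrmRT/healer.cpp:140 |"Stuck query found. '
    'Statement ID: 72, Last chunk producer updated: 1.'
)

# An IPv4 address followed by a numeric port, both delimited by "|".
WORKER_RE = re.compile(r"\|(\d{1,3}(?:\.\d{1,3}){3})\|(\d+)\|")
# The message text and statement ID as they appear in the example record.
STATEMENT_RE = re.compile(r"Stuck query found\. Statement ID: (\d+)")


def parse_stuck_query(line):
    """Return (ip, port, statement_id) for a stuck-query record, else None."""
    worker = WORKER_RE.search(line)
    statement = STATEMENT_RE.search(line)
    if worker is None or statement is None:
        return None
    return worker.group(1), int(worker.group(2)), int(statement.group(1))


print(parse_stuck_query(LOG_LINE))  # ('192.168.4.65', 5001, 72)
```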
Once you identify the stuck worker, you can execute the shutdown_server utility function from this specific worker, as described in the next section.
You can activate a graceful shutdown if your log entry says Stuck query found, as shown in the example above. You can do this by setting the shutdown_server utility function to
To activate a graceful shutdown:
Locate the IP address and port of the stuck worker in the logs.
The log in the previous section identifies the IP address (192.168.4.65) and port (5001) of the worker running the stuck query.
From the machine running the stuck query (IP: 192.168.4.65, port: 5001), connect to the SQream SQL client:
./sqream sql --port=$STUCK_WORKER_PORT --username=$SQREAM_USER --password=$SQREAM_PASSWORD --databasename=$SQREAM_DATABASE
For more information, see the following:
Activating the SHUTDOWN SERVER utility function. This page describes all of
Configuring the shutdown_server flag.