Logging

Locating the log files

The storage cluster contains a logs directory. Each worker produces a log file in its own directory, which can be identified by the worker’s hostname and port.

Note

Additional internal debug logs may reside in the main logs directory.

The worker logs contain information messages, warnings, and errors pertaining to SQream DB’s operation, including:

  • Server start-up and shutdown
  • Configuration changes
  • Exceptions and errors
  • User login events
  • Session events
  • Statement execution success / failure
  • Statement execution statistics

Log structure and contents

The log is a CSV, with several fields.

Log fields
Field Description
#SQ# Start delimiter. When used with the end of line delimiter can be used to parse multi-line statements correctly
Row Id Unique identifier for the row
Timestamp Timestamp for the message (ISO 8601 date format)
Information Level Information level of the message. See information level table below
Thread Id System thread identifier (internal use)
Worker hostname Hostname of the worker that generated the message
Worker port Port of the worker that generated the message
Connection Id Connection Id for the message. Defaults to -1 if no connection
Database name Database name that generated the message. Can be empty for no database
User Id User role that was connected during the message. Can be empty if no user caused the message
Statement Id Statement Id for the message. Defaults to -1 if no statement
Service name Service name for the connection. Can be empty.
Message type Id Message type Id. See message type table below)
Message Content for the message
#EOM# End of line delimiter
Information level
Level Description
SYSTEM System information like start up, shutdown, configuration change
FATAL Fatal errors that may cause outage
ERROR Errors encountered during statement execution
WARNING Warnings
INFO Information and statistics
Message type
Type Level Description Example message content
1 INFO Statement start information
  • "Query before parsing (statement handle opened)
  • "SELECT * FROM nba WHERE ""Team"" NOT LIKE ""Portland%%""" (statement preparing)
2 INFO Statement passed to another worker for execution
  • ""Reconstruct query before parsing"
  • "SELECT * FROM nba WHERE ""Team"" NOT LIKE ""Portland%%""" (statement preparing on node)
4 INFO Statement has entered execution "Statement execution"
10 INFO Statement execution completed "Success" / "Failed"
20 INFO Compilation error, with accompanying error message "Could not find function dateplart in catalog."
21 INFO Execution error, with accompanying error message Error text
30 INFO Size of data read from disk in megabytes 18
31 INFO Row count of result set 45
32 INFO Processed Rows 450134749978
100 INFO Session start - Client IP address "192.168.5.5"
101 INFO Login "Login Success" / "Login Failed"
110 INFO Session end "Session ended"
200 INFO SHOW_NODE_INFO periodic output  
500 ERROR Exception occured in a statement "Cannot return the inverse cosine of a number not in [-1,1] range"
1000 SYSTEM Worker startup message "Server Start Time - 2019-12-30 21:18:31, SQream ver{v2020.1}"
1002 SYSTEM Metadata Metadata server location
1003 SYSTEM Show all configuration values
"Flags configuration:
   compileFlags, extendedAssertions, false, true;
   compileFlags, useSortMergeJoin, false, false;
   compileFlags, distinctAggregatesOnHost, true, false;
   [...]"
1004 SYSTEM SQream DB metadata version "23"
1010 FATAL Fatal server error "Mismatch in storage version, upgrade is needed,Storage version: 22, Server version is: 23"
1090 INFO Configuration change Successful set config useSortMergeJoin to value: true
1100 SYSTEM Worker shutdown "Server shutdown"

Log naming

Log file name syntax

sqream_<date>_<sequence>.log

  • date is formatted %y%m%d, for example 20191231 for December 31st 2019.

    By default, each worker will create a new log file every time it is restarted.

  • sequence is the log’s sequence. When a log is rotated, the sequence number increases. This starts at 000.

For example, /home/rhendricks/sqream_storage/192.168.1.91_5000.

See the Change log rotation below for information about controlling this setting.

Logging control and maintenance

Change log verbosity

A few configuration settings alter the verbosity of the logs

Log verbosity configuration
Flag Description Default Values
logClientLevel Used to control which log level should appear in the logs 4 (INFO) 0 SYSTEM (lowest) - 4 INFO (highest). See information level table above.
nodeInfoLoggingSec Sets an interval for automatically logging long-running statements’ SHOW_NODE_INFO output. Output is written as a message type 200. 60 (every minute) Positive whole number >=1.

Change log rotation

A few configuration settings alter the log rotation policy

Log rotation configuration
Flag Description Default Values
useLogMaxFileSize Rotate log files once they reach a certain file size. When true, set the logMaxFileSizeMB accordingly. false false or true.
logMaxFileSizeMB Sets the size threshold in megabytes after which a new log file will be opened. 20 1 to 1024 (1MB to 1GB)
logFileRotateTimeFrequency Frequency of log rotation never daily, weekly, monthly, never

Collect logs from your cluster

Collecting logs from your cluster can be as simple as creating an archive from the logs subdirectory: tar -czvf logs.tgz *.log

However, SQream DB comes bundled with a data collection utility and an SQL utility intended for collecting logs and additional information that can help SQream support drill down into possible issues.

SQL Syntax

SELECT REPORT_COLLECTION(output_path, mode)
;

output_path ::=
   filepath

mode ::=
   log | db | db_and_log

Command line utility

If SQream DB can’t be accessed for any reason, a command line tool can also be used to collect the same information:

$ ./bin/report_collection <path to storage> <path for output> <mode>

Parameters

Parameter Description
output_path Path for the output archive. The output file will be named report_<date>_<time>.tar.
mode One of three modes: * 'log' - Collects all log files * 'db' - Collects the metadata database (includes DDL, but no data) * 'db_and_log' - Collect both log files and metadata database

Example

Write an archive to /home/rhendricks, containing log files:

SELECT REPORT_COLLECTION('/home/rhendricks', 'log')
;

Write an archive to /home/rhendricks, containing log files and metadata database:

SELECT REPORT_COLLECTION('/home/rhendricks', 'db_and_log')
;

Using the command line utility:

$ ./bin/report_collection /home/rhendricks/sqream_storage /home/rhendricks db_and_log

Troubleshooting with logs

Loading logs with foreign tables

Assuming logs are stored at /home/rhendricks/sqream_storage/logs/, a database administrator can access the logs using the Foreign Tables concept through SQream DB.

CREATE FOREIGN TABLE logs
(
  start_marker      VARCHAR(4),
  row_id            BIGINT,
  timestamp         DATETIME,
  message_level     TEXT,
  thread_id         TEXT,
  worker_hostname   TEXT,
  worker_port       INT,
  connection_id     INT,
  database_name     TEXT,
  user_name         TEXT,
  statement_id      INT,
  service_name      TEXT,
  message_type_id   INT,
  message           TEXT,
  end_message       VARCHAR(5)
)
WRAPPER cdv_fdw
OPTIONS
  (
     LOCATION = '/home/rhendricks/sqream_storage/logs/**/sqream*.log',
     DELIMITER = '|'
  )
;

Count message types

t=> SELECT message_type_id, COUNT(*) FROM logs GROUP BY 1;
message_type_id | count
----------------+----------
              0 |         9
              1 |      5578
              4 |      2319
             10 |      2788
             20 |       549
             30 |       411
             31 |      1720
             32 |      1720
            100 |      2592
            101 |      2598
            110 |      2571
            200 |        11
            500 |       136
           1000 |        19
           1003 |        19
           1004 |        19
           1010 |         5

Find fatal errors

t=> SELECT message FROM logs WHERE message_type_id=1010;
Internal Runtime Error,open cluster metadata database:IO error: lock /home/rhendricks/sqream_storage/leveldb/LOCK: Resource temporarily unavailable
Internal Runtime Error,open cluster metadata database:IO error: lock /home/rhendricks/sqream_storage/leveldb/LOCK: Resource temporarily unavailable
Mismatch in storage version, upgrade is needed,Storage version: 25, Server version is: 26
Mismatch in storage version, upgrade is needed,Storage version: 25, Server version is: 26
Internal Runtime Error,open cluster metadata database:IO error: lock /home/rhendricks/sqream_storage/LOCK: Resource temporarily unavailable

Count error events within a certain timeframe

t=> SELECT message_type_id,
.          COUNT(*)
.   FROM logs
.   WHERE message_type_id IN (1010,500)
.   AND timestamp BETWEEN '2019-12-20' AND '2020-01-01'
.   GROUP BY 1;
message_type_id | count
----------------+------
            500 |    18
           1010 |     3

Tracing errors to find offending statements

If we know an error occured, but don’t know which statement caused it, we can find it using the connection ID and statement ID.

t=> SELECT connection_id, statement_id, message
.     FROM logs
.     WHERE message_level = 'ERROR'
.     AND timestamp BETWEEN '2020-01-01' AND '2020-01-06';
connection_id | statement_id | message
--------------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
           79 |           67 | Column type mismatch, expected UByte, got INT64 on column Number, file name: /home/sqream/nba.parquet

Use the connection_id and statement_id to narrow down the results.

t=>   SELECT database_name, message FROM logs
.       WHERE connection_id=79 AND statement_id=67 AND message_type_id=1;
database_name | message
--------------+--------------------------
master        | Query before parsing
master        | SELECT * FROM nba_parquet