Configuring SQream
The Configuring SQream page describes SQream’s method for configuring your instance of SQream and includes the following topics:
Overview
Modifications that you make to your configurations are persistent based on whether they are made at the session or cluster level. Persistent configurations are modifications made to attributes that are retained after shutting down your system.
Modifying Your Configuration
The Modifying Your Configuration section describes the following:
Modifying Your Configuration Using the Worker Configuration File
You can modify your configuration using the worker configuration file (config.json). Changes that you make to worker configuration files are persistent. Note that you can only set the attributes in your worker configuration file before initializing your SQream worker, and while your worker is active these attributes are read-only.
The following is an example of a worker configuration file:
{
“cluster”: “/home/test_user/sqream_testing_temp/sqreamdb”,
“gpu”: 0,
“licensePath”: “home/test_user/SQream/tests/license.enc”,
“machineIP”: “127.0.0.1”,
“metadataServerIp”: “127.0.0.1”,
“metadataServerPort”: “3105,
“port”: 5000,
“useConfigIP”” true,
“legacyConfigFilePath”: “home/SQream_develop/SqrmRT/utils/json/legacy_congif.json”
}
You can access the legacy configuration file from the legacyConfigFilePath
parameter shown above. If all (or most) of your workers require the same flag settings, you can set the legacyConfigFilePath
attribute to the same legacy file.
Modifying Your Configuration Using a Legacy Configuration File
You can modify your configuration using a legacy configuration file.
The Legacy configuration file provides access to the read/write flags used in SQream’s previous configuration method. A link to this file is provided in the legacyConfigFilePath parameter in the worker configuration file.
The following is an example of the legacy configuration file:
{
“developerMode”: true,
“reextentUse”: false,
“useClientLog”: true,
“useMetadataServer”” false
}
Session vs Cluster Based Configuration
Cluster-Based Configuration
SQream uses cluster-based configuration, enabling you to centralize configurations for all workers on the cluster. Only flags set to the regular or cluster flag type have access to cluster-based configuration. Configurations made on the cluster level are persistent and stored at the metadata level. The parameter settings in this file are applied globally to all workers connected to it.
For more information, see the following:
Using SQream SQL - modifying flag attributes from the CLI.
SQream Acceleration Studio - modifying flag attributes from Studio.
For more information on flag-based access to cluster-based configuration, see Configuration Flag Types below.
Session-Based Configuration
Session-based configurations are not persistent and are deleted when your session ends. This method enables you to modify all required configurations while avoiding conflicts between flag attributes modified on different devices at different points in time.
The SET flag_name command is used to modify flag attributes. Any modifications you make with the SET flag_name command apply only to your open session, and are not saved when it ends
For example, when the query below has completed executing, the values configured will be restored to its previous setting:
set spoolMemoryGB=700;
select * from table a where date='2021-11-11'
For more information, see the following:
Using SQream SQL - modifying flag attributes from the CLI.
SQream Acceleration Studio - modifying flag attributes from Studio.
The flag type attribute can be set for each flag and determines its write access as follows:
Administration: session-based read/write flags that can be stored in the metadata file.
Cluster: global cluster-based read/write flags that can be stored in the metadata file.
Worker: single worker-based read-only flags that can be stored in the worker configuration file.
The flag type determines which files can be accessed and which commands or commands sets users can run.
The following table describes the file or command modification rights for each flag type:
Flag Type |
Legacy Configuration File |
ALTER SYSTEM SET |
Worker Configuration File |
---|---|---|---|
Can modify |
Can modify |
Cannot modify |
|
Cannot modify |
Can modify |
Cannot modify |
|
Cannot modify |
Cannot modify |
Can modify |
Regular Flag Types
The following is an example of the correct syntax for running a Regular flag type command:
SET spoolMemoryGB= 11;
executed
The following table describes the Regular flag types:
Command |
Description |
Example |
---|---|---|
|
Used for modifying flag attributes. |
|
|
Used to preset either a specific flag value or all flag values. |
|
|
Used as a wildcard character for flag names. |
|
|
Used to print all flags with the following attributes:
|
|
|
Used to print all information output by the show_conf UF command, in addition to description, usage, data type, default value and range. |
|
|
Used to show a specific flag/all flags stored in the metadata file. |
|
Cluster Flag Types
The following is an example of the correct syntax for running a Cluster flag type command:
ALTER SYSTEM RESET useMetadataServer;
executed
The following table describes the Cluster flag types:
Command |
Description |
Example |
---|---|---|
|
Used to storing or modifying flag attributes in the metadata file. |
|
|
Used to remove a flag or all flag attributes from the metadata file. |
|
|
Used to print the value of a specified value or all flag values. |
|
|
Used as a wildcard character for flag names. |
|
|
Used to print all flags with the following attributes:
|
|
|
Used to print all information output by the show_conf UF command, in addition to description, usage, data type, default value and range. |
|
|
Used to show a specific flag/all flags stored in the metadata file. |
|
Worker Flag Types
The following is an example of the correct syntax for running a Worker flag type command:
SHOW spoolMemoryGB;
The following table describes the Worker flag types:
Command |
Description |
Example |
---|---|---|
|
Used to storing or modifying flag attributes in the metadata file. |
|
|
Used to remove a flag or all flag attributes from the metadata file. |
|
|
Used to print the value of a specified value or all flag values. |
|
|
Used as a wildcard character for flag names. |
|
|
Used to print all flags with the following attributes:
|
|
|
Used to print all information output by the show_conf UF command, in addition to description, usage, data type, default value and range. |
|
|
Used to show a specific flag/all flags stored in the metadata file. |
|
All Configurations
The following table describes the Generic and Administration configuration flags:
Flag Name |
Access Control |
Modification Type |
Description |
Data Type |
Default Value |
---|---|---|---|---|---|
|
Administration |
Regular |
Sets the custom bin size in the cache to enable high granularity bin control. |
string |
|
|
Administration |
Regular |
Sets the pad device memory allocations with safety buffers to catch out-of-bounds writes. |
boolean |
|
|
Administration |
Regular |
Sets the runtime to pass only utility functions names to the compiler. |
boolean |
|
|
Administration |
Regular |
Sets the custom bin size in the cache to enable high granularity bin control. |
boolean |
|
|
Administration |
Regular |
Sets the hash table size of the CpuReduce. |
uint |
|
|
Administration |
Cluster |
Sets the maximum supported CSV row length. |
uint |
|
|
Administration |
Regular |
Sets the chunk size for copying from CPU to GPU. If set to 0, do not divide. |
uint |
|
|
Administration |
Regular |
Indicates if copying from/to GPU is synchronous. |
boolean |
|
|
Administration |
Worker |
Sets the percentage of total device memory to be used by the instance. |
uint |
|
|
Administration |
Regular |
Enables modifying R&D flags. |
boolean |
|
|
Administration |
Regular |
Activates the Nvidia profiler (nvprof) markers. |
boolean |
|
|
Administration |
Regular |
Enables creating and logging in the clientLogger_debug file. |
boolean |
|
|
Administration |
Regular |
Activates the Nvidia profiler (nvprof) markers. |
boolean |
|
|
Administration |
Regular |
Appends a string at the end of every log line. |
string |
|
|
Administration |
Regular |
Activates using varchar as an identifier. |
boolean |
|
|
Administration |
Cluster |
Sets the minimum size in mebibytes of extents for table bulk data. |
uint |
|
|
Administration |
Regular |
Monitors all pinned allocations and all memcopies to/from device, and prints a report of pinned allocations that were not memcopied to/from the device using the dump_pinned_misses utility function. |
boolean |
|
|
Administration |
Regular |
Increases the chunk size to reduce query speed. |
boolean |
|
|
Administration |
Regular |
Adds rechunker before expensive chunk producer. |
boolean |
|
|
Administration |
Regular |
Sets the buffer size. |
uint |
|
|
Administration |
Worker |
Manual setting of reported IP. |
string |
|
|
Administration |
Regular |
Sets the size of memory used during a query to trigger aborting the server. |
uint |
|
|
Administration |
Worker |
Sets the port used to connect to the metadata server. SQream recommends using port ranges above 1024† because ports below 1024 are usually reserved, although there are no strict limitations. Any positive number (1 - 65535) can be used. |
uint |
|
|
Administration |
Regular |
Splits large reads to multiple smaller ones and executes them concurrently. |
boolean |
|
|
Administration |
Regular |
Sets the number of workers to handle smaller concurrent reads. |
uint |
|
|
Administration |
Regular |
Sets the implicit cast in orc files, such as int to tinyint and vice versa. |
boolean |
|
|
Administration |
Regular |
Sets the timeout (seconds) for acquiring object locks before executing statements. |
uint |
|
|
Administration |
Worker |
Activates the machineIP (true). Setting to false ignores the machineIP and automatically assigns a local network IP. This cannot be activated in a cloud scenario (on-premises only). |
boolean |
|
|
Administration |
Regular |
Interprets decimal literals as Double instead of Numeric. Used to preserve legacy behavior in existing customers. |
boolean |
|
|
Administration |
Regular |
Interprets ASCII-only strings as VARCHAR instead of TEXT. Used to preserve legacy behavior in existing customers. |
boolean |
|
|
Generic |
Regular |
Reorders join to force equijoins and/or equijoins sorted by table size. |
boolean |
|
|
Generic |
Worker |
Prevents a query from processing more memory than the flag’s value. |
uint |
|
|
Generic |
Regular |
Sets how long the cache stores contents before being flushed. |
size_t |
|
|
Generic |
Regular |
Sets the ondisk directory location for the spool to save files on. |
size_t |
Any legal string |
|
Generic |
Regular |
Sets the amount of memory (GB) to be used by Spool on the disk. |
size_t |
|
|
Generic |
Regular |
Sets the number of partitions that the cache is split into. |
size_t |
|
|
Generic |
Regular |
Sets the persistent directory location for the spool to save files on. |
string |
Any legal string |
|
Generic |
Regular |
Sets the amount of data (GB) for the cache to store persistently. |
size_t |
|
|
Generic |
Regular |
Sets the amount of memory (GB) to be used by Spool InMemory. |
size_t |
|
|
Generic |
Regular |
Determines the client log level: 0 - L_SYSTEM, 1 - L_FATAL, 2 - L_ERROR, 3 - L_WARN, 4 - L_INFO, 5 - L_DEBUG, 6 - L_TRACE |
uint |
|
|
Generic |
Regular |
Sets the CPU to compress columns with size above (flag’s value) * (row count). |
uint |
|
|
Generic |
Regular |
Sets the name of the session tag. |
string |
Any legal string |
|
Generic |
Regular |
Sets the amount of memory (GB) to be used by the server for spooling. |
uint |
|
The configuration commands are associated with particular flag types based on permissions.
The following table describes the commands or command sets that can be run based on their flag type. Note that the flag names described in the following table are described in the Configuration Roles section below.
Flag Type |
Command |
Description |
Example |
---|---|---|---|
Regular |
|
Used for modifying flag attributes. |
|
Cluster |
|
Used to storing or modifying flag attributes in the metadata file. |
|
Cluster |
|
Used to remove a flag or all flag attributes from the metadata file. |
|
Regular, Cluster, Worker |
|
Used to print the value of a specified value or all flag values. |
|
Regular, Cluster, Worker |
|
Used as a wildcard character for flag names. |
|
Regular, Cluster, Worker |
|
Used to print all flags with the following attributes:
|
|
Regular, Cluster, Worker |
|
Used to print all information output by the show_conf UF command, in addition to description, usage, data type, default value and range. |
|
Regular, Cluster, Worker |
|
Used to show a specific flag/all flags stored in the metadata file. |
|
SQream divides flags into the following roles, each with their own set of permissions:
Administration flags: can be modified by administrators on a session and cluster basis using the
ALTER SYSTEM SET
command.Generic flags: can be modified by standard users on a session basis.
SQream uses the sqream_catalog.parameters catalog table for showing all flags, providing the scope (default, cluster and session), description, default value and actual value.
The following is the correct syntax for a catalog table query:
SELECT * FROM sqream_catalog.settings
The following is an example of a catalog table query:
externalTableBlobEstimate, 100, 100, default,
varcharEncoding, ascii, ascii, default, Changes the expected encoding for Varchar columns
useCrcForTextJoinKeys, true, true, default,
hiveStyleImplicitStringCasts, false, false, default,
This guide covers the configuration files and the SET
statement.