Overview

This guide is intended for SQream DB administrators.

The guide will go through the DBA main tasks, as well as describing some of the best practices in SQream DB.

This guide is a complementary guide to the SQL Reference.

For further support please contact [email protected] or your account manager.

1. Administration Guide

This document is mostly a conceptual overview and recommendations on best practices regarding SQream internal behavior. For SQL syntax and supported features, please refer to the SQL Reference Guide.

All keywords in are case insensitive.

2. Concepts

This section describes SQream’s database concepts

2.1. SQream DB Daemon

In SQream DB, the sqreamd or SQream daemon is the server that deals with most of the operations against a certain GPU. Certain installations may have several sqreamd running, each with their own GPU ID.

A sqreamd may run with a specific storage cluster, or be teamed up with others, sharing a shared cluster.

2.2. Storage Cluster

A SQream DB Storage Cluster is a collection of all stored objects:

A server instance can only run against a single storage cluster at one time. Clusters can be changed, but require a restart of the daemon.

A cluster will be created on installation, and shouldn’t require any intervention during normal operation

2.3. Databases

A storage cluster may have many databases residing in it.

When you create an applicative connection (from a client or JDBC/ODBC) – you connect to a single database.

A database can have many Schemas and Tables.

Create different databases for different use-cases

To view existing databases, query the Catalog (information schema).

2.4. Schemas

Schemas are part of the effective table name. The default schema is called public.

Use schemas to split up the data tables logically
schema
Figure 1. Example of schema usage

2.5. Tables

In SQream DB, a table is a set of data elements organized into horizontal rows and vertical columns. A table has a specified number of columns but can have any number of rows.

table
Figure 2. Logical table 'customers'

In a row-oriented database, would be stored as:
1,John,1997-01-01 15:30:30 ; 2,Jane,1998-05-06 00:00:00; …​

In a columnar database, would be stored as:
1,2,3,4,5 ;
John,Jane,Sam,David,Susan ;
1997-01-01 15:30:30,1998-05-06 00:00:00,1956-06-02 14:31:00,2006-12-25 14:40:00,1975-10-21 02:20:00

Instead of pulling the entire row (all columns) every time, only select the columns you need.
SQream DB automatically removes columns not necessary for the calculations.

Example
SELECT name, MAX(ts) FROM customers GROUP BY name;

2.6. External Tables

External tables are structured DDL that allow SQream to access data that which is stored outside the database in a none-SQream format, and to query it via SQL commands. Upon creation, the user should specify the external files format and location, and the needed table DDL. Once created, SQream will query the tables, as if they were regular tables.

2.7. Columns

In SQream DB, a column is the element that makes up a table.

Each column has a specific type and name, that do not change. By physically organizing data by columns, SQream DB is able to scan and aggregate data for individual columns faster, because less data needs to be read. This makes columns highly suitable for analytical workloads.

2.8. Chunks, Compression and Metadata

2.8.1. Chunks

SQream DB splits up columnar data into chunks. The chunk size will be the minimal bulk load into the GPU.

For better performance, SQream DB rearranges previously loaded data in new chunks based on the desired chunk size. While loading data into SQream DB, each bulk load (either INSERT INTO or COPY), will generate its own chunks (with sizes up to the chunk size).

The chunk size is a parameter at the cluster level that must be set before the first insert to the cluster. The default chunk size 1 million rows. The parameter can be set at the database level, before any tables are created. Ask your database administrator about setting the chunk size.

  • SQream DB, in its own time, will rechunk those chunks into the desired chunk size. During the rechunk operation SQream will rearrange and recompress the data using the most appropriate compression type (according to the data type and data distribution).

  • The chunk size has an influence on load/query time. Before tuning the parameter, consult your SQream account manager.

2.8.2. Compression specs

When DEFAULT compression spec (or no compression spec) is specified, each chunk may be compressed in a variety of different formats, based on the system’s understanding. You may override the compression spec, but this is not recommended.

See Compression types in the SQL Manual for more information.

2.8.3. Metadata

SQream DB gathers and saves metadata information regarding the columns data at the chunk level during COPY. This information will serve the SQream optimizer while doing Data Skipping and other optimizations. This metadata is gathered automatically and transparently. It requires no user intervention.

2.9. Catalog (information schema)

The SQream DB catalog or information schema consists of views that contain information about all database objects. This provides access to database metadata, column types, tables and their row-counts, etc.

The catalog structures are specific to SQream DB.

2.9.1. Querying the SQream catalog

The catalog is available from any database, under the schema called sqream_catalog. You may query the schema as you would any other table in the system.

You can not perform any other operations on the catalog, like INSERT, DELETE, …​
Example
demo_db=> SELECT * from sqream_catalog.tables;
Table 1. Example result for a demo database
database_name table_id schema_name table_name row_count_valid row_count rechunker_ignore
demo_db
0
public
nation
1
25
0
demo_db
1
public
region
1
5
0
demo_db
2
public
part
1
20000000
0
demo_db
3
public
supplier
1
1000000
0
demo_db
4
public
partsupp
1
80000000
0
demo_db
5
public
customer
1
15000000
0
demo_db
6
public
orders
1
300000000
0
demo_db
7
public
lineitem
1
600037902
0
Example for identifying delete predicates on tables
demo_db=> select t.table_name,d.* from sqream_catalog.delete_predicates d
.> inner join sqream_catalog.tables t on
.> d.table_id=t.table_id;

2.9.2. Available catalog views

Database object catalog
Table 2. SQream catalog views
View name Description
 sqream_catalog.databases

All database objects in the current storage cluster

 sqream_catalog.schemas

Schema objects in the database

 sqream_catalog.tables

Table objects in the database

 sqream_catalog.external_tables

External table objects in the database

 sqream_catalog.views

View objects in the database

 sqream_catalog.columns

Column objects in the current database

 sqream_catalog.catalog_tables

All catalog views available

Fine-grain storage catalog
View name Description
 sqream_catalog.extents

Extent storage objects in the database

 sqream_catalog.chunks

Chunk storage objects in the database

 sqream_catalog.delete_predicates

Logical delete predicates added to the compiler with a DELETE command

Role and permission catalog
sqream_catalog.roles Roles (users) in the current databases
 sqream_catalog.role_memberships

Roles membership

 sqream_catalog.table_permissions

Tables and their assigned roles

 sqream_catalog.database_permissions

Databases and their assigned roels

 sqream_catalog.schema_permissions

Schemas and their assigned roles

 sqream_catalog.permission_types

Permission types

Database object catalog
Databases view
Table 3. Databases view
Column name Type Description
 database_id
 varchar

Database ID

 database_name
 varchar

Database name

 default_disk_chunk_size
 bigint

Storage chunk size (in number of rows)

 default_process_chunk_size
 bigint

Process chunk size (in number of rows)

 rechunk_size
 bigint

Internal use

 storage_subchunk_size
 bigint

Internal use

 compression_chunk_size_threshold
 bigint

Internal use

Schemas view
Table 4. Schemas view
Column name Type Description
 schema_id
 varchar

Schema ID

 schema_name
 varchar

Schema name

 schema_owner
 varchar

Role who owns this schema

 rechunker_ignore
 bigint

Internal use

Tables view
Table 5. Tables view
Column name Type Description
 database_name
 varchar

Owning database name

 table_id
 varchar

Table ID

 schema_name
 varchar

Owning schema name

 table_name
 varchar

Table name

 row_count_valid
 bool

See warning below

 row_count
 bigint

Number of rows in the table

 rechunker_ignore
 int

Internal use

When row_count_valid is 0 (after a DELETE operation), the row count may be inaccurate. To get the accurate row-count, run
SELECT COUNT(column) FROM table;
External tables view
Table 6. External tables view
Column name Type Description
 database_name
 varchar

Owning database name

 table_id
 varchar

External table ID

 schema_name
 varchar

Owning schema name

 table_name
 varchar

External table name

 format
 int

0=CSV, 1=Parquet

 created
 varchar

Creation data as a string

Views view
Table 7. Views view
Column name Type Description
 view_id
 varchar

View ID

 view_schema
 varchar

Owning schema name

 view_name
 varchar

The view name

 view_data
 varchar

Internal use

 view_query_text
 varchar

Full statement test that created this view

Columns view
Table 8. Columns view
Column name Type Description
 database_name
 varchar

Owning database name

 schema_name
 varchar

Owning schema name

 table_id
 varchar

Owning table ID

 table_name
 varchar

Owning table name

 column_id
 int

Column ID

 column_name
 varchar

The column name

 type_name
 varchar

Column type

 column_size
 bigint

Column data size in bytes

 has_default
 int

Indicates wether or not the column has a default

 default_value
 varchar

Indicates the default column value

 compression_strategy
 varchar

User-overriden compression strategy

 created
 varchar

Creation date

 altered
 varchar

Last alter date

Fine-grain storage catalog
Extent view
Table 9. Extent view
Column name Type Description
 database name
 varchar

Owning database name

 table_id
 varchar

Owning table ID

 column_id
 bigint

Owning column ID

 extent_id
 bigint

The extent ID

 size
 bigint

Size of the extent in MB

 path
 varchar

Full path to the extent file on disk

Chunks view
Table 10. Chunks view
Column name Type Description
 database name
 varchar

Owning database name

 table_id
 varchar

Owning table ID

 chunk_id
 bigint

The chunk ID

 rows_num
 bigint

The amount of rows in this specific chunk

 deletion_status
 bigint

This chunk’s deletion mark. 0 means keep, 1 means chunk needs partial deletion, 2 means delete entire chunk.

Delete predicates view
Table 11. Delete predicates view
Column name Type Description
 database name
 varchar

Owning database name

 table_id
 varchar

Owning table ID

 max_chunk_id
 bigint

The highest chunk_id seen during the DELETE time

 delete_predicate
 varchar

The predicates added by the compiler (one predicate-statement per row in this view)

Role and permission catalog
Role view
Table 12. Role view
Column name Type Description
 role_id
 bigint

The role ID

 role_name
 varchar

Role name

 superuser
 bool

1 for superuse, 0 otherwise

 login
 bool

1 if the role has login permission, 0 otherwise.

 role_login
 bool

Does this role have the login permission?

Role membership view
Table 13. Role membership view
Column name Type Description
 role_id
 int

The role ID

 assigned_role_id
 int

Internal role ID

 inherit
 bool

1 for inherit permission, 0 otherwise.

Table permission view
Table 14. Table permission view
Column name Type Description
 database name
 varchar

Owning database name

 table_id
 bigint

Owning table ID

 role_id
 bigint

The role ID

 permission_type
 int

The permission type

Database permission view
Table 15. Database permission view
Column name Type Description
 database name
 varchar

Owning database name

 role_id
 bigint

The role ID

 permission_type
 int

The permission type

Schema permission view
Table 16. Schema permission view
Column name Type Description
 database name
 varchar

Owning database name

 schema_id
 bigint

Owning schema ID

 role_id
 bigint

The role ID

 permission_type
 int

The permission type

Permission type view
Table 17. Permission type view
Column name Type Description
 permission_type_id
 bigint

The permission type ID

 name
 varchar

Permission name

2.10. Locks

SQream DB operates in two modes: exclusive, which sends a single operation at a time, and inclusive which is a multi operations mode. DDL operations are always exclusive.

DML are separated to DELETE/TRUNCATE as exclusive; and INSERT as inclusive. This allows multiple inserts into the table, but prevents multiple DELETE operations.

Querying (SELECT operations) can coexists with both DDL and DML.

2.10.1. Locking

Table 18. Locks by SQream

Operation

Select

DML (Insert)

DML (Delete/Truncate)

DDL

Select

No lock

No lock

No lock

No lock

DML (insert)

No lock

No lock

No lock

Lock

DML (delete/truncate)

No lock

No lock

Lock

Lock

DDL

No lock

Lock

Lock

Lock

By default, when a session is requesting a lock on an object and the object is busy, SQream will wait 3 seconds before it return an error message. This wait time is defined in the configuration JSON. See the statementLockTimeout parameter in SQream Administrator Guide for more information.

DDL on an object will prevent other DDL/DML to wait on a lock on the same object.

For specific DDL operations, SQream uses global permissions that requires very short exclusive locks at the cluster level. Global permission will be used on operation such as CREATE DATABASE/TABLE/ROLE, ALTER ROLE/TABLE, DROP ROLE/TABLE/DATABASE, GRANT/REVOKE.

2.10.2. Viewing locks

To view all existing locks in the SQream database use the utility function show_locks() :

Example
SELECT show_locks();

2.10.3. Releasing locks

To release a specific lock in an active SQream instance, use the stop_statement() utility function with the relevant statement_id. Use the statement_id returned by the show_locks() utility function.

Example
SELECT stop_statement(12009);

To release all locks in a suspect/inactive SQream instance, use the utility function release_defunct_locks() which will remove the instance from the cluster and release all its resources. To see all SQream instances status, use show_cluster_nodes().

The utility function release_defunct_locks() works only for a system running the metadata server. In a single instance (no metadata server) it will not work.
Example
SELECT show_cluster_nodes();
SELECT release_defunct_locks();

2.11. Workload Services

SQream will distribute work throughout the hardware resources to maximize the hardware utilization. By default, this distribution will be done in an equal manner.
The DBA can change this setting and optimize the utilization to their needs by using SQream workload services and defining each SQream instance to specific SERVICE/s.

2.11.1. Managing Services

Monitor services subscription

select show_subscribed_instances() ;

Add services to an existing instance

select subscribe_service('instance_id', 'service_name'); ;

Example
select subscribe_service('node_11', 'etl_service');
Remove services to an existing instance

select unsubscribe_service('instance_id', 'service_name'); ;

Example
select unsubscribe_service('node_11', 'etl_service');

3. Getting Started

3.1. Understanding the SQream DB environment

The SQream environment is usually made up of two folders - the installation directory and the storage cluster.

The installation directory

This folder contains the SQream DB binary applications.

gen sqream directory

The main applications found in the binary directory are:

Executable Name Description
 sqreamd

The sqreamd server daemon

 ClientCmd

Command line client

 upgrade_storage

Metadata storage upgrader to be used between versions

 SqreamStorage

Storage utility to create new clusters and restore access

 metadata_server

Metadata server for clustered installations

 server_picker

Load balancer for clustered installations

The storage cluster root

This directory contains the entire database storage.

gen cluster root
Directory Description
 databases

The files containing all databases, tables, columns, chunks, etc.

 metadata

Internal metadata structures for accessing data from disk

 temp

Temporary spools

 logs

Logs generated by the servers for access and statements

3.2. Setting up SQream DB

If you would like to deploy SQream DB on AWS or Azure please contact SQream directly at info@sqream.com.

Make sure that you are running on a 64-bit Linux operating system, and have an Nvidia GPU installed.
See the Hardware Requirements page to verify your system meets the minimum requirements for running SQream DB.

gen connecting to sqream

3.2.1. BIOS and RAID (Checklist)

Out-of-band management interface
  • Ensure connection to management console, Linux host and verify credentials

RAID
  • Ensure all drives are connected and appear in RAID interface

  • Configure RAID drives as required

BIOS settings
Ignore settings where not applicable - some are Dell or HP specific
  • (For Dell PowerEdge servers) Enable Memory Map I/O Over 4GB

  • Set power profile to maximum performance

  • Set power regulator to high performance mode

  • Enable Intel Turbo Boost and Hyperthreading

  • Disable Intel Virtualization Technology

  • Disable Intel VT-d

  • Disable processor C-States (Minimum processor idle power core state)

  • Set Energy/Performance bias to maximum performance

  • Disable dynamic power capping

  • Set DIMM voltage to Optimized for Performance

  • Set memory power savings mode to Maximum performance

  • Enable ACPI SLIT

  • Set QPI Snoop configuration to Home-Snoop or Early-Snoop

3.2.2. OS

We recommend CentOS 7.3, Amazon Linux 2017.03 or Ubuntu 16.04.

  • For CentOS 7.3, install DEVEL with basic video driver

  • Edit your fstab to mount all RAID LUNs as required

  • Configure networking as desired

  • Add sqream user and set password:

sudo useradd -m -U sqream
sudo passwd sqream
  • Edit sudoers with ‘sudo visudo’ and add the following last line:

sqream ALL=(ALL)    ALL
  • Set UTF8 as system language

export LANG=en_US.UTF-8
  • Disable X/GUI

systemctl set-default multi-user.target
  • Add some vm settings to sysctl to prevent kernel panic

echo -e " fs.file-max=200000\n vm.dirty_background_ratio = 5 \n vm.dirty_ratio = 10 \n vm.swappiness = 10 \n vm.zone_reclaim_mode = 7 \n vm.vfs_cache_pressure = 200 \n"  >> /etc/sysctl.conf
  • Open network ports:
    For each installation of SQream DB, make sure ports 3105, 3108 and 5000 are open.
    Port 3105 is used by metadata_server, port 3108 is used by server_picker, and a single port is
    used for every sqreamd instance (5000 is the default port).
    You may also choose to disable firewalld completely.

  • Disable SELINUX:
    Configure SELINUX=disabled in the /etc/selinux/config file.

3.2.3. Obtain the tarball from SQream

  • Obtain the latest compressed SQream DB tarball from SQream.

3.2.4. Install prerequisite libraries and kernel headers

On CentOS/RHEL
sudo yum install -y zlib-devel openssl openssl-devel nano bzip2 bzip2-devel \
python python-devel libffi libffi-devel apr apr-util apr-devel apr-util-devel \
readline readline-devel gmp-devel libmp3-devel wget gcc kernel-headers kernel-devel

Reboot your machine to enable the new kernel.

On Ubuntu
sudo apt-get install zlib1g-dev libssl libssl-dev libbz2 libbz2-dev python \
python-dev libffi libffi-dev libapr1 libapr1-dev libreadline6 libreadline6-dev \
libgmp libgmp-dev wget gcc linux-headers-`uname -r`

Reboot your machine to enable the new kernel.

3.2.5. Install the latest NVIDIA driver for your GPU

  1. Stop any running UI by stopping X
    (sudo init 3 or sudo systemctl set-default multi-user.target)

  2. Get the latest NVIDIA driver
    wget https://developer.nvidia.com/compute/cuda/8.0/prod/local_installers/cuda_8.0.44_linux-run

  3. Run the installer
    sudo sh ./cuda_8.0.44_linux-run

3.2.6. Install the SQream DB daemon on the machine

  1. Change to the newly created sqream user:
    su sqream

  2. Unpack the tarball. This expands into the sqream directory.

  3. Unpacking tar -xf sqream-<version>.tar.gz

3.2.7. Create the cluster

  1. Enter the sqream directory and create a storage cluster

  2. ./bin/SqreamStorage -C -r <full path to new cluster>

Example: Create storage cluster

If your main storage is on /mnt/storage, you can run
./bin/SqreamStorage -C -r /mnt/storage/sqream_storage


3.2.8. Configure the instances

Each SQream DB daemon must run against a configuration file.

Below is a sample minimal configuration file. It is recommended that it be placed in /etc/sqream/sqream_config.json. This file will start the server on port 5000, against GPU #0.

Example minimal configuration JSON
{
    "compileFlags": {
    },
    "runtimeFlags": {
    },
    "runtimeGlobalFlags": {
    },
    "server": {
        "licensePath" : "/etc/sqream/license.enc",
        "port": 5000,
        "cluster": "/mnt/storage/sqream_storage",
        "gpu": 0,
        "ssl_port": 5100
    }
}
  • JSON files can not contain any comments

  • When altering the JSON file, pay close attention to the field-separating commas at the end of the lines.

Permanent instance options may be placed in this file, based on consultation with SQream support or through the configuration utility.

3.2.9. Configuring instances parameters

Table 19. Common parameters setting
Flag Name Description Default Value Range of Values Remark
spoolmemorygb

Select what size spool SQream DB can use for writing intermediate results to RAM, in GB

 128gb
 1-machine ram size

Should consider total ram size and number of SQream instances

insertParsers

Set the number of parsing threads to be launched for each file, during the bulk load process

 4
 1-32

Should consider total ram size and number of SQream instances

insertCompressors

Set the number of compression threads to be launched for each file, during the bulk load process

 4
 1-32

Should consider total ram size and number of SQream instances

statementLockTimeout

Set the number of seconds SQream will wait for a lock before returning an error

 3
 1-no limit
showFullExceptionInfo

Show complete error message

 False
 TRUE/FALSE

Enabling this setting will often show more detailed error message.

isUnavailableNode

Exclude node from load balancer

 False
 TRUE/FALSE

3.2.10. Starting and Stopping the SQream DB daemon

Service

The recommended way to run the daemon is via the service, which runs it through monit:

Service permissions

Altering the service state requires superuser access (sudo), but the SQream DB daemon will start as an unprivlieged user.

Starting the service
Starting the service
sudo service sqream start
Stopping the service
Stopping the service
sudo service sqream stop
Service permissions

Altering the service state requires superuser access (sudo), but the SQream DB daemon will start as an unprivlieged user.


Identifying which SQream daemons are running

There are several methods for identifying running SQream daemons. In these examples, two daemons with metadata server and server-picker running.

Using the service script
~ $ sudo service sqream status

Metadata Server is running, with pid 41203                 [  OK  ]
Server picker is running at pid 41695                      [  OK  ]
SQream DB is running, with pid 41309.                      [  OK  ]
Running on GPU 0, listening on port 5000
SQream DB is running, with pid 41444.                      [  OK  ]
Running on GPU 1, listening on port 5001
Using ps
~ $ ps aux | grep sqreamd
sqream    9841  0.0  0.0 112656   972 pts/0    S+   12:42   0:00 grep --color=auto sqreamd
sqream   41309  1.5  0.7 279354084 1037804 ?   Sl   Jun01 110:20 /home/sqream/sqream/bin/sqreamd -config /etc/sqream/sqream1_config.json
sqream   41444  0.0  0.4 248082604 547636 ?    Sl   Jun01   4:07 /home/sqream/sqream/bin/sqreamd -config /etc/sqream/sqream2_config.json
Using nvidia-smi
~ $ nvidia-smi
Tue Jun  6 12:43:17 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.48                 Driver Version: 367.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K40m          On   | 0000:04:00.0     Off |                   0* |
| N/A   48C    P0    67W / 235W |  11309MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K40m          On   | 0000:42:00.0     Off |                   0* |
| N/A   54C    P0    68W / 235W |  11309MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     41309    C   /home/sqream/sqream/bin/sqreamd              11305MiB |
|    1     41444    C   /home/sqream/sqream/bin/sqreamd              11305MiB |
+-----------------------------------------------------------------------------+
Restarting a specific server daemon

Once you have identified a server that needs restarting, you may control it directly while keeping the rest of the cluster up.

Restarting SQream daemon #2
sudo service sqream2 restart

Temporarily running the daemon

You may start the daemon temporarily from any unprivileged user.

Running as superuser

Running the server daemon as superuser is discouraged, as it may expose the entire filesystem to SQL users. SQream DB is designed to run as an unprivileged user.

Starting the daemon temporarily

To start the daemon temporarily, run it with nohup:

Starting the daemon temporarily
nohup ./bin/sqreamd -config /etc/sqream/sqream_config.json &
Stopping the daemon temporarily

It is safe to shut down the server by killing it from the operating system; the system is designed to handle this. Any data commited will remain consistent, while uncommited data will be removed upon the daemon’s next start-up.

Stopping a temporarily running server from the command line
kill -9 `pid of sqreamd`
Stopping a temporarily running server from the SQL interface
SELECT shutdown_server();
Overriding parameters

You may override the port, GPU and cluster parameters while retaining a configuration file settings by running

./bin/sqreamd <path to different cluster> <GPU #> <Port #> -config <path to configuration json>

3.2.11. Connect to the SQream DB server with ClientCmd

To connect to the database server, you may use the ClientCmd command line interface by running

ClientCmd command line arguments
./bin/ClientCmd --user=<username> --password=<password> --database=<database name> --host=<host> --port=<port> --service=<service_name>

Connect to the master database:

Connecting to master on a local host running SQream DB on port 5000 via service 'etl_service'
./bin/ClientCmd --user=sqream --password=sqream --database=master --host=127.0.0.1 --port=5000 --service=etl_service

3.2.12. Using SSL Server Authentication with SQream

From version 2.1, SQream DB supports secure sockets layer (SSL) encryption and authentication for connections to its cluster via JDBC and ODBC drivers.

To use this option, the server must first be set-up to accept SSL connections.

Enabling SSL server authentication within SQream Instance:

Configure the SERVER flag ssl_port in the SQream instance configuration file to the needed port number.

For example:

{
"server":{
    "port": 5001,
    "ssl_port": 5100,
    "cluster": "/path/sqream_cluster",
    "gpu": 0,
    "licensePath": "/path/license.enc"
}
}
Restart the SQream DB daemons after making this configuration change
Configure SSL Authentication for JDBC/ODBC drivers
Adding SSL to JDBC
  1. Add ssl=true in the connection string

  2. Change the port to the SSL port

When connecting via load balancer, the default endpoint for SSH is port 3109
Example for direct connection
jdbc:Sqream://hostname:5100/master;user=sqream;password=mypassword;ssl=true;
Example for connection to load balancer
jdbc:Sqream://hostname:3109/master;user=sqream;password=mypassword;service=sqream;cluster=true;ssl=true;
Adding SSL to ODBC

In Windows, make sure the SSL checkbox is selected in the DSN settings.

In Linux, add Ssl=true to the connection string

Linux ODBC connection string sample
Driver={libODBCDrv.so}:Server=hostname:Port=5100:Database=master:User=sqream:Password=mypassword:Ssl=true:Service=sqream

3.3. Highly available installations

Contact your SQream representative for further information about installing our highly available solutions.

4. Operations

4.1. Upgrading a version

Here are the necessary steps that ensure a smooth upgrade of your SQream DB version

  1. Stop SQream instances on all servers

  2. On each node that SQream is installed, unpack the new tarball
    alongside the old SQream DB directory.
    For example:

$ cd /home/sqream
$ mv sqream sqream-old
$ tar xf sqream-<version>.tar.gz
  1. Repeat the above step for each node that the SQream DB executables exist

  2. It may be necessary to run the metadata upgrade utility.
    (This may take a few moments)

$ cd sqream/bin
$ ./upgrade_storage <path to sqream storage cluster>

4.2. Key administration concepts

See Concepts above

4.3. Monitoring the system

Because SQream DB can be run in a distributed setting, all nodes should be monitored to ensure smooth operation. It is possible to monitor SQream DB with third party tools like Zabbix, Nagios and others, but also through the OS and SQream DB directly.

4.3.2. From each node

See connections to the server

You can monitor existing connections to the database by using the show_connections() utility function:

SELECT show_connections();
Table 20. Sample result from show_connections()
ip conn_id conn_start_time stmt_id stmt_start_time stmt

192.168.0.93

19

2017-06-22 18:56:54

14

2017-06-22 18:56:54

select show_connections()

192.168.0.93

17

2017-06-22 18:56:48

-1

2017-06-22 18:56:48

Show server / cluster status

The show_server_status() utility function can be used to see which statements are running across the cluster, across all databases.

If no queries are running, this query will return 0 rows in the result set.
SELECT show_server_status();
Table 21. Sample result from show_server_status()
connection_id server ip server port database_name user_name client ip statement id statement statement start time statement status statement status start

32

192.168.0.93

5000

faa

sqream

192.168.0.1

25

SELECT Year

Carrier

destCityName

COUNT( DISTINCT originCityName) from ontime JOIN l_airport_i

Possible statement status values

Status Description

Executing

The statement is in execution, awaiting results

Preparing

The statement is compiling, and is awaiting execution

Waiting

The statement is waiting on resources, like a GPU to free up

The DBA can use the show server status output as a baseline for identifying locks and if needed to stop running statements (based on the server ip : server port and statement_id columns).

By running a query

Running a query, even the most basic one, should give you an indication if a server is up. If you immediately get "Connection refused" or similar, the server is down.

SELECT 1;

4.4. Stopping existing statements

The stop_statement() utility function can be used to cancel or stop a running statement before it finishes.

Usage
  • Identify the running statement ID and server IP and port (see show server status or show_connections above)

  • From the same server/port combination - run the stop_statement command:

SELECT stop_statement(42);

4.5. Logs

SQream DB generates the following log files for the DBA everyday work

Log files:
  • clientLogger_statement.log - The log file keeps track of statements being run in the server, including success/failure, number of rows returned from the query and number of rows processed by the query.

  • clientLogger_server.log - Keeps track of server start/stop times

  • clientLogger_session.log - Keeps track of different sessions

  • clientLogger_login.log - Tracks user login times and IP addresses

  • clientLogger_exception.log - Tracks system exceptions

  • clientLogger_execution.log - Tracks heavy statements execution statistics (over nodeInfoLoggingSec flag)

From V2.3 each instance in the SQream cluster can generate its own set of log file. Related runtimeGlobalflags:

  • useClientLog : Set this flag to true to activate the new logging framework.

  • useLogMaxFileSize : Enable/disable rolling files.

  • logMaxFileSizeMB: Rollover size parameter. Set in MB. Default is 20MB.

  • useLogFileRotateHourOfDay: Enable/disable new log files at a specified hour of the day (e.g., create a new set of files every day, after midnight).

  • logFileRotateHourOfDay: Rollover time parameter. Set in the hour of the day. Range of values: 0-23. Default is 0 (midnight).

Copyright © 2010-2018. All rights reserved.

This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchant- ability or fitness for a particular purpose.

We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document.

This document may not be reproduced in any form, for any purpose, without our prior written permission.