SQLoader As a Service

The SQLoader is a Java service that enables you to ingest data into SQreamDB from other DBMS and DBaaS through HTTP requests using network insert.

SQLoader supports ingesting data from the following DBMSs:

  • Greenplum

  • Microsoft SQL Server

  • Oracle (including Oracle Autonomous Database)

  • Postgresql

  • SAP HANA

  • Sybase

  • Teradata

  • SQreamDB 4.5.15 or later (Full Load only)

Before You Begin

It is essential that you have the following:

Minimum Hardware Requirements

Component

Type

CPU cores

16

RAM

32GB

Sizing Guidelines

The SQLoader sizing is determined by the number of concurrent tables and threads based on the available CPU cores, limiting it to the number of cores minus one, with the remaining core reserved for the operating system. Each SQLoader request runs on a single table, meaning concurrent imports of multiple tables require multiple requests. Additionally, it is important to note that for partitioned tables, each partition consumes a thread. Therefore, for performance efficiency, considering the table’s partition count when managing thread allocation is a must.

Compute formula: \(⌊ 0.8 * (TotalMemory - 4) ⌋\)

Installation and Connectivity

Getting All Configuration and JAR Files

  1. Download the SQLoader zip file:

    https://storage.cloud.google.com/cicd-storage/sqloader_release/sqloader-release-v1.1.zip
    
  2. Extract the .tar file using the following command:

    tar -xf sqloader_srv_v8.2.tar.gz
    

    A folder named sqloader with the following files is created:

    ├── sqloader-v1.sh
    ├── bin
    │   ├── sqloader-admin-server-1.1.jar
    │   └── sqloader-service-8.2.jar
    ├── config
            ├── reserved_words.txt
            ├── sqload-jdbc.properties
            └── sqream-mapping.json
    

File Name

Description

sqream-mapping.json

Maps foreign DBMS and DBaaS data types into SQreamDB data types during ingestion

sqload-jdbc.properties

Used for defining a connection string and may also be used to reconfigure data loading

reserved_words.txt

A list of reserved words which cannot be used as table and/or column names.

sqloader-service-8.2.jar

The SQLoader service JAR file

sqloader-admin-server-1.0.jar

The SQLoader admin server JAR file

sqloader-v1.sh

SQLoader service installer bash file

Installation

Deployment Parameters

When using the sqloader-v1.sh file (installer), the following flags are already configured.

All deployment flags are not dynamically adjustable at runtime.

Parameter

State

Default

Example

Description

configDir

Optional

config

java -jar sqloaderService-8.2.jar --configDir=</path/to/directory/>

Defines the path to the folder containing both the data type mapping and the reserved words files. The defined folder must contain both files or else you will receive an error. This flag affects the mapping and reserved words files and does not affect the properties file

hzClusterName=<TEXT>

Optional

java -jar sqloader-service-8.2.jar --hzClusterName=<TEXT>

In Hazelcast, a cluster refers to a group of connected Hazelcast instances across different JVMs or machines. By default, these instances connect to the same cluster on the network level, meaning that all SQLoader services that start on a network will connect to each other and share the same queue. An admin can connect to only one Hazelcast cluster at a time. If you start multiple clusters and want to connect them to the admin service, you will need to start multiple admin services, with each service connecting to one of your clusters. It is essential that this flag has the same name used here and across all SQLoader instances.

LOG_DIR

Optional

logs

java -jar -DLOG_DIR=/path/to/log/directory sqloader-service-8.2.jar

Defines the path of log directory created when loading data. If no value is specified, a logs folder is created under the same location as the sqloader.jar file

spring.boot.admin.client.url

Optional

http://localhost:7070

java -jar sqloader-service-8.2.jar --spring.boot.admin.client.url=http://IP:PORT

SQLoader admin server connection flag

Xmx

Optional

java -jar -Xmx<number>g sqloader-service-8.2.jar

We recommend using the Xmx flag to set the maximum heap memory allocation for the service. If a single service is running on the machine, we suggest allocating 80% of the total memory minus approximately 4GB, which the service typically needs on average. If multiple services are running on the same machine, calculate the recommended heap size for one service and then divide it by the number of services. Compute formula: \(⌊ 0.8 * (TotalMemory - 4) ⌋\)

DEFAULT_PROPERTIES

Mandatory

sqload-jdbc.properties

java -jar -DDEFAULT_PROPERTIES=/path/to/file/sqload-jdbc.properties sqloader-service-8.2.jar

When the service initializes, it looks for the variable DEFAULT_PROPERTIES, which corresponds to the default sqload-jdbc.properties file. Once the service is running with a specified properties file, this setting will remain unchanged as long as the service is operational. To modify it, you must shut down the service, edit the properties file, and then restart the service. Alternatively, you can modify it via a POST request, but this change will only affect the specific load request and not the default setting for all requests.

Installing the Admin Server and SQLoader Service

  1. To install the admin server, run the following command (install it only once on one machine):

sudo ./sqloader-v1.sh -admin

Output:

##################################################################################
Welcome to SQloader Admin-Service installation
##################################################################################
Please Enter JAVA_HOME PATH
/opt/java
##################################################################################
The default PATH to install SQloader Admin Service is /usr/local/sqloader-admin
Do you want to change the default PATH ? (y/N)
##################################################################################
The default PATH to SQloader-Admin logs directory is /var/log/sqloader-admin/logs
Do you want to change the default? (y/N)
##################################################################################
Please enter HZCLUSTERNAME
sqcluster
##################################################################################
SQloader-Admin default port is 7070 , Do you want to change the default port ? (y/N)
##################################################################################
JAVA_HOME=/opt/java
BINDIR=/usr/local/sqloader-admin/
LOG_DIR=/var/log/sqloader-admin/
JAR=sqloader-admin-server-1.0.jar
ADMINPORT=7070
HZCLUSTERNAME=sqcluster
##################################################################################
############# SQLoader-Admin Service installed successfuly #######################
##################################################################################
To Start SQLoader-Admin Service: sudo systemctl start sqloader-admin
To View SQLoader-Admin Service status: sudo systemctl status sqloader-admin
##################################################################################
  1. To start the admin server, run the following command:

sudo systemctl start sqloader-admin
  1. To verify admin server start status, run the following command (optional):

sudo systemctl status sqloader-admin
  1. To install SQLoader service, run the following command (you can install per machine):

sudo ./sqloader-v1.sh -service

Output:

##################################################################################
Welocome to SQloader service installation
##################################################################################
Please Enter JAVA_HOME Path
/opt/java
##################################################################################
The Default PATH to install SQloader Service is /usr/local/sqloader
Do you want to change the default? (y/N)
##################################################################################
The default PATH to SQloader Service logs directory is /var/log/sqloader-service
Do you want to change The default? (y/N)
##################################################################################
Please enter SQloader Admin IP address
192.168.5.234
##################################################################################
Please enter SQloader MEM size in GB
20
##################################################################################
Please enter HZCLUSTERNAME
sqcluster
##################################################################################
Default CONFDIR is /usr/local/sqloader/config , Do you want to change the default CONFDIR ? (y/N)
##################################################################################
Default SQloader Admin port is 7070 , Do you want to change the default port ? (y/N)
##################################################################################
Default SQloader Service port is 6060 , Do you want to change the default port ? (y/N)
##################################################################################
Default sqload-jdbc.properties is /usr/local/sqloader/config, Do you want to change the default? (y/N)
Using default sqload-jdbc.properties PATH
/usr/local/sqloader/config
##################################################################################
##################################################################################
Using /usr/local/sqloader/config/sqload-jdbc.properties
##################################################################################
JAVA_HOME=/opt/java
BINDIR=/usr/local/sqloader/bin
LOG_DIR=/var/log/sqloader-service
CONFDIR=/usr/local/sqloader/config
JAR=sqloader-service-8.2.jar
PROPERTIES_FILE=/usr/local/sqloader/config/sqload-jdbc.properties
PORT=6060
ADMINIP=192.168.5.234
ADMINPORT=7070
MEM=20
HZCLUSTERNAME=sqcluster
##################################################################################
############# SQLoader Service installed successfuly #######################
##################################################################################
To Start SQLoader Service: sudo systemctl start sqloader-service
To View SQLoader Service status: sudo systemctl status sqloader-service
##################################################################################
  1. To start the SQLoader service, run the following command:

sudo systemctl start sqloader-service
  1. To verify SQLoader service start status, run the following command (optional):

sudo systemctl status sqloader-service

Reconfiguration

Admin server

You may reconfigure the admin server even after you have started it.

  1. To get the configuration path, run the following command:

cat /usr/lib/systemd/system/sqloader-admin.service | grep 'EnvironmentFile'

Output:

EnvironmentFile=/usr/local/sqloader-admin/config/sqloader_admin.conf
  1. Restart the admin server:

sudo systemctl restart sqloader-admin

SQLoader service

You may reconfigure the SQLoader service even after you have started it.

  1. To get the configuration path, run the following command:

cat /usr/lib/systemd/system/sqloader-service.service | grep 'EnvironmentFile'

Output:

EnvironmentFile=/usr/local/sqloader/config/sqloader_service.conf
  1. Restart the SQLoader service:

sudo systemctl restart sqloader-service

Connection String

It is recommended that the sqload-jdbc.properties file will contain a connection string.

  1. Open the sqload-jdbc.properties file.

  2. Configure connection parameters for:

    1. The source connection string: Greenplum, Microsoft SQL Server, Oracle, Postgresql, SAP HANA, Sybase or Teradata

    2. The target connection string: SQreamDB

    3. The catalog connection string: Greenplum, Microsoft SQL Server, Oracle, Postgresql, SAP HANA, SQreamDB, Sybase, or Teradata

Connection String Parameters

Parameter

Description

HostIp:port

The host and IP address number

database_name

The name of the database from which data is loaded

user

Username of a role to use for connection

password

Specifies the password of the selected role

ssl

Specifies SSL for this connection

Properties File Sample
 1# Postgresql, Oracle, Teradata, SAP HANA, Microsoft SQL Server, Sybase and SQreamDB Connection Strings
 2# (only one source connection string should be specified)
 3
 4# postgres (and also Greenplum)
 5connectionStringSource=jdbc:postgresql://<HostIp:port>/<database_name>?user=<user_name>&password=<password>&ssl=<true/false>
 6
 7# oracle
 8connectionStringSource=jdbc:oracle:thin:@//<HostIp:port>/<database_name>?user=<user_name>&password=<password>&ssl=<true/false>
 9
10# Oracle Autonomous Database
11
12connectionStringSource=jdbc:oracle:thin:@<database_name>?tns_admin=<path_to_oracle_wallet>&user=<user>&password=<password>
13
14# teradata
15connectionStringSource=jdbc:teradata://<HostIp>/DATABASE=<database_name>,DBS_PORT=<port>,user=<user_name>,password=<password>
16
17# sap hana
18connectionStringSource=jdbc:sap://<HostIp>:<port>/?user=<user_name>&password=<password>
19
20# microsoft sql server
21connectionStringSource=jdbc:sqlserver://<HostIp>:<port>;databaseName=<database_name>;user=<user_name>;password=<password>;encrypt=<true/false>;trustServerCertificate=<true/false>
22
23# sybase
24connectionStringSource=jdbc:sybase:Tds:<HostIp>:<port>/<database_name>?user=<user_name>&password=<password>
25
26# sqream
27connectionStringSqream=jdbc:Sqream://<HostIp:port>/<database_name>;cluster=<true/false>;user=<user_name>;password=<password>
28
29
30
31# Catalog Database Parameters
32
33# Connection string (only one catalog connection string should be specified)
34# Catalog database connection string on Oracle:
35connectionStringCatalog=jdbc:oracle:thin:@//<HostIp:port>/<database_name>?user=<user_name>&password=<password>
36
37# Catalog database connection string on SQreamDB:
38connectionStringCatalog=jdbc:Sqream://<HostIp:port>/<database_name>;cluster=<true/false>;user=<user_name>;password=<password>
39
40
41
42# CDC and Incremental Parameters
43cdcCatalogTable=public.CDC_TABLES
44cdcTrackingTable=public.CDC_TRACKING
45cdcPrimaryKeyTable=public.CDC_TABLE_PRIMARY_KEYS
46
47# Summary table
48loadSummaryTable=public.SQLOAD_SUMMARY
49
50
51
52# OPTIONAL - Data transfer options
53filter=1=1
54count=true
55limit=2000
56threadCount=1
57rowid=false
58batchSize=500
59fetchSize=100000
60chunkSize=0
61caseSensitive=false
62truncate=true
63drop=true
64loadTypeName=full
65cdcDelete=true
66usePartitions=false
67lockCheck=false
68lockTable=true
69loadDttm=false
70useDbmsLob=false
71
72.. more flags

SQLoader Service Interface

The SQLoader service automatically detects the IP addresses of incoming HTTP requests, even if the request originates from the same IP address as the one hosting the service. If you are accessing the service using a proxy server, you can include the client IP address in the request itself by using the X-Forwarded-For HTTP header, as in the following example:

curl -X POST -H 'X-Forwarded-For: 192.168.1.2' -H 'Content-Type: application/json' --data '{"loadTypeName": "inc", "sourceSchema": "QA", "sourceTable": "MY_TABLE", "sqreamTable": "MY_TABLE", "sqreamSchema": "QA"}' http://MyPc:6060/load

Supported HTTP Requests

Request Type

Request Name

cURL Command

Description

Example

POST

load

curl --header "Content-Type: application/json" --request POST --data '{}' http://127.0.0.1:6060/load

Sends a request to the service and returns immediately. This HTTP request is utilized within a load-balancing queue shared across multiple instances. This setup ensures efficient resource utilization by distributing incoming load requests evenly across all available instances. Additionally, the system incorporates high availability mechanisms to recover failed jobs in case an instance crashes, ensuring continuous operation and reliability even during instance failures. Note that if all instances crash, at least one instance must remain operational to recover and execute pending jobs.

curl --header "Content-Type: application/json" --request POST --data '{"sourceTable": "AVIV_INC", "sqreamTable": "t_inc", "limit":2000, "loadTypeName":"full"}' http://127.0.0.1:6060/load

POST

syncLoad

curl --header "Content-Type: application/json" --request POST --data '{}' http://127.0.0.1:6060/syncLoad

Sends a request to the service and returns once the request is complete. There’s no load-balancing queue shared across multiple instances; therefore, it’s advised that syncLoad requests be monitored by the user and not heavily sent. Monitor using the getActiveLoads cURL.

curl --header "Content-Type: application/json" --request POST --data '{"sourceTable": "AVIV_INC", "sqreamTable": "t_inc", "limit":2000, "loadTypeName":"full"}' http://127.0.0.1:6060/syncLoad

POST

filterLogs

curl --header "Content-Type: application/json" --request POST --data '{"requestId":"", "outputFilePath": ""}' http://127.0.0.1:6060/filterLogs

Retrieves logs for a specific request ID

curl --header "Content-Type: application/json" --request POST --data '{"requestId":"request-1-6a2884a3", "outputFilePath": "/home/avivs/sqloader_request.log"}' http://127.0.0.1:6060/filterLogs

GET

getActiveLoads

curl --header "Content-Type: application/json" --request GET http://127.0.0.1:6060/getActiveLoads

Returns a list of all active loads currently running across all services

GET

cancelRequest

curl --request GET http://127.0.0.1:6061/cancelRequest/<RequestId>

Cancels an active request by request ID

curl --request GET http://127.0.0.1:6061/cancelRequest/request-2-6aa3c53d

High Availability

SQLoader as a service supports high availability for asynchronous load requests only. When a service crashes, another service will take over the tasks and execute them from the beginning. However, there are some limited cases where high availability will not provide coverage:

  • At least one service must remain operational: After a crash, at least one service must be up and running to ensure that tasks can be recovered and executed.

  • Limitations for specific tasks: When any of the following is configured:

    • A task involving a clustered flag must be set to true to enable high availability.

    • A task involving a full load with truncate=false and drop=false will not rerun to prevent data duplication. In this type of load, data is inserted directly into the target table rather than a temporary table, making it impossible to determine if any data was inserted before the crash.

This setup ensures that asynchronous load requests are handled reliably, even in the event of service failures.

Log Rotation

Log rotation is based on time and size. At midnight (00:00) or when the file reaches 100MB, rotation occurs. Rotation means the log file SQLoader_service.log is renamed to SQLoader_service_%d_%i.log (%d=date, %i=rotation number), and a new, empty SQLoader_service.log file is created for the SQLoader service to continue writing to.

Log Automatic cleanup

The maximum number of archived log files to keep is set to 360, so Logback will retain the latest 360 log files in the logs directory. Additionally, the total file size in the directory is limited to 50 GB. If the total size of archived log files exceeds this limit, older log files will be deleted to make room for new ones.

SQLoader Request Parameters

Mandatory flags must be configured using HTTP flags or the properties file.

HTTP Parameter

State

Default

Description

clustered

Optional

true

This flag is relevant only for load requests (async), not for syncLoad. Note that this flag affects high availability. When set to true: the request is directed to one of the available instances within a cluster, often through a load balancer. When set to false: the request goes directly to the specified host without load balancing.

configFile

Optional

sqload-jdbc.properties

Defines the path to the configuration file you wish to use. If not specified, the service will use the default path provided upon service deployment.

connectionStringSqream

Mandatory

JDBC connection string to SQreamDB

connectionStringSource

Mandatory

JDBC connection string to source database

connectionStringCatalog

Mandatory

JDBC connection string to catalog database

cdcCatalogTable

Optional

Part of the schema within the catalog database. Holds all inc/cdc tables and their settings

cdcTrackingTable

Optional

Part of the schema within the catalog database. Holds the last tracking value for every inc/cdc table from cdcCatalogTable table

cdcPrimaryKeyTable

Optional

Part of the schema within the catalog database. Holds all primary keys for every inc/cdc table from cdcCatalogTable table

loadSummaryTable

Mandatory

Part of the schema within the catalog database. Pre-aggregated table that stores summarized loads which can help monitoring and analyzing load

batchSize

Optional

10.000

The number of records to be inserted into SQreamDB at once. Please note that the configured batch size may impact chunk sizes.

caseSensitive

Optional

false

If true, keeps table name uppercase and lowercase characters when table is created in SQreamDB

checkCdcChain

Optional

false

Check CDC chain between tracking table and source table

chunkSize

Optional

0

The number of records read at once from the source database

columnListFilePath

Optional

The name of the file that contains all column names. Columns must be separated using \n. Expected file type is .txt

columns

Optional

All columns

The name or names of columns to be loaded into SQreamDB (“col1,col2, …”). For column names containing uppercase characters, maintain the uppercase format, avoid using double quotes or apostrophes, and ensure that the caseSensitive parameter is set to true

count

Optional

true

Defines whether or not table rows will be counted before being loaded into SQreamDB

cdcDelete

Optional

true

Defines whether or not loading using Change Data Capture (CDC) includes deleted rows

drop

Optional

false

Defines whether or not a new target table in SQreamDB is created. If false, you will need to configure a target table name using the target parameter

fetchSize

Optional

100000

The number of records to be read at once from source database.

filter

Optional

1=1

Defines whether or not only records with SQL conditions are loaded

h, help

Optional

Displays the help menu and exits

limit

Optional

0 (no limit)

Limits the number of rows to be loaded

loadDttm

Optional

true

Add an additional loadDttm column that defines the time and date of loading

loadDttmColumnName

Optional

sq_load_dttm

Specifies the name of the additional column that records the time and date of loading. This parameter works in conjunction with the loadDttm parameter. If loadDttm is enabled, the column defined by loadDttmColumnName will be added to the target table.

loadTypeName

Optional

full

Defines a loading type that affects the table that is created in SQreamDB. Options are full, cdc, or inc. Please note that cdc is supported for Oracle only and that inc is supported for Oracle and Postgresql

lockCheck

Optional

true

Defines whether or not SQLoader will check source table is locked before the loading starts

lockTable

Optional

true

Defines whether or not SQLoader will lock target table before the loading starts

partitionName

Optional

Specifies the number of table partitions. If configured, partition ensures that data is loaded according to the specified partition. You may configure the thread parameter for parallel loading of your table partitions. If you do, please ensure that the number of threads does not exceed the number of partitions.

port

Optional

6060

rowid

Optional

false

Defines whether or not SQLoader will get row IDs from Oracle tables

sourceDatabaseName

Optional

ORCL

Defines the source database name. It does not modify the database connection string but impacts the storage and retrieval of data within catalog tables.

splitByColumn

Optional

Column name for split (required for multi-thread loads)

sourceSchema

Mandatory

Source schema name to load data from

sourceTable

Mandatory

Source table name to load data from

sqreamSchema

Optional

The schema name defined in the sourceSchema flag

Target schema name to load data into

sqreamTable

Optional

The table name defined in the sourceTable flag

Target table name to load data into

threadCount

Optional

1

Number of threads to use for loading. Using multiple threads can significantly improve the loading performance, especially when dealing with columns that have metadata statistics (e.g., min/max values). SQLoader will automatically divide the data into batches based on the specified thread number, allowing for parallel processing. You may use thread both for tables that are partitioned and tables that are not. See Sizing Guidelines

truncate

Optional

false

Truncate target table before loading

typeMappingPath

Optional

config/sqream-mapping.json

A mapping file that converts source data types into SQreamDB data types.

useDbmsLob

Optional

true

Defines whether or not SQLoader uses dbms_lob_substr function for CLOB and BLOB data types

usePartitions

Optional

true

Defines whether or not SQLoader uses partitions in SELECT statements

validateSourceTable

Optional

true

Allows control over the validation of table existence during the load.

Using the loadTypeName Parameter

Using the loadTypeName parameter, you can define how you wish records’ changes to be made to data in order to track inserts, updates, and deletes for data synchronization and auditing purposes.

Loading Type

Parameter Option

Description

Full Table

full

The entire data of the source table is loaded into SQreamDB

Change Data Capture (CDC)

cdc

Only changes made to the source table data since last load will be loaded into SQreamDB. Changes include transactions of INSERT, UPDATE, and DELETE statements. SQLoader recognizes tables by table name and metadata. Supported for Oracle only

Incremental

inc

Only changes made to the source table data since last load will be loaded into SQreamDB. Changes include transactions of INSERT statement. SQLoader recognizes the table by table name and metadata. Supported for Oracle and Postgresql

Using the SQLoader Service Web Interface

The SQLoader Admin Server is a web-based administration tool specifically designed to manage and monitor the SQLoader service. It provides a user-friendly interface for monitoring data loading processes, managing configurations, and troubleshooting issues related to data loading into SQreamDB.

SQLoader Service Web Interface Features

  • Monitor Services:

    • Health Checks: Monitor the health status of services to ensure they are functioning properly.

    • Metrics: Monitor real-time performance metrics, including CPU usage, memory usage, and response times.

    • Logging: View logs generated by services for troubleshooting and debugging purposes, and dynamically modify log levels during runtime to adjust verbosity for troubleshooting or performance monitoring.

  • Manage Active Load Requests:

    • View a list of currently active data loading requests, including their status, progress, and relevant metadata.

Creating Summary and Catalog Tables

The summary and catalog tables are pre-aggregated tables that store summarized or aggregated data.

Creating a Summary Table

The summary table is part of the schema within the database catalog.

The following summary table DDL uses Oracle syntax.

Note

If you are migrating from SQLoader as a process to SQLoader as a service, as described on this page, it is highly recommended that you add the following column to your existing summary table instead of re-creating it.

REQUEST_ID TEXT (200 BYTE) VISIBLE DEFAULT NULL
CLIENT_IP TEXT (200 BYTE) VISIBLE DEFAULT NULL
REQUESTED_HOST TEXT (200 BYTE) VISIBLE DEFAULT NULL
ACQUIRED_HOST TEXT (200 BYTE) VISIBLE DEFAULT NULL
CREATE TABLE public.SQLOAD_SUMMARY (
  DB_NAME TEXT(200 BYTE) VISIBLE,
  SCHEMA_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME_FULL TEXT(200 BYTE) VISIBLE,
  LOAD_TYPE TEXT(200 BYTE) VISIBLE,
  UPDATED_DTTM_FROM DATE VISIBLE,
  UPDATED_DTTM_TO DATE VISIBLE,
  LAST_VAL_INT NUMBER(22,0) VISIBLE,
  LAST_VAL_TS TIMESTAMP(6) VISIBLE,
  START_TIME TIMESTAMP(6) VISIBLE,
  FINISH_TIME TIMESTAMP(6) VISIBLE,
  ELAPSED_SEC NUMBER VISIBLE,
  ROW_COUNT NUMBER VISIBLE,
  SQL_FILTER TEXT(200 BYTE) VISIBLE,
  PARTITION TEXT(200 BYTE) VISIBLE,
  STMT_TYPE TEXT(200 BYTE) VISIBLE,
  STATUS TEXT(200 BYTE) VISIBLE,
  LOG_FILE TEXT(200 BYTE) VISIBLE,
  DB_URL TEXT(200 BYTE) VISIBLE,
  PARTITION_COUNT NUMBER VISIBLE DEFAULT 0,
  THREAD_COUNT NUMBER VISIBLE DEFAULT 1,
  ELAPSED_MS NUMBER VISIBLE DEFAULT 0,
  STATUS_CODE NUMBER VISIBLE DEFAULT 0,
  ELAPSED_SOURCE_MS NUMBER(38,0) DEFAULT NULL,
  ELAPSED_SOURCE_SEC NUMBER(38,0) DEFAULT NULL,
  ELAPSED_TARGET_MS NUMBER(38,0) DEFAULT NULL,
  ELAPSED_TARGET_SEC NUMBER(38,0) DEFAULT NULL,
  TARGET_DB_URL TEXT (200 BYTE) DEFAULT NULL,
  SQLOADER_VERSION TEXT (200 BYTE) DEFAULT NULL,
  CLIENT_IP TEXT (200 BYTE) DEFAULT NULL,
  REQUESTED_HOST TEXT (200 BYTE) DEFAULT NULL,
  ACQUIRED_HOST TEXT (200 BYTE) DEFAULT NULL,
  REQUEST_ID TEXT (200 BYTE) VISIBLE DEFAULT NULL
);

Creating Catalog Tables

CDC (Change Data Capture) and Incremental tables are database tables that record changes made to data in order to track inserts, updates, and deletes for data synchronization and auditing purposes.

See Using the loadTypeName Parameter

Change Data Capture (CDC) and Incremental tables are supported only for Oracle.

CREATE TABLE public.CDC_TABLES (
  DB_NAME TEXT(200 BYTE) VISIBLE,
  SCHEMA_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME_FULL TEXT(200 BYTE) VISIBLE,
  TABLE_NAME_CDC TEXT(200 BYTE) VISIBLE,
  INC_COLUMN_NAME TEXT(200 BYTE) VISIBLE,
  INC_COLUMN_TYPE TEXT(200 BYTE) VISIBLE,
  LOAD_TYPE TEXT(200 BYTE) VISIBLE,
  FREQ_TYPE TEXT(200 BYTE) VISIBLE,
  FREQ_INTERVAL NUMBER(22,0) VISIBLE,
  IS_ACTIVE NUMBER VISIBLE DEFAULT 0,
  STATUS_LOAD NUMBER VISIBLE DEFAULT 0,
  INC_GAP_VALUE NUMBER VISIBLE DEFAULT 0
);

CREATE TABLE public.CDC_TRACKING (
  DB_NAME TEXT(200 BYTE) VISIBLE,
  SCHEMA_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME_FULL TEXT(200 BYTE) VISIBLE,
  LAST_UPDATED_DTTM DATE VISIBLE,
  LAST_VAL_INT NUMBER(22,0) VISIBLE DEFAULT 0,
  LAST_VAL_TS TIMESTAMP(6) VISIBLE,
  LAST_VAL_DT DATE VISIBLE
);

CREATE TABLE public.CDC_TABLE_PRIMARY_KEYS (
  DB_NAME TEXT(200 BYTE) VISIBLE,
  SCHEMA_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME_FULL TEXT(200 BYTE) VISIBLE,
  CONSTRAINT_NAME TEXT(200 BYTE) VISIBLE,
  COLUMN_NAME TEXT(200 BYTE) VISIBLE,
  IS_NULLABLE NUMBER VISIBLE DEFAULT 0
);

Data Type Mapping

Automatic Mapping

The SQLoader automatically maps data types used in Greenplum, Microsoft SQL Server, Oracle, Postgresql, Sybase, SAP HANA, and Teradata tables that are loaded into SQreamDB.

Greenplum

Greenplum Type

SQreamDB Type

CHAR, VARCHAR, CHARACTER

TEXT

TEXT

TEXT

INT, SMALLINT, BIGINT, INT2, INT4, INT8

BIGINT

DATETIME, TIMESTAMP

DATETIME

DATE

DATE

BIT, BOOL

BOOL

DECIMAL, NUMERIC

NUMERIC

FLOAT, DOUBLE

DOUBLE

REAL, FLOAT4

REAL

Microsoft SQL Server

Microsoft SQL Server Type

SQreamDB Type

CHAR, NCHAR, VARCHAR, NVARCHAR, NVARCHAR2, CHARACTER, TEXT, NTEXT

TEXT

BIGINT, INT, SMALLINT, INT, TINYINT

BIGINT

DATETIME, TIMESTAMP, SMALLDATETIME, DATETIMEOFFSET, DATETIME2

DATETIME

DATE

DATE

BIT

BOOL

DECIMAL, NUMERIC

NUMERIC

FLOAT, DOUBLE

DOUBLE

REAL

REAL

VARBINARY

TEXT

Oracle

Oracle Type

SQreamDB Type

BIGINT, INT, SMALLINT, INTEGE

BIGINT

CHAR, NCHAR, VARCHAR, VARCHAR2, NVARCHAR, CHARACTER

TEXT

DATE, DATETIME

DATETIME

TIMESTAMP

DATETIME

DATE

DATE

BOOLEAN

BOOL

NUMERIC

NUMERIC

FLOAT, DOUBLE

DOUBLE

CLOB

TEXT

BLOB

TEXT

RAW

TEXT

Postgresql

Postgresql Type

SQreamDB Type

CHAR, VARCHAR, CHARACTER

TEXT

TEXT

TEXT

INT, SMALLINT, BIGINT, INT2, INT4, INT8

BIGINT

DATETIME, TIMESTAMP

DATETIME

DATE

DATE

BIT, BOOL

BOOL

DECIMAL, NUMERIC

NUMERIC

FLOAT, DOUBLE

DOUBLE

REAL, FLOAT4

REAL

SAP HANA

SAP HANA Type

SQreamDB Type

BIGINT, INT, SMALLINT, INTEGER, TINYINT

BIGINT

CHAR, VARCHAR, NVARCHAR, TEXT, VARCHAR2, NVARCHAR2

TEXT

DATETIME, TIMESTAMP, SECONDDATE

DATETIME

DATE

DATE

BOOLEAN

TEXT

DECIMAL, SMALLDECIMAL, BIGDECIMAL

NUMERIC

DOUBLE, REAL

FLOAT

TEXT

TEXT

BIGINT

BIGINT

INT

INT

SMALLINT

SMALLINT

TINYINT

TINYINT

DATETIME

DATETIME

DATE

DATE

BOOL

BOOL

NUMERIC

NUMERIC

DOUBLE

DOUBLE

FLOAT

FLOAT

REAL

REAL

Sybase

Sybase Type

SQreamDB Type

CHAR, VARCHAR, LONG VARCHAR, CHARACTER, TEXT

TEXT

TINYINT

TINYINT

SMALLINT

SMALLINT

INT, INTEGER

INT

BIGINT

BIGINT

DECIMAL, NUMERIC

NUMERIC

NUMERIC(126,38)

NUMERIC(38,10)

FLOAT, DOUBLE

DOUBLE

DATE

DATE

DATETIME, TIMESTAMP, TIME

DATETIME

BIT

BOOL

VARBINARY, BINARY, LONG BINARY

TEXT

Teradata

Teradata Type

SQreamDB Type

F

DOUBLE

N, D

NUMERIC

CO

TEXT

BO

TEXT

A1, AN, AT, BF, BV, CF, CV, JN, PD, PM, PS, PT, PZ, SZ, TZ

TEXT

I, I4, I(4)

INT

I2, I(2)

SMALLINT

I1, I(1)

TINYINT

DH, DM, DS, DY, HM, HS, HR, I8, MO, MS, MI, SC, YM, YR

BIGINT

TS, DATETIME

DATETIME

DA

DATE

BIT

BOOL

REAL, DOUBLE

DOUBLE

Manually Adjusting Mapping

You have the possibility to adjust the mapping process according to your specific needs, using any of the following methods.

names Method

To specify that you want to map one or more columns in your table to a specific data type, duplicate the code block which maps to the SQreamDB data type you want and include the names parameter in your code block. The SQLoader will map the specified columns to the specified SQreamDB data type. After the specified columns are mapped, the SQLoader continue to search for how to convert other data types to the same data type of the specified columns.

In this example, column1, column2, and column3 are mapped to BIGINT and the Oracle data types BIGINT, INT, SMALLINT, INTEGER are also mapped to BIGINT.

{
  "oracle": [
        {
          "names": ["column1", "column2", "column3"],
          "sqream": "bigint",
          "java": "int",
          "length": false
        },
        {
          "type": ["bigint","int","smallint","integer"],
          "sqream": "bigint",
          "java": "int",
          "length": false
        }
}