SQLoader As a Service

SQLoader is a Java service that enables you to ingest data into SQreamDB from other DBMS and DBaaS platforms through HTTP requests, using network insert.

SQLoader supports ingesting data from the following DBMSs:

Greenplum
Microsoft SQL Server
Oracle (including Oracle Autonomous Database)
Postgresql
SAP HANA
Sybase
Teradata
SQreamDB 4.5.15 or later (Full Load only)

Before You Begin 

It is essential that you have the following:

Java 17
SQLoader configuration files
SQLoader.jar file

Minimum Hardware Requirements

Component	Minimum
CPU cores	16
RAM	32GB

Sizing Guidelines

SQLoader sizing is determined by the number of concurrent tables and threads that the available CPU cores can support. The limit is the number of cores minus one, with the remaining core reserved for the operating system. Each SQLoader request runs on a single table, so importing multiple tables concurrently requires multiple requests. For partitioned tables, each partition consumes a thread, so take the table's partition count into account when allocating threads.

Compute formula for the recommended maximum heap size in GB (see the Xmx deployment parameter): \(⌊ 0.8 * (TotalMemory - 4) ⌋\)
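
The thread budget described above can be sketched in Python as follows. This is only an illustration and not part of SQLoader: the function names are hypothetical, and treating a non-partitioned table as one thread is an assumption made for the sketch.

```python
def max_loader_threads(cpu_cores: int) -> int:
    """Thread budget: all CPU cores minus one core reserved for the operating system."""
    if cpu_cores < 2:
        raise ValueError("at least two cores are needed to reserve one for the OS")
    return cpu_cores - 1


def threads_needed(partition_counts: list[int]) -> int:
    """Sum the threads for a set of concurrent table loads.

    Each entry is the partition count of one table (one request per table).
    A partitioned table consumes one thread per partition; a count of 0 or 1
    is treated as a single thread (assumption for this sketch).
    """
    return sum(max(1, count) for count in partition_counts)


def fits_thread_budget(cpu_cores: int, partition_counts: list[int]) -> bool:
    """Return True if the planned concurrent loads stay within the thread budget."""
    return threads_needed(partition_counts) <= max_loader_threads(cpu_cores)


# Example: a 16-core machine loading three tables with 4, 0, and 8 partitions.
print(fits_thread_budget(16, [4, 0, 8]))
```

A check like this can be run before submitting several requests at once, so that the combined partition count does not exceed the cores available to SQLoader.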

Installation and Connectivity 

Getting All Configuration and JAR Files

Download the SQLoader zip file:

https://storage.cloud.google.com/cicd-storage/sqloader_release/sqloader-release-v1.1.zip

Extract the .tar file using the following command:

tar -xf sqloader_srv_v8.2.tar.gz

A folder named sqloader with the following files is created:

├── sqloader-v1.sh
├── bin
│   ├── sqloader-admin-server-1.1.jar
│   └── sqloader-service-8.2.jar
└── config
    ├── reserved_words.txt
    ├── sqload-jdbc.properties
    └── sqream-mapping.json

File Name	Description
`sqream-mapping.json`	Maps foreign DBMS and DBaaS data types into SQreamDB data types during ingestion
`sqload-jdbc.properties`	Used for defining a connection string and may also be used to reconfigure data loading
`reserved_words.txt`	A list of reserved words which cannot be used as table and/or column names.
`sqloader-service-8.2.jar`	The SQLoader service JAR file
`sqloader-admin-server-1.0.jar`	The SQLoader admin server JAR file
`sqloader-v1.sh`	SQLoader service installer bash file

Installation

Deployment Parameters

When using the sqloader-v1.sh file (installer), the following flags are already configured.

Deployment flags cannot be adjusted dynamically at runtime.

Parameter	State	Default	Example	Description
`configDir`	Optional	`config`	`java -jar sqloader-service-8.2.jar --configDir=</path/to/directory/>`	Defines the path to the folder containing both the data type mapping and the reserved words files. The defined folder must contain both files; otherwise, you will receive an error. This flag affects the mapping and reserved words files only and does not affect the properties file
`hzClusterName`	Optional		`java -jar sqloader-service-8.2.jar --hzClusterName=<TEXT>`	In Hazelcast, a cluster refers to a group of connected Hazelcast instances across different JVMs or machines. By default, these instances connect to the same cluster on the network level, meaning that all SQLoader services that start on a network will connect to each other and share the same queue. An admin can connect to only one Hazelcast cluster at a time. If you start multiple clusters and want to connect them to the admin service, you will need to start multiple admin services, with each service connecting to one of your clusters. It is essential that this flag has the same value for the admin server and across all SQLoader instances that share a cluster.
`LOG_DIR`	Optional	`logs`	`java -jar -DLOG_DIR=/path/to/log/directory sqloader-service-8.2.jar`	Defines the path of log directory created when loading data. If no value is specified, a `logs` folder is created under the same location as the `sqloader.jar` file
`spring.boot.admin.client.url`	Optional	`http://localhost:7070`	`java -jar sqloader-service-8.2.jar --spring.boot.admin.client.url=http://IP:PORT`	SQLoader admin server connection flag
`Xmx`	Optional		`java -jar -Xmx<number>g sqloader-service-8.2.jar`	We recommend using the `Xmx` flag to set the maximum heap memory allocation for the service. If a single service is running on the machine, we suggest allocating 80% of the total memory minus approximately 4GB, which the service typically needs on average. If multiple services are running on the same machine, calculate the recommended heap size for one service and then divide it by the number of services. Compute formula: \(⌊ 0.8 * (TotalMemory - 4) ⌋\)
`DEFAULT_PROPERTIES`	Mandatory	`sqload-jdbc.properties`	`java -jar -DDEFAULT_PROPERTIES=/path/to/file/sqload-jdbc.properties sqloader-service-8.2.jar`	When the service initializes, it looks for the variable DEFAULT_PROPERTIES, which corresponds to the default sqload-jdbc.properties file. Once the service is running with a specified properties file, this setting will remain unchanged as long as the service is operational. To modify it, you must shut down the service, edit the properties file, and then restart the service. Alternatively, you can modify it via a POST request, but this change will only affect the specific load request and not the default setting for all requests.
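
The heap recommendation and the deployment flags in the table above can be combined as in the following Python sketch. It is an illustration under stated assumptions, not an official launcher: the function names are hypothetical, rounding down when dividing between services is an assumption, and putting all flags on one command line simply follows the order used in the table's individual examples.

```python
def recommended_heap_gb(total_memory_gb: int, services: int = 1) -> int:
    """floor(0.8 * (TotalMemory - 4)) for one service, divided between services."""
    if services < 1:
        raise ValueError("services must be at least 1")
    single = (8 * (total_memory_gb - 4)) // 10  # integer form of floor(0.8 * x)
    heap = single // services  # rounding down is an assumption of this sketch
    if heap < 1:
        raise ValueError("not enough memory for the requested number of services")
    return heap


def service_command(total_memory_gb: int, properties_file: str, config_dir: str,
                    cluster_name: str, log_dir: str = "logs",
                    admin_url: str = "http://localhost:7070",
                    services: int = 1) -> list[str]:
    """Build the argument list for starting one SQLoader service."""
    heap = recommended_heap_gb(total_memory_gb, services)
    return [
        "java", "-jar",
        f"-Xmx{heap}g",
        f"-DDEFAULT_PROPERTIES={properties_file}",
        f"-DLOG_DIR={log_dir}",
        "sqloader-service-8.2.jar",
        f"--configDir={config_dir}",
        f"--hzClusterName={cluster_name}",
        f"--spring.boot.admin.client.url={admin_url}",
    ]


# Example: a single service on a machine with 32 GB of memory.
print(" ".join(service_command(
    32,
    "/usr/local/sqloader/config/sqload-jdbc.properties",
    "/usr/local/sqloader/config",
    "sqcluster",
)))
```

When the installer script is used, these flags are already configured, so a sketch like this is mainly useful for checking the heap value to enter at the memory prompt.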

Installing the Admin Server and SQLoader Service

To install the admin server, run the following command (install it only once on one machine):

sudo ./sqloader-v1.sh -admin

Output:

##################################################################################
Welcome to SQloader Admin-Service installation
##################################################################################
Please Enter JAVA_HOME PATH
/opt/java
##################################################################################
The default PATH to install SQloader Admin Service is /usr/local/sqloader-admin
Do you want to change the default PATH ? (y/N)
##################################################################################
The default PATH to SQloader-Admin logs directory is /var/log/sqloader-admin/logs
Do you want to change the default? (y/N)
##################################################################################
Please enter HZCLUSTERNAME
sqcluster
##################################################################################
SQloader-Admin default port is 7070 , Do you want to change the default port ? (y/N)
##################################################################################
JAVA_HOME=/opt/java
BINDIR=/usr/local/sqloader-admin/
LOG_DIR=/var/log/sqloader-admin/
JAR=sqloader-admin-server-1.0.jar
ADMINPORT=7070
HZCLUSTERNAME=sqcluster
##################################################################################
############# SQLoader-Admin Service installed successfuly #######################
##################################################################################
To Start SQLoader-Admin Service: sudo systemctl start sqloader-admin
To View SQLoader-Admin Service status: sudo systemctl status sqloader-admin
##################################################################################

To start the admin server, run the following command:

sudo systemctl start sqloader-admin

To verify admin server start status, run the following command (optional):

sudo systemctl status sqloader-admin

To install SQLoader service, run the following command (you can install per machine):

sudo ./sqloader-v1.sh -service

Output:

##################################################################################
Welocome to SQloader service installation
##################################################################################
Please Enter JAVA_HOME Path
/opt/java
##################################################################################
The Default PATH to install SQloader Service is /usr/local/sqloader
Do you want to change the default? (y/N)
##################################################################################
The default PATH to SQloader Service logs directory is /var/log/sqloader-service
Do you want to change The default? (y/N)
##################################################################################
Please enter SQloader Admin IP address
192.168.5.234
##################################################################################
Please enter SQloader MEM size in GB
20
##################################################################################
Please enter HZCLUSTERNAME
sqcluster
##################################################################################
Default CONFDIR is /usr/local/sqloader/config , Do you want to change the default CONFDIR ? (y/N)
##################################################################################
Default SQloader Admin port is 7070 , Do you want to change the default port ? (y/N)
##################################################################################
Default SQloader Service port is 6060 , Do you want to change the default port ? (y/N)
##################################################################################
Default sqload-jdbc.properties is /usr/local/sqloader/config, Do you want to change the default? (y/N)
Using default sqload-jdbc.properties PATH
/usr/local/sqloader/config
##################################################################################
##################################################################################
Using /usr/local/sqloader/config/sqload-jdbc.properties
##################################################################################
JAVA_HOME=/opt/java
BINDIR=/usr/local/sqloader/bin
LOG_DIR=/var/log/sqloader-service
CONFDIR=/usr/local/sqloader/config
JAR=sqloader-service-8.2.jar
PROPERTIES_FILE=/usr/local/sqloader/config/sqload-jdbc.properties
PORT=6060
ADMINIP=192.168.5.234
ADMINPORT=7070
MEM=20
HZCLUSTERNAME=sqcluster
##################################################################################
############# SQLoader Service installed successfuly #######################
##################################################################################
To Start SQLoader Service: sudo systemctl start sqloader-service
To View SQLoader Service status: sudo systemctl status sqloader-service
##################################################################################

To start the SQLoader service, run the following command:

sudo systemctl start sqloader-service

To verify SQLoader service start status, run the following command (optional):

sudo systemctl status sqloader-service

Reconfiguration

Admin server

You may reconfigure the admin server even after you have started it.

To get the configuration path, run the following command:

cat /usr/lib/systemd/system/sqloader-admin.service | grep 'EnvironmentFile'

Output:

EnvironmentFile=/usr/local/sqloader-admin/config/sqloader_admin.conf

Edit the configuration file as needed, and then restart the admin server:

sudo systemctl restart sqloader-admin

SQLoader service

You may reconfigure the SQLoader service even after you have started it.

To get the configuration path, run the following command:

cat /usr/lib/systemd/system/sqloader-service.service | grep 'EnvironmentFile'

Output:

EnvironmentFile=/usr/local/sqloader/config/sqloader_service.conf

Edit the configuration file as needed, and then restart the SQLoader service:

sudo systemctl restart sqloader-service

Connection String

We recommend defining the connection strings in the sqload-jdbc.properties file.

Open the sqload-jdbc.properties file.
Configure connection parameters for:
1. The source connection string: Greenplum, Microsoft SQL Server, Oracle, Postgresql, SAP HANA, Sybase or Teradata
2. The target connection string: SQreamDB
3. The catalog connection string: Greenplum, Microsoft SQL Server, Oracle, Postgresql, SAP HANA, SQreamDB, Sybase, or Teradata

Connection String Parameters
Parameter	Description
`HostIp:port`	The host IP address and port number
`database_name`	The name of the database from which data is loaded
`user`	Username of a role to use for connection
`password`	Specifies the password of the selected role
`ssl`	Specifies SSL for this connection
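
As an illustration of how these parameters fit together, the following Python sketch assembles a PostgreSQL source string and a SQreamDB target string in the formats shown in the properties file sample. The function names and the example values are hypothetical, and the sketch assumes that user names and passwords contain no characters that need escaping (such as `&`, `;`, or `=`).

```python
def postgres_source_url(host: str, port: int, database: str,
                        user: str, password: str, ssl: bool = False) -> str:
    """Build a connectionStringSource value for a PostgreSQL (or Greenplum) source."""
    return (f"jdbc:postgresql://{host}:{port}/{database}"
            f"?user={user}&password={password}&ssl={str(ssl).lower()}")


def sqream_target_url(host: str, port: int, database: str,
                      user: str, password: str, cluster: bool = False) -> str:
    """Build a connectionStringSqream value for the SQreamDB target."""
    return (f"jdbc:Sqream://{host}:{port}/{database}"
            f";cluster={str(cluster).lower()};user={user};password={password}")


# Example values are placeholders, not defaults of any product.
print("connectionStringSource=" + postgres_source_url(
    "10.0.0.5", 5432, "sales", "loader", "secret", ssl=True))
print("connectionStringSqream=" + sqream_target_url(
    "10.0.0.9", 5000, "master", "sqream", "secret"))
```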

Properties File Sample

# Postgresql, Oracle, Teradata, SAP HANA, Microsoft SQL Server, Sybase and SQreamDB Connection Strings
# (only one source connection string should be specified)

# postgres (and also Greenplum)
connectionStringSource=jdbc:postgresql://<HostIp:port>/<database_name>?user=<user_name>&password=<password>&ssl=<true/false>

# oracle
connectionStringSource=jdbc:oracle:thin:@//<HostIp:port>/<database_name>?user=<user_name>&password=<password>&ssl=<true/false>

# Oracle Autonomous Database

connectionStringSource=jdbc:oracle:thin:@<database_name>?tns_admin=<path_to_oracle_wallet>&user=<user>&password=<password>

# teradata
connectionStringSource=jdbc:teradata://<HostIp>/DATABASE=<database_name>,DBS_PORT=<port>,user=<user_name>,password=<password>

# sap hana
connectionStringSource=jdbc:sap://<HostIp>:<port>/?user=<user_name>&password=<password>

# microsoft sql server
connectionStringSource=jdbc:sqlserver://<HostIp>:<port>;databaseName=<database_name>;user=<user_name>;password=<password>;encrypt=<true/false>;trustServerCertificate=<true/false>

# sybase
connectionStringSource=jdbc:sybase:Tds:<HostIp>:<port>/<database_name>?user=<user_name>&password=<password>

# sqream
connectionStringSqream=jdbc:Sqream://<HostIp:port>/<database_name>;cluster=<true/false>;user=<user_name>;password=<password>



# Catalog Database Parameters

# Connection string (only one catalog connection string should be specified)
# Catalog database connection string on Oracle:
connectionStringCatalog=jdbc:oracle:thin:@//<HostIp:port>/<database_name>?user=<user_name>&password=<password>

# Catalog database connection string on SQreamDB:
connectionStringCatalog=jdbc:Sqream://<HostIp:port>/<database_name>;cluster=<true/false>;user=<user_name>;password=<password>



# CDC and Incremental Parameters
cdcCatalogTable=public.CDC_TABLES
cdcTrackingTable=public.CDC_TRACKING
cdcPrimaryKeyTable=public.CDC_TABLE_PRIMARY_KEYS

# Summary table
loadSummaryTable=public.SQLOAD_SUMMARY



# OPTIONAL - Data transfer options
filter=1=1
count=true
limit=2000
threadCount=1
rowid=false
batchSize=500
fetchSize=100000
chunkSize=0
caseSensitive=false
truncate=true
drop=true
loadTypeName=full
cdcDelete=true
usePartitions=false
lockCheck=false
lockTable=true
loadDttm=false
useDbmsLob=false

.. more flags
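
Because the sample lists several alternative connection strings under the same key, it can help to check an edited file for accidental duplicates before starting the service. The following Python sketch is a simplified reader written for this purpose only. It is not the parser SQLoader uses, it ignores features of the Java properties format such as line continuations, and its function names are hypothetical.

```python
from collections import Counter


def parse_properties(text: str) -> list[tuple[str, str]]:
    """Return (key, value) pairs, keeping duplicates and skipping blanks and '#' comments."""
    pairs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split once so 'filter=1=1' keeps '1=1'
        pairs.append((key.strip(), value.strip()))
    return pairs


def duplicate_keys(pairs: list[tuple[str, str]]) -> list[str]:
    """Return the keys that appear more than once, in sorted order."""
    counts = Counter(key for key, _ in pairs)
    return sorted(key for key, count in counts.items() if count > 1)


sample = """
# postgres (and also Greenplum)
connectionStringSource=jdbc:postgresql://<HostIp:port>/<database_name>
# oracle
connectionStringSource=jdbc:oracle:thin:@//<HostIp:port>/<database_name>
filter=1=1
threadCount=1
"""

pairs = parse_properties(sample)
print(duplicate_keys(pairs))
```

Any key reported by `duplicate_keys` should be reduced to a single entry, since only one source and one catalog connection string should be specified.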

SQLoader Service Interface 

The SQLoader service automatically detects the IP addresses of incoming HTTP requests, even if the request originates from the same IP address as the one hosting the service. If you are accessing the service using a proxy server, you can include the client IP address in the request itself by using the X-Forwarded-For HTTP header, as in the following example:

curl -X POST -H 'X-Forwarded-For: 192.168.1.2' -H 'Content-Type: application/json' --data '{"loadTypeName": "inc", "sourceSchema": "QA", "sourceTable": "MY_TABLE", "sqreamTable": "MY_TABLE", "sqreamSchema": "QA"}' http://MyPc:6060/load

Supported HTTP Requests

Request Type	Request Name	cURL Command	Description	Example
POST	`load`	`curl --header "Content-Type: application/json" --request POST --data '{}' http://127.0.0.1:6060/load`	Sends a request to the service and returns immediately. This HTTP request is placed in a load-balancing queue shared across multiple instances, which distributes incoming load requests evenly across all available instances. In addition, high availability mechanisms recover failed jobs if an instance crashes, so operation continues during instance failures. Note that at least one instance must remain operational to recover and execute pending jobs.	`curl --header "Content-Type: application/json" --request POST --data '{"sourceTable": "AVIV_INC", "sqreamTable": "t_inc", "limit":2000, "loadTypeName":"full"}' http://127.0.0.1:6060/load`
POST	`syncLoad`	`curl --header "Content-Type: application/json" --request POST --data '{}' http://127.0.0.1:6060/syncLoad`	Sends a request to the service and returns once the request is complete. `syncLoad` requests do not use the load-balancing queue shared across multiple instances; therefore, monitor them and avoid sending many of them at once. Monitor using the `getActiveLoads` cURL.	`curl --header "Content-Type: application/json" --request POST --data '{"sourceTable": "AVIV_INC", "sqreamTable": "t_inc", "limit":2000, "loadTypeName":"full"}' http://127.0.0.1:6060/syncLoad`
POST	`filterLogs`	`curl --header "Content-Type: application/json" --request POST --data '{"requestId":"", "outputFilePath": ""}' http://127.0.0.1:6060/filterLogs`	Retrieves logs for a specific request ID	`curl --header "Content-Type: application/json" --request POST --data '{"requestId":"request-1-6a2884a3", "outputFilePath": "/home/avivs/sqloader_request.log"}' http://127.0.0.1:6060/filterLogs`
GET	`getActiveLoads`	`curl --header "Content-Type: application/json" --request GET http://127.0.0.1:6060/getActiveLoads`	Returns a list of all active loads currently running across all services
GET	`cancelRequest`	`curl --request GET http://127.0.0.1:6061/cancelRequest/<RequestId>`	Cancels an active request by request ID	`curl --request GET http://127.0.0.1:6061/cancelRequest/request-2-6aa3c53d`
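
The requests in the table above can also be built programmatically. The following Python sketch uses only the standard library and mirrors the cURL commands. The helper name `build_request` and its arguments are hypothetical, and the response format is not assumed, so the sketch stops at building the request.

```python
import json
import urllib.request

# HTTP method for each request name, as listed in the table above.
ENDPOINTS = {
    "load": "POST",
    "syncLoad": "POST",
    "filterLogs": "POST",
    "getActiveLoads": "GET",
    "cancelRequest": "GET",
}


def build_request(base_url, name, payload=None, request_id=None, client_ip=None):
    """Build (but do not send) a request for one of the SQLoader endpoints."""
    method = ENDPOINTS[name]
    url = f"{base_url}/{name}"
    if name == "cancelRequest":
        if not request_id:
            raise ValueError("cancelRequest needs a request ID")
        url = f"{url}/{request_id}"
    headers = {}
    data = None
    if method == "POST":
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload or {}).encode("utf-8")
    if client_ip:
        # Optional: pass the original client IP when going through a proxy.
        headers["X-Forwarded-For"] = client_ip
    return urllib.request.Request(url, data=data, headers=headers, method=method)


load = build_request(
    "http://127.0.0.1:6060", "load",
    payload={"sourceTable": "AVIV_INC", "sqreamTable": "t_inc",
             "limit": 2000, "loadTypeName": "full"},
    client_ip="192.168.1.2",
)
cancel = build_request("http://127.0.0.1:6060", "cancelRequest",
                       request_id="request-2-6aa3c53d")

# Sending requires a running service, so it is left commented out:
# with urllib.request.urlopen(load) as response:
#     print(response.status, response.read().decode())
```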

High Availability

SQLoader as a service supports high availability for asynchronous load requests only. When a service crashes, another service will take over the tasks and execute them from the beginning. However, there are some limited cases where high availability will not provide coverage:

At least one service must remain operational: After a crash, at least one service must be up and running to ensure that tasks can be recovered and executed.
Limitations for specific tasks: High availability does not cover a task in either of the following cases:
- The clustered flag is set to false. It must be set to true to enable high availability.
- The task is a full load with truncate=false and drop=false. Such a task is not rerun, to prevent data duplication. In this type of load, data is inserted directly into the target table rather than a temporary table, making it impossible to determine if any data was inserted before the crash.

This setup ensures that asynchronous load requests are handled reliably, even in the event of service failures.
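
The coverage rules above can be summarized as a small decision function. The following Python sketch is illustrative only: the function name is hypothetical, and it assumes that the dictionary passed in holds the effective values for the request, falling back to the defaults listed under SQLoader Request Parameters when a flag is absent.

```python
def is_recoverable(endpoint: str, request: dict) -> bool:
    """Return True if a crashed task would be picked up and rerun by another service."""
    if endpoint != "load":
        return False  # only asynchronous load requests are covered, not syncLoad
    if not request.get("clustered", True):
        return False  # the clustered flag must be true
    load_type = request.get("loadTypeName", "full")
    truncate = request.get("truncate", False)
    drop = request.get("drop", False)
    if load_type == "full" and not truncate and not drop:
        return False  # not rerun, to prevent duplicated rows in the target table
    return True


print(is_recoverable("load", {"loadTypeName": "full", "truncate": True}))
print(is_recoverable("load", {"loadTypeName": "full"}))
```

Note that a full load that leaves both truncate and drop at false is not rerun, so such loads need separate monitoring.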

Log Rotation

Log rotation is based on time and size. At midnight (00:00) or when the file reaches 100MB, rotation occurs. Rotation means the log file SQLoader_service.log is renamed to SQLoader_service_%d_%i.log (%d=date, %i=rotation number), and a new, empty SQLoader_service.log file is created for the SQLoader service to continue writing to.

Automatic Log Cleanup

The maximum number of archived log files to keep is set to 360, so Logback will retain the latest 360 log files in the logs directory. Additionally, the total file size in the directory is limited to 50 GB. If the total size of archived log files exceeds this limit, older log files will be deleted to make room for new ones.
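
The effect of the two limits can be illustrated with a short simulation. The following Python sketch is not Logback's implementation. It only models the policy as described above, with a hypothetical function name, and it assumes that 50 GB means 50 * 1024^3 bytes and that archives are listed oldest first.

```python
def prune_archives(archives, max_files=360, max_total_bytes=50 * 1024**3):
    """Apply the retention limits to archived log files.

    archives: list of (name, size_in_bytes) tuples, ordered oldest first.
    Returns (kept, deleted), both ordered oldest first.
    """
    # Limit 1: keep only the most recent max_files archives.
    kept = list(archives[-max_files:]) if max_files > 0 else []
    deleted = list(archives[:len(archives) - len(kept)])
    # Limit 2: delete the oldest kept archives until the total size fits the cap.
    total = sum(size for _, size in kept)
    while kept and total > max_total_bytes:
        name, size = kept.pop(0)
        deleted.append((name, size))
        total -= size
    return kept, deleted


# Small numbers make the behavior easy to follow: keep at most 3 files, 100 bytes in total.
files = [("a.log", 40), ("b.log", 40), ("c.log", 40), ("d.log", 40)]
kept, deleted = prune_archives(files, max_files=3, max_total_bytes=100)
print([name for name, _ in kept], [name for name, _ in deleted])
```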

SQLoader Request Parameters

Mandatory flags must be configured using HTTP flags or the properties file.

HTTP Parameter	State	Default	Description
`clustered`	Optional	`true`	This flag is relevant only for `load` requests (`async`), not for `syncLoad`. Note that this flag affects high availability. When set to `true`: the request is directed to one of the available instances within a cluster, often through a load balancer. When set to `false`: the request goes directly to the specified host without load balancing.
`configFile`	Optional	`sqload-jdbc.properties`	Defines the path to the configuration file you wish to use. If not specified, the service will use the default path provided upon service deployment.
`connectionStringSqream`	Mandatory		JDBC connection string to SQreamDB
`connectionStringSource`	Mandatory		JDBC connection string to source database
`connectionStringCatalog`	Mandatory		JDBC connection string to catalog database
`cdcCatalogTable`	Optional		Part of the schema within the catalog database. Holds all inc/cdc tables and their settings
`cdcTrackingTable`	Optional		Part of the schema within the catalog database. Holds the last tracking value for every inc/cdc table from `cdcCatalogTable` table
`cdcPrimaryKeyTable`	Optional		Part of the schema within the catalog database. Holds all primary keys for every inc/cdc table from `cdcCatalogTable` table
`loadSummaryTable`	Mandatory		Part of the schema within the catalog database. Pre-aggregated table that stores summarized loads which can help monitoring and analyzing load
`batchSize`	Optional	`10000`	The number of records to be inserted into SQreamDB at once. Please note that the configured batch size may impact chunk sizes.
`caseSensitive`	Optional	`false`	If `true`, keeps table name uppercase and lowercase characters when table is created in SQreamDB
`checkCdcChain`	Optional	`false`	Check CDC chain between tracking table and source table
`chunkSize`	Optional	`0`	The number of records read at once from the source database
`columnListFilePath`	Optional		The name of the file that contains all column names. Columns must be separated using `\n`. Expected file type is `.txt`
`columns`	Optional	All columns	The name or names of columns to be loaded into SQreamDB (“col1,col2, …”). For column names containing uppercase characters, maintain the uppercase format, avoid using double quotes or apostrophes, and ensure that the `caseSensitive` parameter is set to true
`count`	Optional	`true`	Defines whether or not table rows will be counted before being loaded into SQreamDB
`cdcDelete`	Optional	`true`	Defines whether or not loading using Change Data Capture (CDC) includes deleted rows
`drop`	Optional	`false`	Defines whether or not a new target table in SQreamDB is created. If `false`, you will need to configure a target table name using the `sqreamTable` parameter
`fetchSize`	Optional	`100000`	The number of records to be read at once from source database.
`filter`	Optional	`1=1`	Defines an SQL condition; only records that satisfy the condition are loaded
`h, help`	Optional		Displays the help menu and exits
`limit`	Optional	`0` (no limit)	Limits the number of rows to be loaded
`loadDttm`	Optional	`true`	Add an additional `loadDttm` column that defines the time and date of loading
`loadDttmColumnName`	Optional	`sq_load_dttm`	Specifies the name of the additional column that records the time and date of loading. This parameter works in conjunction with the `loadDttm` parameter. If `loadDttm` is enabled, the column defined by `loadDttmColumnName` will be added to the target table.
`loadTypeName`	Optional	`full`	Defines a loading type that affects the table that is created in SQreamDB. Options are `full`, `cdc`, or `inc`. Please note that `cdc` is supported for Oracle only and that `inc` is supported for Oracle and Postgresql
`lockCheck`	Optional	`true`	Defines whether or not SQLoader checks if the source table is locked before the loading starts
`lockTable`	Optional	`true`	Defines whether or not SQLoader locks the target table before the loading starts
`partitionName`	Optional		Specifies the table partition to load. If configured, `partitionName` ensures that data is loaded according to the specified partition. You may configure the `threadCount` parameter for parallel loading of your table partitions. If you do, please ensure that the number of threads does not exceed the number of partitions.
`port`	Optional	`6060`	The SQLoader service port
`rowid`	Optional	`false`	Defines whether or not SQLoader will get row IDs from Oracle tables
`sourceDatabaseName`	Optional	`ORCL`	Defines the source database name. It does not modify the database connection string but impacts the storage and retrieval of data within catalog tables.
`splitByColumn`	Optional		Column name for split (required for multi-thread loads)
`sourceSchema`	Mandatory		Source schema name to load data from
`sourceTable`	Mandatory		Source table name to load data from
`sqreamSchema`	Optional	The schema name defined in the `sourceSchema` flag	Target schema name to load data into
`sqreamTable`	Optional	The table name defined in the `sourceTable` flag	Target table name to load data into
`threadCount`	Optional	`1`	Number of threads to use for loading. Using multiple threads can significantly improve the loading performance, especially when dealing with columns that have metadata statistics (e.g., min/max values). SQLoader will automatically divide the data into batches based on the specified thread number, allowing for parallel processing. You may use `threadCount` both for tables that are partitioned and tables that are not. See Sizing Guidelines.
`truncate`	Optional	`false`	Truncate target table before loading
`typeMappingPath`	Optional	`config/sqream-mapping.json`	A mapping file that converts source data types into SQreamDB data types.
`useDbmsLob`	Optional	`true`	Defines whether or not SQLoader uses `dbms_lob_substr` function for `CLOB` and `BLOB` data types
`usePartitions`	Optional	`true`	Defines whether or not SQLoader uses partitions in `SELECT` statements
`validateSourceTable`	Optional	`true`	Allows control over the validation of table existence during the load.
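
Since mandatory flags may come from either the HTTP request or the properties file, a client can check both before sending a load. The following Python sketch is a hypothetical pre-flight check and not something the service exposes. It assumes that values in the request take precedence over values in the properties file, and it uses the target name defaults listed in the table above.

```python
# Flags marked Mandatory in the table above.
MANDATORY = (
    "connectionStringSqream",
    "connectionStringSource",
    "connectionStringCatalog",
    "loadSummaryTable",
    "sourceSchema",
    "sourceTable",
)


def missing_mandatory(request: dict, properties: dict) -> list[str]:
    """Return the mandatory flags set in neither the request nor the properties."""
    merged = {**properties, **request}  # request values win (assumption)
    return [name for name in MANDATORY if not merged.get(name)]


def target_names(request: dict) -> tuple[str, str]:
    """Resolve the target schema and table, defaulting to the source names."""
    schema = request.get("sqreamSchema") or request["sourceSchema"]
    table = request.get("sqreamTable") or request["sourceTable"]
    return schema, table


properties = {
    "connectionStringSqream": "jdbc:Sqream://<HostIp:port>/<database_name>",
    "connectionStringSource": "jdbc:postgresql://<HostIp:port>/<database_name>",
    "connectionStringCatalog": "jdbc:Sqream://<HostIp:port>/<database_name>",
    "loadSummaryTable": "public.SQLOAD_SUMMARY",
}
print(missing_mandatory({"sourceTable": "MY_TABLE"}, properties))
```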

Using the `loadTypeName` Parameter

The loadTypeName parameter defines which records are loaded: the entire source table, or only the changes (inserts, updates, and deletes) made since the last load. Loading only changes supports data synchronization and auditing.

Loading Type	Parameter Option	Description
Full Table	`full`	The entire data of the source table is loaded into SQreamDB
Change Data Capture (CDC)	`cdc`	Only changes made to the source table data since last load will be loaded into SQreamDB. Changes include transactions of `INSERT`, `UPDATE`, and `DELETE` statements. SQLoader recognizes tables by table name and metadata. Supported for Oracle only
Incremental	`inc`	Only changes made to the source table data since last load will be loaded into SQreamDB. Changes include `INSERT` transactions only. SQLoader recognizes the table by table name and metadata. Supported for Oracle and Postgresql
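
The support matrix in the table above can be encoded as a simple lookup, for example to reject an unsupported combination before a request is sent. This Python sketch is illustrative, the function name is hypothetical, and the lowercase DBMS names are a convention chosen for the sketch rather than identifiers used by SQLoader.

```python
# Source DBMSs that support each non-full load type, per the table above.
SUPPORTED_SOURCES = {
    "cdc": {"oracle"},
    "inc": {"oracle", "postgresql"},
}


def is_load_type_supported(load_type: str, source_dbms: str) -> bool:
    """Return True if the load type can be used with the given source DBMS."""
    load_type = load_type.lower()
    if load_type == "full":
        return True  # full loads are available for every supported source
    if load_type not in SUPPORTED_SOURCES:
        raise ValueError(f"unknown loadTypeName: {load_type}")
    return source_dbms.lower() in SUPPORTED_SOURCES[load_type]


print(is_load_type_supported("inc", "Postgresql"))
print(is_load_type_supported("cdc", "Postgresql"))
```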

Using the SQLoader Service Web Interface

The SQLoader Admin Server is a web-based administration tool specifically designed to manage and monitor the SQLoader service. It provides a user-friendly interface for monitoring data loading processes, managing configurations, and troubleshooting issues related to data loading into SQreamDB.

SQLoader Service Web Interface Features

Monitor Services:
- Health Checks: Monitor the health status of services to ensure they are functioning properly.
- Metrics: Monitor real-time performance metrics, including CPU usage, memory usage, and response times.
- Logging: View logs generated by services for troubleshooting and debugging purposes, and dynamically modify log levels during runtime to adjust verbosity for troubleshooting or performance monitoring.
Manage Active Load Requests:
- View a list of currently active data loading requests, including their status, progress, and relevant metadata.

Creating Summary and Catalog Tables 

The summary and catalog tables are pre-aggregated tables that store summarized or aggregated data.

Creating a Summary Table

The summary table is part of the schema within the database catalog.

The following summary table DDL uses Oracle syntax.

Note

If you are migrating from SQLoader as a process to SQLoader as a service, as described on this page, it is highly recommended that you add the following columns to your existing summary table instead of re-creating it.

REQUEST_ID TEXT (200 BYTE) VISIBLE DEFAULT NULL
CLIENT_IP TEXT (200 BYTE) VISIBLE DEFAULT NULL
REQUESTED_HOST TEXT (200 BYTE) VISIBLE DEFAULT NULL
ACQUIRED_HOST TEXT (200 BYTE) VISIBLE DEFAULT NULL

CREATE TABLE public.SQLOAD_SUMMARY (
  DB_NAME TEXT(200 BYTE) VISIBLE,
  SCHEMA_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME_FULL TEXT(200 BYTE) VISIBLE,
  LOAD_TYPE TEXT(200 BYTE) VISIBLE,
  UPDATED_DTTM_FROM DATE VISIBLE,
  UPDATED_DTTM_TO DATE VISIBLE,
  LAST_VAL_INT NUMBER(22,0) VISIBLE,
  LAST_VAL_TS TIMESTAMP(6) VISIBLE,
  START_TIME TIMESTAMP(6) VISIBLE,
  FINISH_TIME TIMESTAMP(6) VISIBLE,
  ELAPSED_SEC NUMBER VISIBLE,
  ROW_COUNT NUMBER VISIBLE,
  SQL_FILTER TEXT(200 BYTE) VISIBLE,
  PARTITION TEXT(200 BYTE) VISIBLE,
  STMT_TYPE TEXT(200 BYTE) VISIBLE,
  STATUS TEXT(200 BYTE) VISIBLE,
  LOG_FILE TEXT(200 BYTE) VISIBLE,
  DB_URL TEXT(200 BYTE) VISIBLE,
  PARTITION_COUNT NUMBER VISIBLE DEFAULT 0,
  THREAD_COUNT NUMBER VISIBLE DEFAULT 1,
  ELAPSED_MS NUMBER VISIBLE DEFAULT 0,
  STATUS_CODE NUMBER VISIBLE DEFAULT 0,
  ELAPSED_SOURCE_MS NUMBER(38,0) DEFAULT NULL,
  ELAPSED_SOURCE_SEC NUMBER(38,0) DEFAULT NULL,
  ELAPSED_TARGET_MS NUMBER(38,0) DEFAULT NULL,
  ELAPSED_TARGET_SEC NUMBER(38,0) DEFAULT NULL,
  TARGET_DB_URL TEXT (200 BYTE) DEFAULT NULL,
  SQLOADER_VERSION TEXT (200 BYTE) DEFAULT NULL,
  CLIENT_IP TEXT (200 BYTE) DEFAULT NULL,
  REQUESTED_HOST TEXT (200 BYTE) DEFAULT NULL,
  ACQUIRED_HOST TEXT (200 BYTE) DEFAULT NULL,
  REQUEST_ID TEXT (200 BYTE) VISIBLE DEFAULT NULL
);

Creating Catalog Tables

CDC (Change Data Capture) and Incremental tables are database tables that record changes made to data in order to track inserts, updates, and deletes for data synchronization and auditing purposes.

See Using the loadTypeName Parameter

Change Data Capture (CDC) tables are supported for Oracle only. Incremental tables are supported for Oracle and Postgresql.

CREATE TABLE public.CDC_TABLES (
  DB_NAME TEXT(200 BYTE) VISIBLE,
  SCHEMA_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME_FULL TEXT(200 BYTE) VISIBLE,
  TABLE_NAME_CDC TEXT(200 BYTE) VISIBLE,
  INC_COLUMN_NAME TEXT(200 BYTE) VISIBLE,
  INC_COLUMN_TYPE TEXT(200 BYTE) VISIBLE,
  LOAD_TYPE TEXT(200 BYTE) VISIBLE,
  FREQ_TYPE TEXT(200 BYTE) VISIBLE,
  FREQ_INTERVAL NUMBER(22,0) VISIBLE,
  IS_ACTIVE NUMBER VISIBLE DEFAULT 0,
  STATUS_LOAD NUMBER VISIBLE DEFAULT 0,
  INC_GAP_VALUE NUMBER VISIBLE DEFAULT 0
);

CREATE TABLE public.CDC_TRACKING (
  DB_NAME TEXT(200 BYTE) VISIBLE,
  SCHEMA_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME_FULL TEXT(200 BYTE) VISIBLE,
  LAST_UPDATED_DTTM DATE VISIBLE,
  LAST_VAL_INT NUMBER(22,0) VISIBLE DEFAULT 0,
  LAST_VAL_TS TIMESTAMP(6) VISIBLE,
  LAST_VAL_DT DATE VISIBLE
);

CREATE TABLE public.CDC_TABLE_PRIMARY_KEYS (
  DB_NAME TEXT(200 BYTE) VISIBLE,
  SCHEMA_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME TEXT(200 BYTE) VISIBLE,
  TABLE_NAME_FULL TEXT(200 BYTE) VISIBLE,
  CONSTRAINT_NAME TEXT(200 BYTE) VISIBLE,
  COLUMN_NAME TEXT(200 BYTE) VISIBLE,
  IS_NULLABLE NUMBER VISIBLE DEFAULT 0
);

Data Type Mapping 

Automatic Mapping 

The SQLoader automatically maps data types used in Greenplum, Microsoft SQL Server, Oracle, Postgresql, Sybase, SAP HANA, and Teradata tables that are loaded into SQreamDB.

Greenplum

Greenplum Type	SQreamDB Type
`CHAR`, `VARCHAR`, `CHARACTER`	`TEXT`
`TEXT`	`TEXT`
`INT`, `SMALLINT`, `BIGINT`, `INT2`, `INT4`, `INT8`	`BIGINT`
`DATETIME`, `TIMESTAMP`	`DATETIME`
`DATE`	`DATE`
`BIT`, `BOOL`	`BOOL`
`DECIMAL`, `NUMERIC`	`NUMERIC`
`FLOAT`, `DOUBLE`	`DOUBLE`
`REAL`, `FLOAT4`	`REAL`

Microsoft SQL Server

Microsoft SQL Server Type	SQreamDB Type
`CHAR`, `NCHAR`, `VARCHAR`, `NVARCHAR`, `NVARCHAR2`, `CHARACTER`, `TEXT`, `NTEXT`	`TEXT`
`BIGINT`, `INT`, `SMALLINT`, `TINYINT`	`BIGINT`
`DATETIME`, `TIMESTAMP`, `SMALLDATETIME`, `DATETIMEOFFSET`, `DATETIME2`	`DATETIME`
`DATE`	`DATE`
`BIT`	`BOOL`
`DECIMAL`, `NUMERIC`	`NUMERIC`
`FLOAT`, `DOUBLE`	`DOUBLE`
`REAL`	`REAL`
`VARBINARY`	`TEXT`

Oracle

Oracle Type	SQreamDB Type
`BIGINT`, `INT`, `SMALLINT`, `INTEGER`	`BIGINT`
`CHAR`, `NCHAR`, `VARCHAR`, `VARCHAR2`, `NVARCHAR`, `CHARACTER`	`TEXT`
`DATE`, `DATETIME`	`DATETIME`
`TIMESTAMP`	`DATETIME`
`DATE`	`DATE`
`BOOLEAN`	`BOOL`
`NUMERIC`	`NUMERIC`
`FLOAT`, `DOUBLE`	`DOUBLE`
`CLOB`	`TEXT`
`BLOB`	`TEXT`
`RAW`	`TEXT`

Postgresql

Postgresql Type	SQreamDB Type
`CHAR`, `VARCHAR`, `CHARACTER`	`TEXT`
`TEXT`	`TEXT`
`INT`, `SMALLINT`, `BIGINT`, `INT2`, `INT4`, `INT8`	`BIGINT`
`DATETIME`, `TIMESTAMP`	`DATETIME`
`DATE`	`DATE`
`BIT`, `BOOL`	`BOOL`
`DECIMAL`, `NUMERIC`	`NUMERIC`
`FLOAT`, `DOUBLE`	`DOUBLE`
`REAL`, `FLOAT4`	`REAL`

SAP HANA

SAP HANA Type	SQreamDB Type
`BIGINT`, `INT`, `SMALLINT`, `INTEGER`, `TINYINT`	`BIGINT`
`CHAR`, `VARCHAR`, `NVARCHAR`, `TEXT`, `VARCHAR2`, `NVARCHAR2`	`TEXT`
`DATETIME`, `TIMESTAMP`, `SECONDDATE`	`DATETIME`
`DATE`	`DATE`
`BOOLEAN`	`TEXT`
`DECIMAL`, `SMALLDECIMAL`, `BIGDECIMAL`	`NUMERIC`
`DOUBLE`, `REAL`	`FLOAT`
`TEXT`	`TEXT`
`BIGINT`	`BIGINT`
`INT`	`INT`
`SMALLINT`	`SMALLINT`
`TINYINT`	`TINYINT`
`DATETIME`	`DATETIME`
`DATE`	`DATE`
`BOOL`	`BOOL`
`NUMERIC`	`NUMERIC`
`DOUBLE`	`DOUBLE`
`FLOAT`	`FLOAT`
`REAL`	`REAL`

Sybase

Sybase Type	SQreamDB Type
`CHAR`, `VARCHAR`, `LONG VARCHAR`, `CHARACTER`, `TEXT`	`TEXT`
`TINYINT`	`TINYINT`
`SMALLINT`	`SMALLINT`
`INT`, `INTEGER`	`INT`
`BIGINT`	`BIGINT`
`DECIMAL`, `NUMERIC`	`NUMERIC`
`NUMERIC(126,38)`	`NUMERIC(38,10)`
`FLOAT`, `DOUBLE`	`DOUBLE`
`DATE`	`DATE`
`DATETIME`, `TIMESTAMP`, `TIME`	`DATETIME`
`BIT`	`BOOL`
`VARBINARY`, `BINARY`, `LONG BINARY`	`TEXT`

Teradata

Teradata Type	SQreamDB Type
`F`	`DOUBLE`
`N`, `D`	`NUMERIC`
`CO`	`TEXT`
`BO`	`TEXT`
`A1`, `AN`, `AT`, `BF`, `BV`, `CF`, `CV`, `JN`, `PD`, `PM`, `PS`, `PT`, `PZ`, `SZ`, `TZ`	`TEXT`
`I`, `I4`, `I(4)`	`INT`
`I2`, `I(2)`	`SMALLINT`
`I1`, `I(1)`	`TINYINT`
`DH`, `DM`, `DS`, `DY`, `HM`, `HS`, `HR`, `I8`, `MO`, `MS`, `MI`, `SC`, `YM`, `YR`	`BIGINT`
`TS`, `DATETIME`	`DATETIME`
`DA`	`DATE`
`BIT`	`BOOL`
`REAL`, `DOUBLE`	`DOUBLE`

Manually Adjusting Mapping 

You can adjust the mapping process according to your specific needs, using any of the following methods.

`names` Method

To map one or more columns in your table to a specific data type, duplicate the code block that maps to the SQreamDB data type you want and include the names parameter in the duplicated block. SQLoader maps the specified columns to the specified SQreamDB data type. After the specified columns are mapped, SQLoader continues to search the remaining blocks to determine how to convert other data types.

In this example, column1, column2, and column3 are mapped to BIGINT and the Oracle data types BIGINT, INT, SMALLINT, INTEGER are also mapped to BIGINT.

{
  "oracle": [
    {
      "names": ["column1", "column2", "column3"],
      "sqream": "bigint",
      "java": "int",
      "length": false
    },
    {
      "type": ["bigint","int","smallint","integer"],
      "sqream": "bigint",
      "java": "int",
      "length": false
    }
  ]
}
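
To make the lookup order concrete, the following Python sketch resolves a column's SQreamDB type from a mapping like the one in the example. It is a guess at the behavior described above and not SQLoader's actual code: the function name is hypothetical, and both the precedence of names over type and the case-insensitive type match are assumptions.

```python
import json

MAPPING_JSON = """
{
  "oracle": [
    {"names": ["column1", "column2", "column3"],
     "sqream": "bigint", "java": "int", "length": false},
    {"type": ["bigint", "int", "smallint", "integer"],
     "sqream": "bigint", "java": "int", "length": false}
  ]
}
"""


def resolve_sqream_type(mapping: dict, dbms: str, column_name: str, source_type: str):
    """Return the SQreamDB type for a column, or None if no rule matches."""
    rules = mapping.get(dbms, [])
    # First pass: rules that list the column explicitly by name.
    for rule in rules:
        if column_name in rule.get("names", []):
            return rule["sqream"]
    # Second pass: rules that match the source data type.
    for rule in rules:
        if source_type.lower() in rule.get("type", []):
            return rule["sqream"]
    return None


mapping = json.loads(MAPPING_JSON)
print(resolve_sqream_type(mapping, "oracle", "column2", "VARCHAR2"))
print(resolve_sqream_type(mapping, "oracle", "other_column", "INTEGER"))
```

In this sketch, a column listed under names is mapped by name even when its source type would otherwise match a different rule.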
