Migrating from Oracle¶
This guide covers actions required for migrating from Oracle to SQream DB with CSV files.
In this topic:
- 1. Preparing the tools and login information
- 2. Export the desired schema
- 3. Convert the Oracle dump to standard SQL
- 4. Figure out the database structures
- 5. Create the tables in SQream DB
- 6. Export tables to CSVs
- 7. Place CSVs where SQream DB workers can access
- 8. Bulk load the CSVs
- 9. Rewrite Oracle queries
Use the Data Pump Export utility to export the database schema.
The format for using the Export utility is
expdp <user>/<password> DIRECTORY=<directory> DUMPFILE=<dump file> CONTENT=metadata_only NOLOGFILE
The resulting Oracle-only schema is stored in a dump file.
$ expdp rhendricks/secretpassword DIRECTORY=dpumpdir DUMPFILE=tables.dmp CONTENT=metadata_only NOLOGFILE
Oracle’s Data Pump Import utility will help us convert the dump from the previous step to standard SQL.
The format for using the Import utility is
impdp <user>/<password> DIRECTORY=<directory> DUMPFILE=<dump file> SQLFILE=<sql file> TRANSFORM=SEGMENT_ATTRIBUTES:N:table PARTITION_OPTIONS=MERGE
TRANSFORM=SEGMENT_ATTRIBUTES:N:tableexcludes segment attributes (both STORAGE and TABLESPACE) from the tables
PARTITON_OPTIONS=MERGEcombines all partitions and subpartitions into one table.
Using the SQL file created in the previous step, write CREATE TABLE statements to match the schemas of the tables.
Trim unsupported primary keys, indexes, constraints, and other unsupported Oracle attributes.
Refer to the table below to match the Oracle source data type to a new SQream DB type:
|Oracle Data type||Precision||SQream DB data type|
||p <= 63||
||p > 63||
||p < 5||
||p < 9||
||p < 19||
||p >= 20||
||f > 0||
Read more about supported data types in SQream DB.
After rewriting the table strucutres, create them in SQream DB.
HR.EMPLOYEES sample table:
CREATE TABLE employees ( employee_id NUMBER(6) , first_name VARCHAR2(20) , last_name VARCHAR2(25) CONSTRAINT emp_last_name_nn NOT NULL , email VARCHAR2(25) CONSTRAINT emp_email_nn NOT NULL , phone_number VARCHAR2(20) , hire_date DATE CONSTRAINT emp_hire_date_nn NOT NULL , job_id VARCHAR2(10) CONSTRAINT emp_job_nn NOT NULL , salary NUMBER(8,2) , commission_pct NUMBER(2,2) , manager_id NUMBER(6) , department_id NUMBER(4) , CONSTRAINT emp_salary_min CHECK (salary > 0) , CONSTRAINT emp_email_uk UNIQUE (email) ) ; CREATE UNIQUE INDEX emp_emp_id_pk ON employees (employee_id) ; ALTER TABLE employees ADD ( CONSTRAINT emp_emp_id_pk PRIMARY KEY (employee_id) , CONSTRAINT emp_dept_fk FOREIGN KEY (department_id) REFERENCES departments , CONSTRAINT emp_job_fk FOREIGN KEY (job_id) REFERENCES jobs (job_id) , CONSTRAINT emp_manager_fk FOREIGN KEY (manager_id) REFERENCES employees ) ;
This table rewritten for SQream DB would be created like this:
CREATE TABLE employees ( employee_id SMALLINT NOT NULL, first_name VARCHAR(20), last_name VARCHAR(25) NOT NULL, email VARCHAR(20) NOT NULL, phone_number VARCHAR(20), hire_date DATE NOT NULL, job_id VARCHAR(10) NOT NULL, salary FLOAT, commission_pct REAL, manager_id SMALLINT, department_id TINYINT );
Exporting CSVs from Oracle servers is not a trivial task.
Options for exporting to CSVs
Here’s a sample SQL*Plus script that will export PSVs in a format that SQream DB can read:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
SET ECHO OFF SET TERMOUT OFF SET FEEDBACK OFF SET WRAP OFF SET PAGESIZE 0 SET LINESIZE 10000 SET ARRAYSIZE 5000 SET NUMW 30 SET TRIMSPOOL ON SET UNDERLINE OFF SET RECSEP OFF SET VERIFY OFF SET COLSEP '|' SPOOL '&1' ALTER SESSION SET nls_date_format = 'YYYY-MM-DD HH24:MI:SS'; SELECT * FROM &1; SPOOL OFF
Enter SQL*Plus and export tables one-by-one interactively:
$ sqlplus rhendricks/secretpassword @spool employees @spool jobs [...] EXIT
Each table is exported as a data list file (
You can use stored procedures if you have them set-up.
Examples of stored procedures for generating CSVs <https://asktom.oracle.com/pls/apex/asktom.search?tag=automate-the-generation-of-sql-query-output-to-csv>`_ can be found in the Ask The Oracle Mentors forums.
Files should be a valid CSV. By default, SQream DB’s CSV parser can handle RFC 4180 standard CSVs , but can also be modified to support non-standard CSVs (with multi-character delimiters, unquoted fields, etc).
Files are UTF-8 or ASCII encoded
Field delimiter is an ASCII character or characters
Record delimiter, also known as a new line separator, is a Unix-style newline (
\n), DOS-style newline (
\r\n), or Mac style newline (
Fields are optionally enclosed by double-quotes, or mandatory quoted if they contain one of the following characters:
- The record delimiter or field delimiter
- A double quote character
- A newline
If a field is quoted, any double quote that appears must be double-quoted (similar to the string literals quoting rules. For example, to encode
What are "birds"?, the field should appear as
"What are ""birds""?".
Other modes of escaping are not supported (e.g.
1,"What are \"birds\"?"is not a valid way of escaping CSV values).
NULLvalues can be marked in two ways in the CSV:
- An explicit null marker. For example,
- An empty field delimited by the field delimiter. For example,
If a text field is quoted but contains no content (
"") it is considered an empty text field. It is not considered
- An explicit null marker. For example,
During data load, the COPY FROM command can run on any worker (unless explicitly speficied with the Workload manager). It is important that every node has the same view of the storage being used - meaning, every SQream DB worker should have access to the files.
- For files hosted on NFS, ensure that the mount is accessible from all servers.
- For HDFS, ensure that SQream DB servers can access the HDFS name node with the correct user-id
- For S3, ensure network access to the S3 endpoint
Issue the COPY FROM commands to SQream DB to insert a table from the CSVs created.
COPY FROM command for each table exported from Oracle.
SQream DB supports a large subset of ANSI SQL.
You will have to refactor much of Oracle’s SQL and functions that often are not ANSI SQL.
We recommend the following resources:
- SQL Feature Checklist - to understand SQream DB’s SQL feature support.
- Optimization and best practices - to understand best practices for SQL queries and schema design.
- Common table expressions (CTEs) - CTEs can be used to rewrite complex queries in a compact form.
- Concurrency and locks - to understand the difference between Oracle’s transactions and SQream DB’s concurrency.
- Identity (sequence) - SQream DB supports sequences, but no triggers for auto-increment.
- Joins - SQream DB supports ANSI join syntax. Oracle uses the
+operator which SQream DB doesn’t support.
- Saved queries - Saved queries can be used to emulate some stored procedures.
- Subqueries - SQream DB supports a limited set of subqueries.
- Python UDF (user-defined functions) - SQream DB supports Python User Defined Functions which can be used to run complex operations in-line.
- Views - SQream DB supports logical views, but does not support materialized views.
- Window Functions - SQream DB supports a wide array of window functions.