COPY TO

The COPY TO statement is used for exporting data from a SQream database table or for exporting query results to a file on the filesystem. You may wish to export data from SQream for any of the following reasons:

  • To use data in external tables. See Working with External Data.

  • To share data with other clients or consumers using different systems.

  • To copy data into another SQream cluster.

In general, COPY moves data between filesystem files and SQream DB tables. If you wish to copy data from a file to a table, see COPY FROM.

Syntax

The following is the correct syntax for using the COPY TO statement:

 COPY { [schema_name].table_name [ ( column_name [, ... ] ) ] | query }
   TO [FOREIGN DATA] WRAPPER fdw_name

     OPTIONS
     (
        [ copy_to_option [, ...] ]
     )
     ;

 fdw_name ::= csv_fdw | parquet_fdw | orc_fdw

 schema_name ::= identifer

 table_name ::= identifier

 copy_to_option ::=

    LOCATION = { filename | S3 URI | HDFS URI }

    | DELIMITER = '{ delimiter }'

    | RECORD_DELIMITER = '{ record delimiter }'

    | HEADER = { true | false }

    | AWS_ID = '{ AWS ID }'

    | AWS_SECRET = '{ AWS Secret }'

    | MAX_FILE_SIZE = '{ size_in_bytes }'

    | ENFORCE_SINGLE_FILE = { true | false }


delimiter ::= string

record delimiter ::= string

AWS ID ::= string

AWS Secret ::= string

Note

In Studio, you must write the parameters using lower case letters. Using upper case letters generates an error.

Elements

The following table shows the COPY_TO elements:

Parameter

Description

[schema_name].table_name

Name of the table to be exported

query

An SQL query that returns a table result, or a table name

fdw_name

The name of the Foreign Data Wrapper to use. Supported FDWs are csv_fdw, orc_fdw, avro_fdw or parquet_fdw.

LOCATION

A path on the local filesystem, S3, or HDFS URI. For example, /tmp/foo.csv, s3://my-bucket/foo.csv, or hdfs://my-namenode:8020/foo.csv. The local path must be an absolute path that SQream DB can access.

HEADER

The CSV file will contain a header line with the names of each column in the file. This option is allowed only when using CSV format.

DELIMITER

Specifies the character or string that separates fields (columns) within each row of the file. The default is a comma character (,). This option is allowed only when using CSV format.

RECORD_DELIMITER

Specifies the character or string that separates records in a data set. This option is allowed only when using CSV format.

AWS_ID, AWS_SECRET

Specifies the authentication details for secured S3 buckets

MAX_FILE_SIZE

Sets the maximum file size (bytes). Default value: 16*2^20 (16MB).

ENFORCE_SINGLE_FILE

Enforces the maximum file size (bytes). Permitted values: true - creates one file of unlimited size, false - permits creating several files together limited by the MAX_FILE_SIZE. When set to true, the single file size is not limited by the MAX_FILE_SIZE setting. When set to false, the combined file sizes cannot exceed the MAX_FILE_SIZE. Default value: TRUE.

Usage Notes

Supported Field Delimiters

Printable ASCII Characters

Any printable ASCII character can be used as a delimiter without special syntax. The default CSV field delimiter is a comma (,).

The following table shows the supported printable ASCII characters:

Character

Description

ASCII

Octal

Hex

Binary

HTML Code

HTML Name

(Space)

Space

32

40

20

100000

 

!

Exclamation Mark

33

41

21

100001

!

!

#

Hash or Number

35

43

23

100011

#

#

$

Dollar Sign

36

44

24

100100

$

$

%

Percentage

37

45

25

100101

%

%

&

Ampersand

38

46

26

100110

&

&

(

Left Parenthesis

40

50

28

101000

(

(

)

Right Parenthesis

41

51

29

101001

)

)

*

Asterisk

42

52

2A

101010

*

*

+

Plus Sign

43

53

2B

101011

+

+

,

Comma

44

54

2C

101100

,

,

/

Slash

47

57

2F

101111

/

/

;

Semicolon

59

73

3B

111011

&#59;

;

<

Less Than

60

74

3C

111100

&#60;

&lt;

=

Equals Sign

61

75

3D

111101

&#61;

&equals;

>

Greater Than

62

76

3E

111110

&#62;

&gt;

?

Question Mark

63

77

3F

111111

&#63;

&quest;

@

At Sign

64

100

40

1000000

&#64;

&commat;

[

Left Square Bracket

91

133

5B

1011011

&#91;

&lsqb;

\

Backslash

92

134

5C

1011100

&#92;

&bsol;

]

Right Square Bracket

93

135

5D

1011101

&#93;

&rsqb;

^

Caret or Circumflex

94

136

5E

1011110

&#94;

&hat;

_

Underscore

95

137

5F

1011111

&#95;

&lowbar;

`

Grave Accent

96

140

60

1100000

`

&grave;

{

Left Curly Bracket

123

173

7B

1111011

&#123;

&lcub;

|

Vertical Bar

124

174

7C

1111100

&#124;

&verbar;

}

Right Curly Bracket

125

175

7D

1111101

&#125;

&rcub;

~

Tilde

126

176

7E

1111110

&#126;

&tilde;

Non-Printable ASCII Characters

The following table shows the supported non-printable ASCII characters:

Character

Description

Octal

ASCII

Hex

Binary

HTML Code

HTML Name

NUL

Null

0

0

0

0

&#0;

SOH

Start of Heading

1

1

1

1

&#1;

STX

Start of Text

2

2

2

10

&#2;

ETX

End of Text

3

3

3

11

&#3;

EOT

End of Transmission

4

4

4

100

&#4;

ENQ

Enquiry

5

5

5

101

&#5;

ACK

Acknowledge

6

6

6

110

&#6;

BEL

Bell

7

7

7

111

&#7;

BS

Backspace

10

8

8

1000

&#8;

HT

Horizontal Tab

11

9

9

1001

&#9;

VT

Vertical Tab

13

11

0B

1011

&#11;

FF

NP Form Feed, New Page

14

12

0C

1100

&#12;

SO

Shift Out

16

14

0E

1110

&#14;

SI

Shift In

17

15

0F

1111

&#15;

DLE

Data Link Escape

20

16

10

10000

&#16;

DC1

Device Control 1

21

17

11

10001

&#17;

DC2

Device Control 2

22

18

12

10010

&#18;

DC3

Device Control 3

23

19

13

10011

&#19;

DC4

Device Control 4

24

20

14

10100

&#20;

NAK

Negative Acknowledge

25

21

15

10101

&#21;

SYN

Synchronous Idle

26

22

16

10110

&#22;

ETB

End of Transmission Block

27

23

17

10111

&#23;

CAN

Cancel

30

24

18

11000

&#24;

EM

End of Medium

31

25

19

11001

&#25;

SUB

Substitute

32

26

1A

11010

&#26;

ESC

Escape

33

27

1B

11011

&#27;

FS

File Separator

34

28

1C

11100

&#28;

GS

Group Separator

35

29

1D

11101

&#29;

RS

Record Separator

36

30

1E

11110

&#30;

US

Unit Separator

37

31

1F

11111

&#31;

DEL

Delete

177

127

7F

1111111

&#127;

A tab can be specified by escaping it, for example \t. Other non-printable characters can be specified using their octal representations, by using the E'\000' format, where 000 is the octal value of the character.

For example, ASCII character 15, known as “shift in”, can be specified using E'\017'.

Note

Delimiters are only applicable to the CSV file format.

Unsupported ASCII Field Delimiters

The following table shows the unsupported ASCII field delimiters:

ASCII

Character

Description

Octal

Hex

Binary

HTML Code

HTML Name

10

LF

NL Line Feed, New Line

12

0A

1010

&#10;

13

CR

Carriage Return

15

0D

1101

&#13;

34

Double Quote

42

22

100010

"

&quot;

45

-

Minus Sign

55

2D

101101

-

&minus;

46

.

Period

56

2E

101110

.

&period;

48

0

Zero

60

30

110000

&#48;

49

1

Number One

61

31

110001

&#49;

50

2

Number Two

62

32

110010

&#50;

51

3

Number Three

63

33

110011

&#51;

52

4

Number Four

64

34

110100

&#52;

53

5

Number Five

65

35

110101

&#53;

54

6

Number Six

66

36

110110

&#54;

55

7

Number Seven

67

37

110111

&#55;

56

8

Number Eight

70

38

111000

&#56;

57

9

Number Nine

71

39

111001

&#57;

58

:

Colon

72

3A

111010

&#58;

&colon;

65

A

Upper Case Letter A

101

41

1000001

&#65;

66

B

Upper Case Letter B

102

42

1000010

&#66;

67

C

Upper Case Letter C

103

43

1000011

&#67;

68

D

Upper Case Letter D

104

44

1000100

&#68;

69

E

Upper Case Letter E

105

45

1000101

&#69;

70

F

Upper Case Letter F

106

46

1000110

&#70;

71

G

Upper Case Letter G

107

47

1000111

&#71;

72

H

Upper Case Letter H

110

48

1001000

&#72;

73

I

Upper Case Letter I

111

49

1001001

&#73;

74

J

Upper Case Letter J

112

4A

1001010

&#74;

75

K

Upper Case Letter K

113

4B

1001011

&#75;

76

L

Upper Case Letter L

114

4C

1001100

&#76;

77

M

Upper Case Letter M

115

4D

1001101

&#77;

78

N

Upper Case Letter N

116

4E

1001110

&#78;

79

O

Upper Case Letter O

117

4F

1001111

&#79;

80

P

Upper Case Letter P

120

50

1010000

&#80;

81

Q

Upper Case Letter Q

121

51

1010001

&#81;

82

R

Upper Case Letter R

122

52

1010010

&#82;

83

S

Upper Case Letter S

123

53

1010011

&#83;

84

T

Upper Case Letter T

124

54

1010100

&#84;

85

U

Upper Case Letter U

125

55

1010101

&#85;

86

V

Upper Case Letter V

126

56

1010110

&#86;

87

W

Upper Case Letter W

127

57

1010111

&#87;

88

X

Upper Case Letter X

130

58

1011000

&#88;

89

Y

Upper Case Letter Y

131

59

1011001

&#89;

90

Z

Upper Case Letter Z

132

5A

1011010

&#90;

92

\

Backslash

134

5C

01011100

&#92;

97

a

Lower Case Letter a

141

61

1100001

&#97;

98

b

Lower Case Letter b

142

62

1100010

&#98;

99

c

Lower Case Letter c

143

63

1100011

&#99;

100

d

Lower Case Letter d

144

64

1100100

&#100;

101

e

Lower Case Letter e

145

65

1100101

&#101;

102

f

Lower Case Letter f

146

66

1100110

&#102;

103

g

Lower Case Letter g

147

67

1100111

&#103;

104

h

Lower Case Letter h

150

68

1101000

&#104;

105

i

Lower Case Letter i

151

69

1101001

&#105;

106

j

Lower Case Letter j

152

6A

1101010

&#106;

107

k

Lower Case Letter k

153

6B

1101011

&#107;

108

l

Lower Case Letter l

154

6C

1101100

&#108;

109

m

Lower Case Letter m

155

6D

1101101

&#109;

110

n

Lower Case Letter n

156

6E

1101110

&#110;

111

o

Lower Case Letter o

157

6F

1101111

&#111;

112

p

Lower Case Letter p

160

70

1110000

&#112;

113

q

Lower Case Letter q

161

71

1110001

&#113;

114

r

Lower Case Letter r

162

72

1110010

&#114;

115

s

Lower Case Letter s

163

73

1110011

&#115;

116

t

Lower Case Letter t

164

74

1110100

&#116;

117

u

Lower Case Letter u

165

75

1110101

&#117;

118

v

Lower Case Letter v

166

76

1110110

&#118;

119

w

Lower Case Letter w

167

77

1110111

&#119;

120

x

Lower Case Letter x

170

78

1111000

&#120;

121

y

Lower Case Letter y

171

79

1111001

&#121;

122

z

Lower Case Letter z

172

7A

1111010

&#122;

Date Format

The date format in the output CSV is formatted as ISO 8601 (2019-12-31 20:30:55.123), regardless of how it was parsed initially with COPY FROM date parsers.

For more information on the datetime format, see Date.

Examples

Exporting Data From SQream to External File Tables

Parquet

The compression algorithm used for exporting data from SQream to Parquet files is Snappy.

Exporting tables to Parquet files:

COPY nba TO WRAPPER parquet_fdw OPTIONS (LOCATION = '/tmp/nba_export.parquet');

Exporting query results to Parquet files:

COPY (SELECT name FROM nba WHERE salary<1148640) TO WRAPPER parquet_fdw OPTIONS (LOCATION = '/tmp/file.parquet');

ORC

The compression algorithm used for exporting data from SQream to ORC files is ZLIB.

Exporting tables to ORC files:

COPY nba TO WRAPPER orc_fdw OPTIONS (LOCATION = '/tmp/nba_export.orc');

Exporting query results to ORC files:

COPY (SELECT name from nba where salary<1148640) TO WRAPPER orc_fdw OPTIONS (LOCATION = '/tmp/file.orc');

AVRO

The compression algorithm used for exporting data from SQream to Parquet files is Snappy.

Exporting tables to AVRO files:

COPY nba TO WRAPPER avro_fdw OPTIONS (LOCATION = '/tmp/nba_export.avro');

Exporting query results to AVRO files:

COPY (SELECT name from nba where salary<1148640) TO WRAPPER avro_fdw OPTIONS (LOCATION = '/tmp/file.avro');

CSV

Exporting a table to a CSV file without a HEADER row:

COPY nba TO WRAPPER csv_fdw OPTIONS (LOCATION = '/tmp/nba_export.csv', DELIMITER = ',', HEADER = false);
$ head -n6 nba.csv
Avery Bradley,Boston Celtics,0,PG,25,6-2,180,Texas,7730337
Jae Crowder,Boston Celtics,99,SF,25,6-6,235,Marquette,6796117
John Holland,Boston Celtics,30,SG,27,6-5,205,Boston University,\N
R.J. Hunter,Boston Celtics,28,SG,22,6-5,185,Georgia State,1148640
Jonas Jerebko,Boston Celtics,8,PF,29,6-10,231,\N,5000000
Amir Johnson,Boston Celtics,90,PF,29,6-9,240,\N,12000000

Exporting a table to a CSV file with a HEADER row:

COPY nba TO WRAPPER csv_fdw OPTIONS (LOCATION = '/tmp/nba_export.csv', DELIMITER = ',', HEADER = true);
$ head -n6 nba_h.csv
Name,Team,Number,Position,Age,Height,Weight,College,Salary
Avery Bradley,Boston Celtics,0,PG,25,6-2,180,Texas,7730337
Jae Crowder,Boston Celtics,99,SF,25,6-6,235,Marquette,6796117
John Holland,Boston Celtics,30,SG,27,6-5,205,Boston University,\N
R.J. Hunter,Boston Celtics,28,SG,22,6-5,185,Georgia State,1148640
Jonas Jerebko,Boston Celtics,8,PF,29,6-10,231,\N,5000000

Exporting the result of a query to a CSV file:

COPY (SELECT Team, AVG(Salary) FROM nba GROUP BY 1) TO WRAPPER csv_fdw OPTIONS (LOCATION = '/tmp/nba_export.csv');
$ head -n5 nba_salaries.csv
Atlanta Hawks,4860196
Boston Celtics,4181504
Brooklyn Nets,3501898
Charlotte Hornets,5222728
Chicago Bulls,5785558

TSV

Exporting a table to a TSV file with a HEADER row:

COPY nba TO WRAPPER csv_fdw OPTIONS (LOCATION = '/tmp/nba_export.csv', DELIMITER = '\t', HEADER = true);
$ head -n6 nba_h.tsv
Name    Team    Number  Position        Age     Height  Weight  College Salary
Avery Bradley   Boston Celtics  0       PG      25      6-2     180     Texas  7730337
Jae Crowder     Boston Celtics  99      SF      25      6-6     235     Marquette       6796117
John Holland    Boston Celtics  30      SG      27      6-5     205     Boston University       \N
R.J. Hunter     Boston Celtics  28      SG      22      6-5     185     Georgia State   1148640
Jonas Jerebko   Boston Celtics  8       PF      29      6-10    231     \N     5000000

Exporting Data From SQream to Cloud Storage

The following is an example of saving files to an authenticated S3 bucket:

COPY (SELECT "Team", AVG("Salary") FROM nba GROUP BY 1) TO WRAPPER csv_fdw OPTIONS (LOCATION = 's3://my_bucket/salaries/nba_export.csv', AWS_ID = 'my_aws_id', AWS_SECRET = 'my_aws_secret');

The following is an example of saving files to an HDFS path:

COPY (SELECT "Team", AVG("Salary") FROM nba GROUP BY 1) TO WRAPPER csv_fdw OPTIONS (LOCATION = 'hdfs://pp_namenode:8020/nba_export.csv');

Using Non-Printable ASCII Characters as Delimiters

The following is an example of using non-printable ASCII characters as delimiters:

Non-printable characters can be specified using their octal representations, by using the E'\000' format, where 000 is the octal value of the character.

For example, ASCII character 15, known as “shift in”, can be specified using E'\017'.

COPY nba TO WRAPPER csv_fdw OPTIONS (LOCATION = '/tmp/nba_export.csv', DELIMITER = E'\017');
COPY nba TO WRAPPER csv_fdw OPTIONS (LOCATION = '/tmp/nba_export.csv', DELIMITER = E'\011'); -- 011 is a tab character

Using the MAX_FILE_SIZE and ENFORCE_SINGLE_FILE parameters:

COPY nba TO WRAPPER csv_fdw OPTIONS(
        max_file_size = '250000000',
        enforce_single_file = 'true',
        location = '/tmp/nba_export.parquet'
);

Permissions

The role must have the SELECT permission on every table or schema that is referenced by the statement.