:orphan: .. _copy_from: ********* COPY FROM ********* ``COPY ... FROM`` is a statement that allows loading data from external files and importing them into BLUE. This is the recommended way for bulk loading files into BLUE. Syntax ====== .. code-block:: postgres COPY [ "". ]"" [ () [ ,...] ] FROM WRAPPER OPTIONS ( [ [, ...] ] ) copy_from_option ::= LOCATION = { GCP URI | S3 URI | HDFS URI } | QUOTE = {'C' | E'\ooo') | OFFSET = { offset } | LIMIT = { limit } | DELIMITER = '{ delimiter }' | RECORD_DELIMITER = '{ record delimiter }' | ERROR_LOG = '{ local filepath }' | REJECTED_DATA = '{ local filepath }' | CONTINUE_ON_ERROR = { true | false } | ERROR_COUNT = '{ error count }' | DATETIME_FORMAT = '{ parser format }' | AWS_ID = '{ AWS ID }' | AWS_SECRET = '{ AWS Secret }' | DELETE_SOURCE_ON_SUCCESS = { true | false } offset ::= positive integer limit ::= positive integer delimiter ::= string record delimiter ::= string error count ::= integer parser_format ::= see supported parser table below AWS ID ::= string AWS Secret ::= string .. _copy_from_config_options: Elements ======== .. list-table:: :widths: auto :header-rows: 1 * - Parameter - Default value - Value range - Description * - ``[schema_name.]table_name`` - None - - Table to copy data into * - ``QUOTE`` - ``"`` - - Specifies an alternative quote character. The quote character must be a single, 1-byte printable ASCII character, and the equivalent octal syntax of the copy command can be used. The quote character cannot be contained in the field delimiter, the record delimiter, or the null marker. ``QUOTE`` can be used with ``csv_fdw`` in ``COPY FROM`` and foreign tables. The following characters cannot be an alternative quote character: ``"-.:\\0123456789abcdefghijklmnopqrstuvwxyzN"`` * - ``fdw_name`` - - ``csv_fdw``, ``orc_fdw``, ``parquet_fdw``, ``json_fdw``, or ``avro_fdw`` - The name of the Foreign Data Wrapper to use * - ``LOCATION`` - None - - A path to S3, or HDFS URI. For example: ``s3://my-bucket/foo.csv``, or ``hdfs://my-namenode:8020/foo.csv``. Wildcards are premitted in this field. * - ``OFFSET`` - ``1`` - >1, but no more than the number of lines in the first file. - The row number to start with. The first row is ``1``. Applicable to CSVs only * - ``LIMIT`` - unlimited - 1 to 2147483647. - When specified, tells Blue to stop loading after the specified number of rows. Unlimited if unset. Applicable to CSVs only * - ``DELIMITER`` - ``','`` - Almost any ASCII character, :ref:`See field delimiters section below` - Specifies the field terminator - the character (or characters) that separates fields or columns within each row of the file. Applicable to CSVs only * - ``RECORD_DELIMITER`` - ``\n`` (UNIX style newline) - ``\n``, ``\r\n``, ``\r`` - Specifies the row terminator - the character that separates lines or rows, also known as a new line separator. Applicable to CSVs only * - ``ERROR_LOG`` - No error log - - When used, the ``COPY`` process will write error information from unparsable rows to the file specified by this parameter. ``ERROR_LOG`` requires ``CONTINUE_ON_ERROR`` to be set to ``true`` * If an existing file path is specified, it will be overwritten. * Specifying the same file for ``ERROR_LOG`` and ``REJECTED_DATA`` is not allowed and will result in error. * Specifing an error log when creating a foreign table will write a new error log for every query on the foreign table. * Applicable to CSVs only * - ``REJECTED_DATA`` - Inactive - - When used, the ``COPY`` process will write the rejected record lines to this file. * If an existing file path is specified, it will be overwritten. * Specifying the same file for ``ERROR_LOG`` and ``REJECTED_DATA`` is not allowed and will result in error. * Specifing an error log when creating a foreign table will write a new error log for every query on the foreign table. * - ``CONTINUE_ON_ERROR`` - ``false`` - ``true`` | ``false`` - Specifies if errors should be ignored or skipped. When set to ``true``, the transaction will continue despite rejected data. This parameter should be set together with ``ERROR_COUNT`` When reading multiple files, if an entire file can't be opened it will be skipped. * - ``ERROR_COUNT`` - ``unlimited`` - 1 to 2147483647 - Specifies the threshold for the maximum number of faulty records that will be ignored. This setting must be used in conjunction with ``CONTINUE_ON_ERROR``. * - ``DATETIME_FORMAT`` - ISO8601 for all columns - :ref:`See table below` - Allows specifying a non-default date formats for specific columns. Applicable to CSVs only * - ``AWS_ID``, ``AWS_SECRET`` - None - - Specifies the authentication details for secured S3 buckets * - ``DELETE_SOURCE_ON_SUCCESS`` - ``false`` - ``true`` | ``false`` - When set to ``true``, the source file or files associated with the target path will be deleted after a successful completion of the ``COPY FROM`` operation. File deletion will not occur in the case of unsuccessful ``COPY FROM`` operations, such as when a user lacks delete permissions on their operating system. It's important to note that this parameter cannot be used concurrently with the ``OFFSET``, ``ERROR_LOG``, ``REJECTED_DATA``, ``ERROR_COUNT``, and ``LIMIT`` parameters. This parameter is supported for S3, HDFS, and GCP Object Storage. .. _field_delimiters: Field Delimiters ================ Field delimiters can be one or more characters. Printable ASCII Characters -------------------------- All printable ASCII character (except for ``N``) can be used as a delimiter without special syntax. The default CSV field delimiter is a comma (``,``). A printable character is any ASCII character in the range 32 - 126. The following table shows the supported printable ASCII characters: +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | **Character** | **Description** | **ASCII** | **Octal** | **Hex** | **Binary** | **HTML Code** | **HTML Name** | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | (Space) | Space | 32 | 40 | 20 | 100000 | | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | ! | Exclamation Mark | 33 | 41 | 21 | 100001 | ! | ! | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | # | Hash or Number | 35 | 43 | 23 | 100011 | # | # | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | $ | Dollar Sign | 36 | 44 | 24 | 100100 | $ | $ | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | % | Percentage | 37 | 45 | 25 | 100101 | % | % | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | & | Ampersand | 38 | 46 | 26 | 100110 | & | & | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | ( | Left Parenthesis | 40 | 50 | 28 | 101000 | ( | ( | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | ) | Right Parenthesis | 41 | 51 | 29 | 101001 | ) | ) | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | \*\ | Asterisk | 42 | 52 | 2A | 101010 | * | * | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | \+\ | Plus Sign | 43 | 53 | 2B | 101011 | + | + | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | , | Comma | 44 | 54 | 2C | 101100 | , | , | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | / | Slash | 47 | 57 | 2F | 101111 | / | / | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | ; | Semicolon | 59 | 73 | 3B | 111011 | ; | ; | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | < | Less Than | 60 | 74 | 3C | 111100 | < | < | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | = | Equals Sign | 61 | 75 | 3D | 111101 | = | = | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | > | Greater Than | 62 | 76 | 3E | 111110 | > | > | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | ? | Question Mark | 63 | 77 | 3F | 111111 | ? | ? | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | @ | At Sign | 64 | 100 | 40 | 1000000 | @ | @ | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | [ | Left Square Bracket | 91 | 133 | 5B | 1011011 | [ | [ | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | \\ | Backslash | 92 | 134 | 5C | 1011100 | \&\#92\; | \ | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | ] | Right Square Bracket | 93 | 135 | 5D | 1011101 | ] | ] | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | ^ | Caret or Circumflex | 94 | 136 | 5E | 1011110 | ^ | &hat; | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | _ | Underscore | 95 | 137 | 5F | 1011111 | _ | _ | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | \` | Grave Accent | 96 | 140 | 60 | 1100000 | \&\#96\; | ` | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | { | Left Curly Bracket | 123 | 173 | 7B | 1111011 | { | { | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | \|\ | Vertical Bar | 124 | 174 | 7C | 1111100 | | | | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | } | Right Curly Bracket | 125 | 175 | 7D | 1111101 | } | } | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | ~ | Tilde | 126 | 176 | 7E | 1111110 | ~ | ˜ | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 58 | : | Colon | 72 | 3A | 111010 | : | : | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 65 | A | A | 101 | 41 | 1000001 | A | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 66 | B | B | 102 | 42 | 1000010 | B | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 67 | C | C | 103 | 43 | 1000011 | C | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 68 | D | D | 104 | 44 | 1000100 | D | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 69 | E | E | 105 | 45 | 1000101 | E | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 70 | F | F | 106 | 46 | 1000110 | F | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 71 | G | G | 107 | 47 | 1000111 | G | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 72 | H | H | 110 | 48 | 1001000 | H | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 73 | I | I | 111 | 49 | 1001001 | I | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 74 | J | J | 112 | 4A | 1001010 | J | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 75 | K | K | 113 | 4B | 1001011 | K | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 76 | L | L | 114 | 4C | 1001100 | L | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 77 | M | M | 115 | 4D | 1001101 | M | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 79 | O | O | 117 | 4F | 1001111 | O | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 80 | P | P | 120 | 50 | 1010000 | P | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 81 | Q | Q | 121 | 51 | 1010001 | Q | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 82 | R | R | 122 | 52 | 1010010 | R | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 83 | S | S | 123 | 53 | 1010011 | S | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 84 | T | T | 124 | 54 | 1010100 | T | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 85 | U | U | 125 | 55 | 1010101 | U | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 86 | V | V | 126 | 56 | 1010110 | V | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 87 | W | W | 127 | 57 | 1010111 | W | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 88 | X | X | 130 | 58 | 1011000 | X | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 89 | Y | Y | 131 | 59 | 1011001 | Y | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 90 | Z | Z | 132 | 5A | 1011010 | Z | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 92 | \\ | Backslash | 134 | 5C | 01011100 | \&\#92\; | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 97 | a | a | 141 | 61 | 1100001 | a | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 98 | b | b | 142 | 62 | 1100010 | b | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 99 | c | c | 143 | 63 | 1100011 | c | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 100 | d | d | 144 | 64 | 1100100 | d | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 101 | e | e | 145 | 65 | 1100101 | e | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 102 | f | f | 146 | 66 | 1100110 | f | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 103 | g | g | 147 | 67 | 1100111 | g | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 104 | h | h | 150 | 68 | 1101000 | h | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 105 | i | i | 151 | 69 | 1101001 | i | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 106 | j | j | 152 | 6A | 1101010 | j | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 107 | k | k | 153 | 6B | 1101011 | k | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 108 | l | l | 154 | 6C | 1101100 | l | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 109 | m | m | 155 | 6D | 1101101 | m | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 110 | n | n | 156 | 6E | 1101110 | n | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 111 | o | o | 157 | 6F | 1101111 | o | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 112 | p | p | 160 | 70 | 1110000 | p | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 113 | q | q | 161 | 71 | 1110001 | q | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 114 | r | r | 162 | 72 | 1110010 | r | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 115 | s | s | 163 | 73 | 1110011 | s | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 116 | t | t | 164 | 74 | 1110100 | t | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 117 | u | u | 165 | 75 | 1110101 | u | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 118 | v | v | 166 | 76 | 1110110 | v | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 119 | w | w | 167 | 77 | 1110111 | w | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 120 | x | x | 170 | 78 | 1111000 | x | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 121 | y | y | 171 | 79 | 1111001 | y | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ | 122 | z | z | 172 | 7A | 1111010 | z | | +---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ Non-Printable ASCII Characters ------------------------------ A tab can be specified by escaping it, for example ``\t``. Other non-printable characters can be specified using their octal representations, by using the ``E'\000'`` format, where ``000`` is the octal value of the character. For example, ASCII character ``15``, known as "shift in", can be specified using ``E'\017'``. The following table shows the supported non-printable ASCII characters: +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | **Character** | **Description** | **Octal** | **ASCII** | **Hex** | **Binary** | **HTML Code** | **HTML Name** | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | NUL | Null | 0 | 0 | 0 | 0 | � | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | SOH | Start of Heading | 1 | 1 | 1 | 1 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | STX | Start of Text | 2 | 2 | 2 | 10 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | ETX | End of Text | 3 | 3 | 3 | 11 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | EOT | End of Transmission | 4 | 4 | 4 | 100 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | ENQ | Enquiry | 5 | 5 | 5 | 101 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | ACK | Acknowledge | 6 | 6 | 6 | 110 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | BEL | Bell | 7 | 7 | 7 | 111 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | BS | Backspace | 10 | 8 | 8 | 1000 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | HT | Horizontal Tab | 11 | 9 | 9 | 1001 | | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | VT | Vertical Tab | 13 | 11 | 0B | 1011 | | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | FF | NP Form Feed, New Page | 14 | 12 | 0C | 1100 | | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | SO | Shift Out | 16 | 14 | 0E | 1110 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | SI | Shift In | 17 | 15 | 0F | 1111 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | DLE | Data Link Escape | 20 | 16 | 10 | 10000 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | DC1 | Device Control 1 | 21 | 17 | 11 | 10001 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | DC2 | Device Control 2 | 22 | 18 | 12 | 10010 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | DC3 | Device Control 3 | 23 | 19 | 13 | 10011 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | DC4 | Device Control 4 | 24 | 20 | 14 | 10100 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | NAK | Negative Acknowledge | 25 | 21 | 15 | 10101 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | SYN | Synchronous Idle | 26 | 22 | 16 | 10110 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | ETB | End of Transmission Block | 27 | 23 | 17 | 10111 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | CAN | Cancel | 30 | 24 | 18 | 11000 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | EM | End of Medium | 31 | 25 | 19 | 11001 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | SUB | Substitute | 32 | 26 | 1A | 11010 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | ESC | Escape | 33 | 27 | 1B | 11011 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | FS | File Separator | 34 | 28 | 1C | 11100 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | GS | Group Separator | 35 | 29 | 1D | 11101 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | RS | Record Separator | 36 | 30 | 1E | 11110 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | US | Unit Separator | 37 | 31 | 1F | 11111 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ | DEL | Delete | 177 | 127 | 7F | 1111111 |  | | +---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ .. _capturing_rejected_rows: Unsupported ASCII Field Delimiters ---------------------------------- The following table shows the unsupported ASCII field delimiters: +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | **ASCII** | **Character** | **Description** | **Octal** | **Hex** | **Binary** | **HTML Code** | **HTML Name** | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 10 | LF | NL Line Feed, New Line | 12 | 0A | 1010 | | | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 13 | CR | Carriage Return | 15 | 0D | 1101 | | | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 34 | " | Double Quote | 42 | 22 | 100010 | " | " | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 45 | \-\ | Minus Sign | 55 | 2D | 101101 | - | − | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 46 | . | Period | 56 | 2E | 101110 | . | . | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 48 | 0 | Zero | 60 | 30 | 110000 | 0 | | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 49 | 1 | Number One | 61 | 31 | 110001 | 1 | | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 50 | 2 | Number Two | 62 | 32 | 110010 | 2 | | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 51 | 3 | Number Three | 63 | 33 | 110011 | 3 | | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 52 | 4 | Number Four | 64 | 34 | 110100 | 4 | | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 53 | 5 | Number Five | 65 | 35 | 110101 | 5 | | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 54 | 6 | Number Six | 66 | 36 | 110110 | 6 | | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 55 | 7 | Number Seven | 67 | 37 | 110111 | 7 | | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 56 | 8 | Number Eight | 70 | 38 | 111000 | 8 | | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 57 | 9 | Number Nine | 71 | 39 | 111001 | 9 | | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 58 | : | Colon | 72 | 3A | 111010 | : | : | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 92 | \\ | Backslash | 134 | 5C | 01011100 | \&\#92\; | | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ | 78 | N | N | 116 | 4E | 1001110 | N | | +-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ Capturing Rejected Rows ======================= Prior to the column process and storage, the ``COPY`` command parses the data. Whenever the data can’t be parsed because it is improperly formatted or doesn’t match the data structure, the entire record (or row) will be rejected. When ``ERROR_LOG`` is not used, the ``COPY`` command will stop and roll back the transaction upon the first error. .. image:: /_static/images/copy_from_rejected_rows.png :width: 50% CSV Support =========== By default, Blue's CSV parser can handle `RFC 4180 standard CSVs `_ , but can also be modified to support non-standard CSVs (with multi-character delimiters, unquoted fields, etc). All CSV files should be prepared according to these recommendations: * Files are UTF-8 or ASCII encoded * Field delimiter is an ASCII character or characters * Record delimiter, also known as a new line separator, is a Unix-style newline (``\n``), DOS-style newline (``\r\n``), or Mac style newline (``\r``). * Fields are optionally enclosed by double-quotes, or mandatory quoted if they contain one of the following characters: * The record delimiter or field delimiter * A double quote character * A newline * If a field is quoted, any double quote that appears must be double-quoted (similar to the :ref:`string literals quoting rules`. For example, to encode ``What are "birds"?``, the field should appear as ``"What are ""birds""?"``. Other modes of escaping are not supported (e.g. ``1,"What are \"birds\"?"`` is not a valid way of escaping CSV values). Marking Null Markers -------------------- ``NULL`` values can be marked in two ways in the CSV: * An explicit null marker. For example, ``col1,\N,col3`` * An empty field delimited by the field delimiter. For example, ``col1,,col3`` .. note:: If a text field is quoted but contains no content (``""``) it is considered an empty text field. It is not considered ``NULL``. .. _copy_date_parsers: Supported Date Formats ====================== .. list-table:: Supported date parsers :widths: auto :header-rows: 1 * - Name - Pattern - Examples * - ``ISO8601``, ``DEFAULT`` - ``YYYY-MM-DD [hh:mm:ss[.SSS]]`` - ``2017-12-31 11:12:13.456``, ``2018-11-02 11:05:00``, ``2019-04-04`` * - ``ISO8601C`` - ``YYYY-MM-DD [hh:mm:ss[:SSS]]`` - ``2017-12-31 11:12:13:456`` * - ``DMY`` - ``DD/MM/YYYY [hh:mm:ss[.SSS]]`` - ``31/12/2017 11:12:13.123`` * - ``YMD`` - ``YYYY/MM/DD [hh:mm:ss[.SSS]]`` - ``2017/12/31 11:12:13.678`` * - ``MDY`` - ``MM/DD/YYYY [hh:mm:ss[.SSS]]`` - ``12/31/2017 11:12:13.456`` * - ``YYYYMMDD`` - ``YYYYMMDD[hh[mm[ss[SSS]]]]`` - ``20171231111213456`` * - ``YYYY-M-D`` - ``YYYY-M-D[ h:m[:s[.S]]]`` - ``2017-9-10 10:7:21.1`` (optional leading zeroes) * - ``YYYY/M/D`` - ``YYYY/M/D[ h:m[:s[.S]]]`` - ``2017/9/10 10:7:21.1`` (optional leading zeroes) * - ``DD-mon-YYYY`` - ``DD-mon-YYYY[ hh:mm[:ss[.SSS]]]`` - ``31-Dec-2017 11:12:13.456`` * - ``YYYY-mon-DD`` - ``YYYY-mon-DD[ hh:mm[:ss[.SSS]]]`` - ``2017-Dec-31 11:12:13.456`` .. list-table:: :widths: auto :header-rows: 1 * - Pattern - Description * - ``YYYY`` - four digit year representation (0000-9999) * - ``MM`` - two digit month representation (01-12) * - ``DD`` - two digit day of month representation (01-31) * - ``m`` - short month representation (Jan-Dec) * - ``a`` - short day of week representation (Sun-Sat). * - ``hh`` - two digit 24 hour representation (00-23) * - ``h`` - two digit 12 hour representation (00-12) * - ``P`` - uppercase AM/PM representation * - ``mm`` - two digit minute representation (00-59) * - ``ss`` - two digit seconds representation (00-59) * - ``SSS`` - 3 digits fraction representation for milliseconds (000-999) .. note:: These date patterns are not the same as date parts used in the :ref:`datepart` function. Examples ======== Skipping Faulty Rows -------------------- .. code-block:: postgres COPY new_nba FROM WRAPPER csv_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.csv', CONTINUE_ON_ERROR = true ); -- Skipping a maximum of 100 faulty rows COPY new_nba FROM WRAPPER csv_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.csv', CONTINUE_ON_ERROR = true, ERROR_COUNT = 100 ); Loading a Pipe and Tab Separated Value Files -------------------------------------------- .. code-block:: postgres -- Pipe separated COPY new_nba FROM WRAPPER csv_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.csv', DELIMITER = '|' ); -- Tab separated COPY new_nba FROM WRAPPER csv_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.csv', DELIMITER = '\t' ); Ingesting Data from different File Formats ------------------------------------------ This example demonstrates data ingestion from a JSON file format, which can be adapted for all file formats supported by BLUE. You can achieve this by adjusting the data wrapper and file name extension to match the relevant file format. .. code-block:: postgres COPY new_nba FROM WRAPPER json_fdw OPTIONS (LOCATION = 's3://sqream-docs/nba.json'); Loading Data from Cloud Storage ------------------------------- This example demonstrates data ingestion from S3, which can be adapted for all cloud storage platforms supported by BLUE. You can achieve this by adjusting the ``LOCATION`` parameter with the relevant platform bucket and file URI. .. code-block:: postgres COPY new_nba FROM WRAPPER csv_fdw OPTIONS (LOCATION = 's3://sqream-docs/nba.csv'); Using Credentials ----------------- .. code-block:: postgres COPY new_nba FROM WRAPPER json_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.json', AWS_ID = '12345678', AWS_SECRET = 'super_secretive_secret' ); Loading a Text File with Non-Printable Delimiters ------------------------------------------------- In the file below, the separator is ``DC1``, which is represented by ASCII 17 decimal or 021 octal. .. code-block:: postgres COPY new_nba FROM WRAPPER psv_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.psv', DELIMITER = E '\021' ); Loading a Text File with Multi-Character Delimiters --------------------------------------------------- .. code-block:: postgres -- In the file below, the separator is ^| COPY new_nba FROM WRAPPER psv_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.psv', DELIMITER = '^|' ); -- In the file below, the separator is ``'|``. The quote character has to be repeated COPY new_nba FROM WRAPPER psv_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.psv', DELIMITER = ''''|' ); Dealing with a Header Rows -------------------------- Use ``OFFSET`` to skip rows. .. note:: When loading multiple files (e.g. with wildcards), this setting affects each file separately. .. code-block:: postgres COPY new_nba FROM WRAPPER json_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.json', DELIMITER = '|', OFFSET = 2 ); Using the ``DELETE_SOURCE_ON_SUCCESS`` Parameter ------------------------------------------------ .. code-block:: sql -- Single file: COPY new_nba FROM WRAPPER json_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.json', DELETE_SOURCE_ON_SUCCESS = true ); -- Multiple files: COPY new_nba FROM WRAPPER json_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba*.json', DELETE_SOURCE_ON_SUCCESS = true ); Saving Rejected Rows to a File ---------------------------------- When loading multiple files (e.g. with wildcards), this error threshold is for the entire transaction. .. code-block:: postgres COPY new_nba FROM WRAPPER csv_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.csv', CONTINUE_ON_ERROR = true, ERROR_LOG = 's3://sqream-docs/log.csv' ); COPY new_nba FROM WRAPPER csv_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.csv', DELIMITER = '|', ERROR_LOG = 's3://sqream-docs/log.csv' -- Save error log, REJECTED_DATA = 's3://sqream-docs/load_rejected.log', -- Only save rejected rows LIMIT = 100, -- Only load 100 rows ERROR_COUNT = 5 -- Stop the load if 5 errors reached ); Rearranging Destination Columns --------------------------------- When the source of the files does not match the table structure, tell the ``COPY`` command what the order of columns should be .. code-block:: postgres COPY new_nba (fifth, first, third) FROM WRAPPER csv_fdw OPTIONS (LOCATION = 's3://sqream-docs/nba.csv'); .. note:: Any column not specified will revert to its default value or ``NULL`` value if nullable Loading Non-Standard Dates -------------------------- If your files contain dates in a format other than ``ISO8601``, you can specify a :ref:`parsing` format to convert them during the import process. This ensures the dates are stored internally as ``ISO8601`` within the database. In this example, ``date_col1`` and ``date_col2`` in the table are non-standard. ``date_col3`` is mentioned explicitly, but can be left out. Any column that is not specified is assumed to be ``ISO8601``. .. code-block:: postgres COPY new_nba (date_col1, date_col2, date_col3) FROM WRAPPER csv_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.csv', DATETIME_FORMAT = 'DMY' ); Customizing Quotations Using Alternative Characters --------------------------------------------------- .. code-block:: postgres COPY t FROM WRAPPER csv_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.csv', QUOTE='@' ); Customizing Quotations Using ASCII Character Codes -------------------------------------------------- .. code-block:: postgres COPY t FROM WRAPPER csv_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.csv', QUOTE=E'\064' ); Multi-Character Delimiters -------------------------- Multi-character field delimiters, sometimes found in non-standard files, are supported. .. code-block:: postgres -- Setting %% as a delimiter COPY t FROM WRAPPER csv_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.csv', DELIMITER = '%%' ); -- Setting {~} as a delimiter COPY t FROM WRAPPER csv_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.csv', DELIMITER = '{~}' ); Loading Specific Columns ------------------------ Loading specific columns using the ``COPY FROM`` command: * Does not support CSV files * Requires that the target table columns be nullable .. code-block:: postgres COPY new_nba (player_name, salary) FROM WRAPPER parquet_fdw OPTIONS ( LOCATION = 's3://sqream-docs/nba.parquet' ); Permissions =========== The role must have the ``INSERT`` permission to the destination table.