.. _json: **** JSON **** JSON (Java Script Object Notation) is used both as a file format and as a serialization method. The JSON file format is flexible and is commonly used for dynamic, nested, and semi-structured data representations. The BLUE JSON parser supports the `RFC 8259 `_ data interchange format and supports both JSON objects and JSON object arrays. Only the `JSON Lines `_ data format is supported by BLUE. Foreign Data Wrapper Prerequisites ================================== Before proceeding, ensure the following Foreign Data Wrapper (FDW) prerequisites: * **File Existence:** Verify that the file you are ingesting data from exists at the specified path. * **Path Accuracy:** Confirm that all path elements are present and correctly spelled. Any inaccuracies may lead to data retrieval issues. * **Bucket Access Permissions:** Ensure that you have the necessary access permissions to the bucket from which you are ingesting data. Lack of permissions can hinder the data retrieval process. * **Wildcard Accuracy:** If using wildcards, double-check their spelling and configuration. Misconfigured wildcards may result in unintended data ingestion. Mapping between JSON and BLUE ============================= A JSON field consists of a key name and a value. Key names, which are case sensitive, are mapped to BLUE columns. Key names which do not have corresponding BLUE table columns are treated as errors by default, unless the ``IGNORE_EXTRA_FIELDS`` parameter is set to ``true``, in which case these key names will be ignored during the mapping process. BLUE table columns which do not have corresponding JSON fields are automatically set to ``null`` as a value. Values may be one of the following reserved words (lower-case): ``false``, ``true``, or ``null``, or any of the following data types: .. list-table:: :widths: auto :header-rows: 1 * - JSON Data Type - Representation in BLUE * - Number - ``TINYINT``, ``SMALLINT``, ``INT``, ``BIGINT``, ``FLOAT``, ``DOUBLE``, ``NUMERIC`` * - String - ``TEXT`` * - JSON Literal - ``NULL``, ``TRUE``, ``FALSE`` * - JSON Array - ``TEXT`` * - JSON Object - ``TEXT`` Character Escaping ------------------ The ASCII 10 character (LF) marks the end of JSON objects. Use ``\\n`` to escape the ``\n`` character when you do not mean it be a new line. Parameters ------------ The following parameters are supported by ``json_fdw``: .. list-table:: :widths: auto :header-rows: 1 * - Parameter - Description * - ``DATETIME_FORMAT`` - Default format is ``yyyy-mm-dd``. Other supported date formats are:``iso8601``, ``iso8601c``, ``dmy``, ``ymd``, ``mdy``, ``yyyymmdd``, ``yyyy-m-d``, ``yyyy-mm-dd``, ``yyyy/m/d``, ``yyyy/mm/dd``, ``d/m/yyyy``, ``dd/mm/yyyy``, ``mm/dd/yyyy``, ``dd-mon-yyyy``, ``yyyy-mon-dd``. * - ``IGNORE_EXTRA_FIELDS`` - Default value is ``false``. When value is ``true``, key names which do not have corresponding BLUE table columns will be ignored. Parameter may be used with the ``COPY TO`` and ``IGNORE FOREIGN TABLE`` statements. * - ``COMPRESSION`` - Supported values are ``auto``, ``gzip``, and ``none``. ``auto`` means that the compression type is automatically detected upon import. Parameter is not supported for exporting. ``gzip`` means that a ``gzip`` compression is applied. ``none`` means that no compression or an attempt to decompress will take place. * - ``LOCATION`` - A path on the local filesystem, on S3, or on HDFS URI. The local path must be an absolute path that BLUE can access. * - ``LIMIT`` - When specified, tells BLUE to stop ingesting after the specified number of rows. Unlimited if unset. * - ``OFFSET`` - The row number from which to start ingesting. * - ``ERROR_LOG`` - If when using the ``COPY`` command, copying a row fails, the ``ERROR LOG`` command writes error information to the error log specified in the ``ERROR LOG`` command. * If an existing file path is specified, the file will be overwritten. * Specifying the same file for ``ERROR_LOG`` and ``REJECTED_DATA`` is not allowed and will result in error. * Specifying an error log when creating a foreign table will write a new error log for every query on the foreign table. * - ``CONTINUE_ON_ERROR`` - Specifies if errors should be ignored or skipped. When set to true, the transaction will continue despite rejected data. This parameter should be set together with ``ERROR_COUNT``. When reading multiple files, if an entire file cannot be opened, it will be skipped. * - ``ERROR_COUNT`` - Specifies the maximum number of faulty records that will be ignored. This setting must be used in conjunction with ``continue_on_error``. * - ``MAX_FILE_SIZE`` - Sets the maximum file size (bytes). * - ``ENFORCE_SINGLE_FILE`` - Permitted values are ``true`` or ``false``. When set to ``true``, a single file of unlimited size is created. This single file is not limited by the ``MAX_FILE_SIZE`` parameter. ``false`` permits creating several files together limited by the ``MAX_FILE_SIZE`` parameter. Default value: ``false``. * - ``AWS_ID``, ``AWS_SECRET`` - Specifies the authentication details for secured S3 buckets. Automatic Schema Inference -------------------------- You may let BLUE automatically infer the schema of a foreign table when using ``json_fdw``. Automatic Schema Inference example: .. code-block:: postgres CREATE FOREIGN TABLE t WRAPPER json_fdw OPTIONS (LOCATION = 's3://sqream-docs/nba.json'); Examples -------- JSON object array: .. code-block:: json { "name":"Avery Bradley", "age":25, "position":"PG" } { "name":"Jae Crowder", "age":25, "position":"PG" } { "name":"John Holland", "age":27, "position":"SG" } JSON objects: .. code-block:: json [ { "name":"Avery Bradley", "age":25, "position":"PG" }, { "name":"Jae Crowder", "age":25, "position":"SF" }, { "name":"John Holland", "age":27, "position":"SG" } ] When using the ``CREATE FOREIGN TABLE`` statement, make sure that the table schema corresponds with the JSON file structure. .. code-block:: postgres CREATE FOREIGN TABLE t (ID INT NOT NULL) WRAPPER json_fdw OPTIONS (location = 's3://sqream-docs/nba.json'); .. tip:: An exact match must exist between the BLUE and JSON types. For unsupported column types, you can set the type to any type and exclude it from subsequent queries.