Deleting Data
The Deleting Data page describes how the Delete statement works and how to maintain data that you delete:
Overview
Deleting data typically refers to deleting rows, but can refer to deleting other table content as well. The general workflow for deleting data is to delete data followed by triggering a cleanup operation. The cleanup operation reclaims the space occupied by the deleted rows, discussed further below.
The DELETE statement deletes rows defined by a predicate that you have specified, preventing them from appearing in subsequent queries.
For example, the predicate below defines and deletes rows containing animals heavier than 1000 weight units:
farm=> DELETE FROM cool_animals WHERE weight > 1000;
The major benefit of the DELETE statement is that it deletes transactions simply and quickly.
The Deletion Process
Deleting rows occurs in the following two phases:
Phase 1 - Deletion - All rows you mark for deletion are ignored when you run any query. These rows are not deleted until the clean-up phase.
Phase 2 - Clean-up - The rows you marked for deletion in Phase 1 are physically deleted. The clean-up phase is not automated, letting users or DBAs control when to activate it. The files you marked for deletion during Phase 1 are removed from disk, which you do by by sequentially running the utility function commands
CLEANUP_CHUNKS
andCLEANUP_EXTENTS
.
Usage Notes
The Usage Notes section includes important information about the DELETE statement:
General Notes
This section describes the general notes applicable when deleting rows:
The ALTER TABLE command and other DDL operations are locked on tables that require clean-up. If the estimated clean-up time exceeds the permitted threshold, an error message is displayed describing how to override the threshold limitation. For more information, see Concurrency and Locks.
If the number of deleted records exceeds the threshold defined by the
mixedColumnChunksThreshold
parameter, the delete operation is aborted. This alerts users that the large number of deleted records may result in a large number of mixed chunks. To circumvent this alert, use the following syntax (replacingXXX
with the desired number of records) before running the delete operation:set mixedColumnChunksThreshold=XXX;
Deleting Data does not Free Space
With the exception of running a full table delete, deleting data does not free unused disk space. To free unused disk space you must trigger the clean-up process.
For more information on running a full table delete, see TRUNCATE.
For more information on freeing disk space, see Triggering a Clean-Up.
Clean-Up Operations Are I/O Intensive
The clean-up process reduces table size by removing all unused space from column chunks. While this reduces query time, it is a time-costly operation occupying disk space for the new copy of the table until the operation is complete.
Tip
Because clean-up operations can create significant I/O load on your database, consider using them sparingly during ideal times.
If this is an issue with your environment, consider using CREATE TABLE AS
to create a new table and then rename and drop the old table.
Examples
The Examples section includes the following examples:
Deleting Rows from a Table
The following example shows how to delete rows from a table.
Display the table:
farm=> SELECT * FROM cool_animals;
The following table is displayed:
1,Dog ,7 2,Possum ,3 3,Cat ,5 4,Elephant ,6500 5,Rhinoceros ,2100 6,\N,\N
Delete rows from the table:
farm=> DELETE FROM cool_animals WHERE weight > 1000;
Display the table:
farm=> SELECT * FROM cool_animals;
The following table is displayed:
1,Dog ,7 2,Possum ,3 3,Cat ,5 6,\N,\N
Deleting Values Based on Complex Predicates
The following example shows how to delete values based on complex predicates.
Display the table:
farm=> SELECT * FROM cool_animals;
The following table is displayed:
1,Dog ,7 2,Possum ,3 3,Cat ,5 4,Elephant ,6500 5,Rhinoceros ,2100 6,\N,\N
Delete rows from the table:
farm=> DELETE FROM cool_animals WHERE weight > 1000;
Display the table:
farm=> SELECT * FROM cool_animals;
The following table is displayed:
1,Dog ,7 2,Possum ,3 3,Cat ,5 6,\N,\N
Identifying and Cleaning Up Tables
The Identifying and Cleaning Up Tables section includes the following examples:
Listing Tables that Have Not Been Cleaned Up
The following example shows how to list tables that have not been cleaned up:
farm=> SELECT t.table_name FROM sqream_catalog.delete_predicates dp
JOIN sqream_catalog.tables t
ON dp.table_id = t.table_id
GROUP BY 1;
cool_animals
1 row
Identifying Predicates for Clean-Up
The following example shows how to identify predicates for clean-up:
farm=> SELECT delete_predicate FROM sqream_catalog.delete_predicates dp
JOIN sqream_catalog.tables t
ON dp.table_id = t.table_id
WHERE t.table_name = 'cool_animals';
weight > 1000
1 row
Triggering a Clean-Up
The following example shows how to trigger a clean-up:
Run the chunk
CLEANUP_CHUNKS
command (also known asSWEEP
) to reorganize the chunks:farm=> SELECT CLEANUP_CHUNKS('public','cool_animals');
Run the
CLEANUP_EXTENTS
command (also known asVACUUM
) to delete the leftover files:farm=> SELECT CLEANUP_EXTENTS('public','cool_animals');
Display the table:
farm=> SELECT delete_predicate FROM sqream_catalog.delete_predicates dp JOIN sqream_catalog.tables t ON dp.table_id = t.table_id WHERE t.table_name = 'cool_animals';
Best Practices
This section includes the best practices when deleting rows:
Run
CLEANUP_CHUNKS
andCLEANUP_EXTENTS
after running largeDELETE
operations.When you delete large segments of data from very large tables, consider running a
CREATE TABLE AS
operation instead, renaming, and dropping the original table.Avoid killing
CLEANUP_EXTENTS
operations in progress.SQream is optimized for time-based data, which is data naturally ordered according to date or timestamp. Deleting rows based on such columns leads to increased performance.