Table partitioning

Table partitioning is the technique of dividing table data into distinct chunks known as partitions. When the database engine only needs to operate on a small portion of the data, it significantly increases the overall performance, making querying, inserting, and deleting data more efficient.

Managing partitions

Every table contains a default partition named _default where data not belonging to a user-defined partition is stored. The _default partition is created when data is first inserted into a table. When the SHOW PARTITIONS command is used to display partitions the default partition is displayed with an empty name field.

Result when showing the _default partition
{
  "name": "",
  "state": "attached",
  "record_count": 3,
  "path": "testdatabase/testtable/_default",
  "size": 80
}

You can create custom table partitions by specifying the partition name manually or using SQL expressions. In addition, you can list, delete, and take partitions offline. See the SHOW PARTITIONS and ALTER TABLE commands.

When changing a partition expression, a new default partition is set to be created with the same _default name but in a different directory. As mentioned above, the new default partition is only created when data is inserted into it for the first time.

Partition data is stored in its parent table’s directory, as shown in the image below. In addition to the partition data directory, two special folders exist; the _default folder contains the unpartitioned data, and the _detached folder contains partitions that have been detached.

Raijin table data folder structure
Figure 1. Table partitions structure on disk

User-defined partition folders are named using the result of the partition expression. You can rename partition folders when Raijin is stopped or if a partition is detached. Once a partition is detached, you can remove its folder from the _detached folder and copy it back if you need to attach it again, such as for archiving and unarchiving data. It is important to remember that a partition belongs to the table it was created for and cannot be attached to a different table.

Partition names are not unique. However, partition folder names are. For example, it is possible to detach a partition "x" and then insert data into a partition named "x". A new partition, "x," will be created in this case. If you later decide to attach the original partition "x" again, the table will have two partitions named "x" with different folder names.

Do not move or rename partition folders outside the _detached folder while Raijin is running. Doing so will render the database inconsistent and cause irreparable damage. We strongly advise always using the ALTER TABLE command to attach, detach, and delete partitions to avoid data loss.

Raijin includes a directory keyword feature. When using the ATTACH DETACH and DROP statements it is enough to specify the name of the directory, rather than the full path.

Detaching a partition using a directory keyword
ALTER TABLE testtable DETACH PARTITION DIRECTORY "partition_data";

Manual partitioning

When inserting data in a table, you can manually create partitions by specifying the partition name.

Example 1. Defining table partitions manually
Insert data into two partitions, "foo" and "bar"
CREATE TABLE tbl();
INSERT INTO tbl(event) VALUES ('a') WITH (partition = 'foo');
INSERT INTO tbl(event) VALUES ('b') WITH (partition = 'bar');
INSERT INTO tbl(event) VALUES ('c') WITH (partition = 'foo');

The record is stored in the root partition if you don’t specify a partition name.

Insert data into the root partition
INSERT INTO tbl(event) VALUES ('x');
Retrieve data from all partitions
SELECT event FROM tbl;
Result when all partitions are attached
{"event":"a"}
{"event":"b"}
{"event":"c"}
{"event":"x"}

Using the DETACH PARTITION clause of the ALTER TABLE command takes the partition data offline, making it unavailable for querying.

Detach the "bar" partition and retrieve data again
ALTER TABLE tbl DETACH PARTITION "bar";
SELECT event FROM tbl;
Result when "bar" is detached
{"event":"a"}
{"event":"c"}
{"event":"x"}

Use the ATTACH PARTITION clause to restore a detached partition.

Attach the "bar" partition and detach the "foo" partition
ALTER TABLE tbl ATTACH PARTITION "bar";
ALTER TABLE tbl DETACH PARTITION "foo";
SELECT event FROM tbl;
Result when "foo" is detached and "bar" is attached
{"event":"b"}
{"event":"x"}

If you detach all custom partitions, only the data in the default partition remains accessible.

Detach custom partitions and retrieve data
ALTER TABLE tbl DETACH PARTITION "foo";
ALTER TABLE tbl DETACH PARTITION "bar";
SELECT event FROM tbl;
Result when "foo" and "bar" are detached
{"event":"x"}

Expression-based partitioning

Expression-based partitioning is a convenient way to partition tables dynamically. When inserting data, an expression specified with the PARTITION BY clause of the INSERT command determines the partition name. For example, a common practice is partitioning tables by time, such as the day or month a record is received or generated in the case of log records. However, you can use any combination of fields in your expression to define partition names.

Example 2. Creating a dynamically partitioned table by date

This example uses the DATE function to extract the date from the EventTime field, which contains DATETIME values.

The PARTITION BY clause specifies the DATE(EventTime) expression on table creation. As a result, the date portion of the EventTime field determines the partition where each record will be stored.

Create a table with date-based partitioning
CREATE TABLE tbl () PARTITION BY DATE(EventTime);
INSERT INTO tbl(event, EventTime) VALUES('a', '2023-02-04 17:14:04'); (1)
INSERT INTO tbl(event, EventTime) VALUES('b', '2023-02-04 17:14:10'); (2)
INSERT INTO tbl(event, EventTime) VALUES('c', '2023-02-05 17:14:04'); (3)
1 Dynamically assigned to partition 2023-02-04
2 Dynamically assigned to partition 2023-02-04
3 Dynamically assigned to partition 2023-02-05
Query all data from all attached partitions
SELECT event, EventTime FROM tbl;
Result when all partitions are attached
{"event":"a","eventtime":"2023-02-04 17:14:04"}
{"event":"b","eventtime":"2023-02-04 17:14:10"}
{"event":"c","eventtime":"2023-02-05 17:14:04"}
Detach partition 2023-02-04 and retrieve all data
ALTER TABLE tbl DETACH PARTITION "2023-02-04";
SELECT event, EventTime FROM tbl;
Result when partition 2023-02-04 is detached
{"event":"c","eventtime":"2023-02-05 17:14:04"}
Attach partition 2023-02-04 and detach partition 2023-02-05
ALTER TABLE tbl ATTACH PARTITION "2023-02-04";
ALTER TABLE tbl DETACH PARTITION "2023-02-05";
SELECT event, EventTime FROM tbl;
Result when partition 2023-02-05 is detached and partition 2023-02-04 is attached
{"event":"a","eventtime":"2023-02-04 17:14:04"}
{"event":"b","eventtime":"2023-02-04 17:14:10"}