Table partitioning
Table partitioning is the technique of dividing table data into distinct chunks known as partitions. When the database engine only needs to operate on a small portion of the data, it significantly increases the overall performance, making querying, inserting, and deleting data more efficient.
Managing partitions
Every table contains a default partition named _default
where data not belonging to a user-defined partition is stored.
The _default
partition is created when data is first inserted into a table.
When the SHOW PARTITIONS
command is used to display partitions the default
partition is displayed with an empty name field.
_default
partition{
"name": "",
"state": "attached",
"record_count": 3,
"path": "testdatabase/testtable/_default",
"size": 80
}
You can create custom table partitions by specifying the partition name manually or using SQL expressions. In addition, you can list, delete, and take partitions offline. See the SHOW PARTITIONS and ALTER TABLE commands.
When changing a partition expression, a new default partition is set to be created with the same _default
name but in a different directory.
As mentioned above, the new default partition is only created when data is inserted into it for the first time.
Partition data is stored in its parent table’s directory, as shown in the image below.
In addition to the partition data directory, two special folders exist; the _default
folder contains the unpartitioned data, and the _detached
folder contains partitions that have been detached.

User-defined partition folders are named using the result of the partition expression.
You can rename partition folders when Raijin is stopped or if a partition is detached.
Once a partition is detached, you can remove its folder from the _detached
folder and copy it back if you need to attach it again, such as for archiving and unarchiving data.
It is important to remember that a partition belongs to the table it was created for and cannot be attached to a different table.
Partition names are not unique. However, partition folder names are. For example, it is possible to detach a partition "x" and then insert data into a partition named "x". A new partition, "x," will be created in this case. If you later decide to attach the original partition "x" again, the table will have two partitions named "x" with different folder names.
Do not move or rename partition folders outside the _detached folder while Raijin is running.
Doing so will render the database inconsistent and cause irreparable damage.
We strongly advise always using the ALTER TABLE command to attach, detach, and delete partitions to avoid data loss.
|
Raijin includes a directory keyword feature. When using the ATTACH
DETACH
and DROP
statements it is enough to specify the name of the directory, rather than the full path.
ALTER TABLE testtable DETACH PARTITION DIRECTORY "partition_data";
Manual partitioning
When inserting data in a table, you can manually create partitions by specifying the partition name.
CREATE TABLE tbl();
INSERT INTO tbl(event) VALUES ('a') WITH (partition = 'foo');
INSERT INTO tbl(event) VALUES ('b') WITH (partition = 'bar');
INSERT INTO tbl(event) VALUES ('c') WITH (partition = 'foo');
The record is stored in the root partition if you don’t specify a partition name.
INSERT INTO tbl(event) VALUES ('x');
SELECT event FROM tbl;
{"event":"a"}
{"event":"b"}
{"event":"c"}
{"event":"x"}
Using the DETACH PARTITION
clause of the ALTER TABLE
command takes the partition data offline, making it unavailable for querying.
ALTER TABLE tbl DETACH PARTITION "bar";
SELECT event FROM tbl;
{"event":"a"}
{"event":"c"}
{"event":"x"}
Use the ATTACH PARTITION
clause to restore a detached partition.
ALTER TABLE tbl ATTACH PARTITION "bar";
ALTER TABLE tbl DETACH PARTITION "foo";
SELECT event FROM tbl;
{"event":"b"}
{"event":"x"}
If you detach all custom partitions, only the data in the default partition remains accessible.
ALTER TABLE tbl DETACH PARTITION "foo";
ALTER TABLE tbl DETACH PARTITION "bar";
SELECT event FROM tbl;
{"event":"x"}
Expression-based partitioning
Expression-based partitioning is a convenient way to partition tables dynamically.
When inserting data, an expression specified with the PARTITION BY
clause of the INSERT
command determines the partition name.
For example, a common practice is partitioning tables by time, such as the day or month a record is received or generated in the case of log records.
However, you can use any combination of fields in your expression to define partition names.
This example uses the DATE function to extract the date from the EventTime
field, which contains DATETIME values.
The PARTITION BY
clause specifies the DATE(EventTime)
expression on table creation.
As a result, the date portion of the EventTime
field determines the partition where each record will be stored.
CREATE TABLE tbl () PARTITION BY DATE(EventTime);
INSERT INTO tbl(event, EventTime) VALUES('a', '2023-02-04 17:14:04'); (1)
INSERT INTO tbl(event, EventTime) VALUES('b', '2023-02-04 17:14:10'); (2)
INSERT INTO tbl(event, EventTime) VALUES('c', '2023-02-05 17:14:04'); (3)
1 | Dynamically assigned to partition 2023-02-04 |
2 | Dynamically assigned to partition 2023-02-04 |
3 | Dynamically assigned to partition 2023-02-05 |
SELECT event, EventTime FROM tbl;
{"event":"a","eventtime":"2023-02-04 17:14:04"}
{"event":"b","eventtime":"2023-02-04 17:14:10"}
{"event":"c","eventtime":"2023-02-05 17:14:04"}
ALTER TABLE tbl DETACH PARTITION "2023-02-04";
SELECT event, EventTime FROM tbl;
{"event":"c","eventtime":"2023-02-05 17:14:04"}
ALTER TABLE tbl ATTACH PARTITION "2023-02-04";
ALTER TABLE tbl DETACH PARTITION "2023-02-05";
SELECT event, EventTime FROM tbl;
{"event":"a","eventtime":"2023-02-04 17:14:04"}
{"event":"b","eventtime":"2023-02-04 17:14:10"}