Table partitioning
Table partitioning is the technique of dividing table data into distinct chunks known as partitions. Partitions allow the database engine to operate on smaller portions of the data, significantly increasing the overall performance and making querying, inserting, and deleting data more efficient.
Raijin supports three types of table partitioning:
-
Manual partitions, where you define the partition of each record in the INSERT command.
-
Multidimensional partitions, where you define the partition field(s) when you create the table.
-
Finally, since partitioning is optional, you can have tables without partitions.
Manual partitioning
Tables with manual partitions give you complete control over partition names and their content. On the other hand, manual partitioning requires explicitly deciding the partition, if any, incoming data belongs to on insertion.
To add data to a partition, you must define the partition using the PARTITION BY
option of the INSERT command.
If a partition with that name already exists, the data is added to it.
A new partition is created if there is no partition with that name.
Data inserted without a partition will be stored in the default partition. You can see this partition with an empty name field when you list table partitions with the SHOW PARTITIONS command.
CREATE TABLE logs (
"Source" STRING,
"Hostname" STRING
);
INSERT INTO logs VALUES ('a','host1') WITH (partition = 'foo');
INSERT INTO logs VALUES ('b','host2') WITH (partition = 'bar');
INSERT INTO logs VALUES ('c','host1') WITH (partition = 'foo');
INSERT INTO logs VALUES ('d','host2');
SELECT * FROM logs;
SHOW PARTITIONS IN logs;
This example results in a table with four records.
Two of the records will be in a partition named foo
, one record in a partition named bar
, and one record that is not in any partition.
{"_id":1,"Source":"a","Hostname":"host1"}
{"_id":3,"Source":"c","Hostname":"host1"}
{"_id":2,"Source":"b","Hostname":"host2"}
{"_id":4,"Source":"d","Hostname":"host2"}
{}
{"name":"foo","state":"attached","record_count":2,"path":"testdb/logs/foo","size":83}
{"name":"bar","state":"attached","record_count":1,"path":"testdb/logs/bar","size":83}
{"name":"","state":"attached","record_count":1,"path":"testdb/logs/_default","size":78}
Multidimensional partitioning
Multidimensional partitions are based on one or more table fields (partition dimensions) you define when creating the table. Partitions are created when records are inserted in the table if none of the existing partitions match the new data.
CREATE TABLE logs (
"EventTime" TIMESTAMP,
"Source" STRING,
"Message" STRING,
"Hostname" STRING
)
PARTITION BY Source, Hostname;
The example above creates a table partitioned by combinations of the Source
and Hostname
fields.
The partitions enable Raijin to operate on a subset of the data segmented by the intersection of those values whenever possible.
This partition setup optimizes processing SELECT queries referencing a specific source, group of sources, or any combination of source and hostname since Raijin only needs to operate on partitions containing a matching source.
For example, these partitions make purging data for a phased-out Hostname
easier.
Results do not have a default order when retrieving data from partitioned tables.
If you require ordered results, use the |
Selecting your partitions
Database performance depends on partitioning tables by fields that suit your needs. Your partitions should not be too broad or too granular. Splitting a table into very big or too many partitions will not yield any performance benefits. In such cases, Raijin may have to scan large chunks of irrelevant data, hindering performance.
Additionally, ensure that you partition tables by fields you frequently use to filter data so that Raijin works on smaller datasets and can process queries faster, e.g., during a SELECT
operation.
Timestamp fields are often good candidates for defining table partitions. They are commonly used for filtering when querying or purging data. Yet, timestamps are very granular and can be up to a microsecond. Partitioning data by such a field would defeat the purpose of partitions because most - if not all - records would be in different partitions. When partitioning by timestamp fields, we recommend using the DATE_TRUNC function. This function allows you to reduce the partition by a suitable time fraction, such as the week or month.
For example, partitioning by Example of table partitioning by day
|
In conclusion, consider the following when choosing your partitioning fields:
-
Look for fields you commonly use for filtering queries to reduce the dataset Raijin works on.
-
Consider relevant fields for pruning data, such as truncated timestamps, to simplify administrative tasks, including deleting older data from your database.
-
Avoid fields with low variance, resulting in a small number of partitions, unless you use them in combination with other fields.
-
Avoid fields with very high variance, such as granular timestamps, because it will result in too many partitions (potentially a partition for each record).
This example uses the DATE_TRUNC function to extract the date from the EventTime
field, comprising TIMESTAMP values.
The PARTITION BY
clause specifies the DATE_TRUNC('day', EventTime)
expression on table creation.
As a result, the date portion of the EventTime
field determines the partition where each record will be stored.
CREATE TABLE logs (
"EventTime" TIMESTAMP,
"event" STRING
) PARTITION BY DATE_TRUNC('day', EventTime);
INSERT INTO logs(event, EventTime) VALUES('a', '2023-02-04 17:14:04');
INSERT INTO logs(event, EventTime) VALUES('b', '2023-02-04 17:14:10');
INSERT INTO logs(event, EventTime) VALUES('c', '2023-02-05 17:14:04');
SELECT * FROM logs;
SHOW PARTITIONS IN logs;
This example results in a table with three records in two partitions
{"_id":1,"event":"a","eventtime":"2023-02-04 17:14:04"}
{"_id":2,"event":"b","eventtime":"2023-02-04 17:14:10"}
{"_id":3,"event":"c","eventtime":"2023-02-05 17:14:04"}
{}
{"name":"2023-02-04T01:00:00.000000+01:00","state":"attached","record_count":2,"path":"testdb/logs/2023-02-04T01%3A00%3A00.000000+01%3A00 (DATE_TRUNC('day', EventTime))","size":208,"eventtime":"2023-02-04T01:00:00.000000+01:00"}
{"name":"2023-02-05T01:00:00.000000+01:00","state":"attached","record_count":1,"path":"testdb/logs/2023-02-05T01%3A00%3A00.000000+01%3A00 (DATE_TRUNC('day', EventTime))","size":208,"eventtime":"2023-02-05T01:00:00.000000+01:00"}
Managing partitions
You can list, delete, and take partitions offline. See the SHOW PARTITIONS and ALTER TABLE commands.
Raijin stores partition data in its parent table’s data directory. Other than folders for each partition, it might create two additional folders:
-
The
_detached
folder if there are any detached partitions. -
The
_default
folder for unpartitioned data, e.g., if data is inserted without a partition for manually partitioned tables or the table was originally created without partitions.
Detach and attach partitions
The ATTACH PARTITION
, DETACH PARTITION
, and DROP PARTITION
clauses of the ALTER TABLE command support the DIRECTORY
keyword, which allows you to specify the folder name instead of the folder path.
Once a partition is detached, you can remove its folder from the _detached
folder and copy it back if you need to attach it again, such as for archiving and unarchiving data.
ALTER TABLE tbl DETACH PARTITION DIRECTORY "partition_data";
Do not move or rename partition folders outside the _detached folder while Raijin is running.
Doing so will render the database inconsistent and cause irreparable damage.
We strongly advise always using the ALTER TABLE command to attach, detach, and delete partitions to avoid data loss.
|
Using the DETACH PARTITION
clause of the ALTER TABLE
command takes the partition data offline, making it unavailable for querying.
CREATE TABLE logs (
"event" STRING,
"hostname" STRING
) PARTITION BY "hostname";
INSERT INTO logs VALUES ('a','foo');
INSERT INTO logs VALUES ('b','bar');
INSERT INTO logs VALUES ('c','x');
ALTER TABLE logs DETACH PARTITION "bar";
SELECT * FROM logs;
{"_id":1,"event":"a","hostname":"foo"}
{"_id":3,"event":"c","hostname":"x"}
Use ATTACH PARTITION
to restore a detached partition.
It is important to remember that a partition belongs to the table it was created for and cannot be attached to a different table.
ALTER TABLE logs ATTACH PARTITION "bar";
SELECT * FROM logs;
{"_id":1,"event":"a","hostname":"foo"}
{"_id":2,"event":"b","hostname":"bar"}
{"_id":3,"event":"c","hostname":"x"}
Tables without partitions
Tables created without partitions will have all records in a single partition, stored in the _default
partition directory.
Raijin creates the partition the first time you insert data into the table.
You can see this partition when you issue the SHOW PARTITIONS command for an unpartitioned table that contains data.
{
"name": "",
"state": "attached",
"record_count": 3,
"path": "testdatabase/testtable/_default",
"size": 80
}