Backup and archiving
Raijin does not currently include a specialized tool for backing up databases. However, you can use OS-level tools to back up the Raijin Database Engine server. In addition, you can implement a data archiving solution by partitioning tables according to your requirements.
File permissions
Regardless of how you back up your databases, Raijin must be able to read and modify restored databases and partitions. Therefore, you must preserve user permissions when copying or moving files. In Unix-like operating systems, you can use the cp (copy) command with the Alternatively, use the following commands to re-assign the
|
File system snapshots
We recommend regular, full file system snapshots for large, active databases. The easiest way to implement this is by using OS-level tools to take a snapshot of the volume containing the Raijin data directory. This way, the full state of the database is saved, and in case of a catastrophic event, you can quickly restore service using the most recent snapshot.
Prerequisites:
-
During deployment, you must place the database on a volume that supports snapshots, for example, Btrfs or LVM. You can change the default Raijin data directory with the
DataDir
configuration parameter. See Storage settings for more information. -
It is important that you stop the Raijin Database Engine service before taking the snapshot to ensure data consistency.
The first step is to lock the database to prevent further write transactions. The FLUSH TABLES WITH READ LOCK command locks all tables in all databases.
$ curl -X POST http://localhost:2500 -H 'Content-type: application/json' -d '{"query": "FLUSH TABLES WITH READ LOCK;"}'
Take a snapshot of the volume where the Raijin data directory resides and release the lock.
$ curl -X POST http://localhost:2500 -H 'Content-type: application/json' -d '{"query": "UNLOCK TABLES;"}'
Finally, back up the volume snapshot to a safe location, ideally to an external storage device or service.
Partition-based archiving
Table partitioning makes it easy to implement a data archiving strategy.
Archiving table partitions
Using Raijin Database Engine’s table partitioning, you can partition a table dynamically based on the event date or any combination of columns. Dynamic partitioning enables you to archive data based on your requirements, for example, archiving events older than 60 days.
In this example, the DATE_TRUNC function extracts the date portion of EventTime
.
DATE_TRUNC(EventTime)
is supplied as the argument to PARTITION BY
when creating the table.
Therefore, the date part of the EventTime
field will determine the partition of each record.
CREATE DATABASE nxlog;
CREATE TABLE tbl() PARTITION BY DATE_TRUNC('day',EventTime);
INSERT INTO tbl(event, EventTime)
VALUES('a', '2023-04-01 17:14:04'); (1)
INSERT INTO tbl(event, EventTime)
VALUES('b', '2023-03-04 12:03:10'); (2)
INSERT INTO tbl(event, EventTime)
VALUES('c', '2023-02-16 16:51:27'); (3)
1 | Dynamically assigned to partition 2023-04-01T01:00:00.000000+01:00 |
2 | Dynamically assigned to partition 2023-03-04T01:00:00.000000+01:00 |
3 | Dynamically assigned to partition 2023-02-16T01:00:00.000000+01:00 |
SHOW PARTITIONS IN tbl;
{"name":"2023-04-01T01:00:00.000000+01:00","state":"attached","rows":"1","path":"nxlog/tbl/db3731e2c6e70b6cfa20ffa449e49fcf"}
{"name":"2023-03-04T01:00:00.000000+01:00","state":"attached","rows":"1","path":"nxlog/tbl/58f2f0116b18375577bedaf0940d3f8f"}
{"name":"2023-02-16T01:00:00.000000+01:00","state":"attached","rows":"1","path":"nxlog/tbl/7e1c88563ba94aaeafd2633c82ff4908"}
ALTER TABLE tbl DETACH PARTITION "2023-02-16T01:00:00.000000+01:00";
SHOW PARTITIONS IN tbl;
{"name":"2023-04-01T01:00:00.000000+01:00","state":"attached","rows":"1","path":"nxlog/tbl/db3731e2c6e70b6cfa20ffa449e49fcf"}
{"name":"2023-03-04T01:00:00.000000+01:00","state":"attached","rows":"1","path":"nxlog/tbl/58f2f0116b18375577bedaf0940d3f8f"}
{"name":"2023-02-16T01:00:00.000000+01:00","state":"detached","rows":"1","path":"nxlog/tbl/_detached/7e1c88563ba94aaeafd2633c82ff4908"}
After detaching the partition, you can back up and delete its data folder.
$ sudo cp -Ra /opt/raijin/data/db/nxlog/tbl/_detached/7e1c88563ba94aaeafd2633c82ff4908 /tmp/raijin
$ sudo rm -R /opt/raijin/data/db/nxlog/tbl/_detached/7e1c88563ba94aaeafd2633c82ff4908
You can automate the above sequence by a script executed daily, for example, via a cron job.
The following is an example bash script that detaches partitions created based on the event date.
The script connects to a local Raijin instance and uses database nxlog
and table tbl
.
It detaches partitions older than two months, copies the partition folder to a backup location, and deletes the source folder.
#!/bin/bash
command -v curl >/dev/null || echo "curl not installed"
command -v jq >/dev/null || echo "jq not installed"
#CUTOFF=$(date --date "30 days ago" "+%s")
CUTOFF=$(date --date "2 months ago" "+%s")
RAIJIN=0.0.0.0:2500
DB=nxlog
TBL=tbl
DATA="/opt/raijin/data/db/$DB/$TBL/_detached"
BACKUP="/tmp/raijin"
RQ=$(mktemp)
LOG=$(mktemp)
PT=$(mktemp)
cmdgen(){
echo -n '{"query":"use '"$DB"';'
curl -sS $RAIJIN --data \
'{"query":"use '"$DB"'; show partitions from '"$TBL"'"}' |\
jq -r '(.name) + "$" + (.state) + "$" + (.rows) + "$" + (.path)' |\
while IFS="$" read DATE STATE ROWS FPATH; do
export PATH=/usr/bin:/bin:$PATH
if [ $(date --date "$DATE" "+%s") -lt $CUTOFF ] && [ $STATE = 'attached' ]; then
IFS="/" read -ra arr <<< "$FPATH"
echo "${arr[-1]}" >> $PT
echo -n "alter table $TBL detach partition \\\"$DATE\\\";"
echo "Detaching partition $DATE" | tee -a $LOG | systemd-cat -t raijin.cron
fi
done
echo -n \"}
if [ ! -s $LOG ]; then
echo "Nothing to do today" | tee -a $LOG | systemd-cat -t raijin.cron
fi
}
cmdgen > $RQ
curl -sS $RAIJIN -d "@$RQ"
while read PARTITION; do
SRC="$DATA/$PARTITION"
DST="$BACKUP/$PARTITION"
echo "Copying $SRC to $DST" | tee -a $LOG | systemd-cat -t raijin.cron
mkdir -p $BACKUP && cp -Ra $SRC $DST
echo "Deleting $SRC" | tee -a $LOG | systemd-cat -t raijin.cron
rm -R $SRC
done < $PT
cat $LOG
rm $RQ $LOG $PT
This script is provided as is without warranty of any kind, either expressed or implied. Use at your own risk. |
Restoring table partitions
Suppose you need to execute queries on archived data, for example, for forensic analysis.
In that case, you can quickly restore the relevant partition(s) by copying the data back to the table’s _detached
directory and re-attaching it.
$ sudo cp -Ra /tmp/raijin/7e1c88563ba94aaeafd2633c82ff4908 /opt/raijin/data/db/nxlog/tbl/_detached
ALTER TABLE tbl ATTACH PARTITION "2023-02-16T01:00:00.000000+01:00";
SHOW PARTITIONS IN tbl;
{"name":"2023-04-01T01:00:00.000000+01:00","state":"attached","rows":"1","path":"nxlog/tbl/db3731e2c6e70b6cfa20ffa449e49fcf"}
{"name":"2023-03-04T01:00:00.000000+01:00","state":"attached","rows":"1","path":"nxlog/tbl/58f2f0116b18375577bedaf0940d3f8f"}
{"name":"2023-02-16T01:00:00.000000+01:00","state":"attached","rows":"1","path":"nxlog/tbl/7e1c88563ba94aaeafd2633c82ff4908"}