Configuration parameters

This chapter shows how to configure your Raijin installation.

raijin.conf

By default, the system reads its configuration from the /opt/raijin/conf/raijin.conf file, which contains several configuration groups.

Logging settings

PanicMode

Specifies the way of handling critical errors. The available values are as follows:

  • HARD logs an error and stops the server

  • SOFT logs and throws an error

  • NONE logs and continues execution

LogFile

Specifies the file to write Raijin logs to.

LogToSTDERR

Specifies if the server should duplicate error messages to STDERR.

In daemon mode, the server has no controlling terminal, so the parameter is ignored.

LogQueries

Specifies whether to log queries on the INFO level. By default, queries are logged on the DEBUG level.

LogFileMaxSize

The maximum size of a single log file. The minimum value is 1 Mb and the default value is 1800 Mb. When a log file exceeds LogFileMaxSize, a new file is created.

LogLevels

Defines severity levels for various subsystems. The available values are DEBUG, INFO, WARNING, ERROR, CRITICAL.

These log levels are applicable to the following subsystems:

  • CommonLogLevel

  • StorageLogLevel

  • ParserLogLevel

  • OptimizerLogLevel

  • ExecutorLogLevel

  • AccessLogLevel

  • ServerLogLevel

  • GlobalLogLevel

If a subsystem-specific XXXLogLevel parameter is set, that subsystem uses its own value. Otherwise, it uses the value of GlobalLogLevel.
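The fallback rule can be illustrated with a minimal Python sketch. The dictionary layout and the effective_log_level helper are hypothetical and exist only for illustration; only the parameter names come from this chapter.

```python
# Hypothetical model of the fallback rule; not part of Raijin itself.
SUBSYSTEM_PARAMS = (
    "CommonLogLevel", "StorageLogLevel", "ParserLogLevel",
    "OptimizerLogLevel", "ExecutorLogLevel", "AccessLogLevel",
    "ServerLogLevel",
)


def effective_log_level(config, param):
    """Return the subsystem's own level if set, else GlobalLogLevel."""
    if param not in SUBSYSTEM_PARAMS:
        raise KeyError(f"unknown subsystem parameter: {param}")
    if param in config:
        return config[param]
    return config["GlobalLogLevel"]


config = {"GlobalLogLevel": "WARNING", "ParserLogLevel": "DEBUG"}
parser_level = effective_log_level(config, "ParserLogLevel")    # own value
storage_level = effective_log_level(config, "StorageLogLevel")  # fallback
```

In this sketch, the parser logs at DEBUG because its own parameter is set, while the storage subsystem inherits WARNING from GlobalLogLevel.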

Networking settings

ListenAddr

The address the server accepts connections on.

Port

The port the server accepts connections on.

WebDir

The location of the web front-end, which is available at http://ListenAddr:Port.

Storage settings

MemTableSize

Specifies the threshold for the memory table which stores new records. When the threshold is exceeded, the records are dumped to a file on disc. The size is measured in memory units.

DataFileSize

Specifies the threshold for the file containing table records. When the threshold is exceeded, a new file is created to store the new records. The size is measured in memory units.

ChunkSize

When storing column data on disc, the data is divided into several chunks. This parameter specifies the maximum number of items that a chunk can hold. Values less than or equal to 4096 are recommended. Values greater than 4096 can negatively affect server performance.
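As a rough illustration of what a per-chunk item limit means, the following Python sketch splits a column into chunks of at most ChunkSize items. The split_into_chunks helper is hypothetical; how Raijin actually lays out chunks on disc is not described here.

```python
def split_into_chunks(items, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


column = list(range(10_000))              # 10,000 column values
chunks = split_into_chunks(column, 4096)  # the recommended upper bound
chunk_lengths = [len(chunk) for chunk in chunks]
```

With 10,000 values and a limit of 4096, the column is stored as two full chunks and one partial chunk.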

DescriptorFileCacheMaxSize

The maximum size in bytes of the descriptor file cache. This specifies the overall maximum; the cache contains a number of shards, and the maximum for each shard is calculated by dividing the DescriptorFileCacheMaxSize value by the number of shards. The number of shards is equal to the number of hardware threads that the system supports.
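The per-shard calculation can be sketched in Python as follows. The per_shard_max_size helper is hypothetical, os.cpu_count() is used only as an approximation of the number of hardware threads, and rounding down with integer division is an assumption.

```python
import os


def per_shard_max_size(cache_max_size_bytes, shard_count=None):
    """Divide the overall cache limit evenly between the shards."""
    if shard_count is None:
        # Assumption: one shard per logical CPU reported by the OS.
        shard_count = os.cpu_count() or 1
    return cache_max_size_bytes // shard_count


# Example: a 64 MiB overall limit on a machine with 8 hardware threads.
shard_limit = per_shard_max_size(64 * 1024 * 1024, shard_count=8)
```

In this example each shard may hold up to 8 MiB, so a machine with more hardware threads gets smaller shards for the same overall limit.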

DataDir

The root data directory where Raijin stores databases. By default, it is set to /opt/raijin/data/.

Sorting and grouping settings

SortMemoryLimit

The maximum amount of memory to be used by a single ORDER BY or GROUP BY operation.

TmpDir

The directory to store intermediate results. If not set, the system temporary directory is used.

If TmpDir is not specified manually, any files unexpectedly left behind in the directory are removed at each server start.

Conversion settings

IndexTypeConversion and HeapTypeConversion

Force data type conversion for index columns and for heap columns (if the schema is known in advance), respectively.

IndexTypeConversionRaiseError and HeapTypeConversionRaiseError

Defines what happens if the given value cannot be converted to the expected data type. If set to TRUE, an error is thrown. If set to FALSE, the conversion failure is ignored.

JsonAutoConvertType

If set to TRUE, JSON-formatted values are converted to primitives.

TransformNullEquals

If set to TRUE, comparison operators (=, !=, <, >) work for NULL values as well.

Floating point representation settings

FloatPrecision

Sets the number of digits printed to the right of the decimal point for float/double values. The default value is 6. The maximum value is 17 for DOUBLE and 9 for FLOAT. At server start, this value is checked against the DOUBLE maximum only, but using a value greater than 9 with FLOAT aborts query execution with a runtime error.
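The limits can be illustrated with a small Python sketch. The format_value helper is hypothetical, and the use of fixed-point notation is an assumption about how the digits are printed.

```python
DOUBLE_MAX_PRECISION = 17
FLOAT_MAX_PRECISION = 9


def format_value(value, precision=6, type_name="DOUBLE"):
    """Print a value with a fixed number of digits after the decimal point."""
    limit = FLOAT_MAX_PRECISION if type_name == "FLOAT" else DOUBLE_MAX_PRECISION
    if precision > limit:
        # Mirrors the runtime error described above for FLOAT values.
        raise ValueError(f"precision {precision} exceeds {limit} for {type_name}")
    return f"{value:.{precision}f}"


default_text = format_value(3.14159265358979)             # 6 digits
short_text = format_value(3.14159265358979, precision=2)  # 2 digits
```

Here a precision of 12 is accepted for DOUBLE but rejected for FLOAT, which corresponds to the case that passes the startup check and fails at query time.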

Execution settings

PidFile

The location of the pid file used to stop the server in daemon mode. By default, it is set to /opt/raijin/run/.

User and Group

If not set, the current user and group are used. Otherwise, the server runs as the specified user:group and, if necessary, narrows its privileges to match.

After the default installation, the raijin user and group are created and set as the owner of all server data. The User and Group parameters are also set to raijin.

Threads and IOThreads

Specifies the number of working threads and asynchronous I/O threads respectively. If not set, the default values are deduced at runtime from the underlying hardware.

StackSize

Specifies the maximum stack size of each working thread. If not set, the system default is used.

StackOverflowLimit

The stack size limit at which the process execution will be aborted. The value of this parameter should indicate the minimum size of free memory in the stack. This free memory is needed for stack unwinding while generating the backtrace and correct termination of the process. The default value is 15Kb.

IdColumnName

Name of the internal row id column in the Raijin output. This is a session parameter.

Operators policy settings

Raijin has the following types of operator behavior:

  • NULLING checks the operator inputs for invalid values before calculating the result. Invalid values are nulled

  • QUIET performs no checks and reports no errors when invalid data is present

  • SIGNALING checks the operator inputs for invalid values before calculating the result. If an invalid value is found, the calculation is aborted with an error message
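The three behaviors can be illustrated with a minimal Python sketch built around division. Everything in it is hypothetical: apply_with_policy, raw_divide, and divisor_is_valid are illustrative names, None stands in for NULL, and the infinity or NaN result of the unchecked division is an assumption rather than documented Raijin output.

```python
import math

POLICIES = ("QUIET", "NULLING", "SIGNALING")


def raw_divide(a, b):
    """Unchecked division; inf/NaN for a zero divisor is an assumption."""
    if b == 0:
        return math.nan if a == 0 else math.copysign(math.inf, a)
    return a / b


def divisor_is_valid(a, b):
    return b != 0


def apply_with_policy(policy, raw_op, inputs_valid, *args):
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    if policy == "QUIET" or inputs_valid(*args):
        return raw_op(*args)   # QUIET skips the check entirely
    if policy == "NULLING":
        return None            # invalid input yields NULL
    raise ArithmeticError(f"invalid operator input: {args}")  # SIGNALING


quiet_result = apply_with_policy("QUIET", raw_divide, divisor_is_valid, 1.0, 0.0)
nulled_result = apply_with_policy("NULLING", raw_divide, divisor_is_valid, 1.0, 0.0)
```

With valid inputs all three policies return the same result; they differ only in how an invalid input, such as a zero divisor, is handled.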

Not all operators support policy behavior.

Below is the list of operators which support policy behavior.

OperatorPolicyGlobal

Sets a global policy. If an operator-specific policy parameter is set, the operator uses its own value. Otherwise, it uses the OperatorPolicyGlobal value. If the operator does not support the policy set in OperatorPolicyGlobal, the operator’s default is used.
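The resolution order can be sketched in Python as follows. The effective_policy helper and the table layout are hypothetical; the supported values and defaults are taken from the parameter descriptions in this section, the DEFAULT value is left out, and falling back to the operator default when OperatorPolicyGlobal is unset is an assumption.

```python
SUPPORTED = {
    "OperatorDividePolicy": {"QUIET", "NULLING", "SIGNALING"},
    "OperatorModulusPolicy": {"NULLING", "SIGNALING"},
    "OperatorLnPolicy": {"QUIET", "NULLING"},
}
DEFAULTS = {
    "OperatorDividePolicy": "QUIET",
    "OperatorModulusPolicy": "NULLING",
    "OperatorLnPolicy": "QUIET",
}


def effective_policy(config, param):
    """Own value first, then a supported global value, then the default."""
    if param in config:
        return config[param]
    global_policy = config.get("OperatorPolicyGlobal")
    if global_policy in SUPPORTED[param]:
        return global_policy
    return DEFAULTS[param]


config = {"OperatorPolicyGlobal": "SIGNALING", "OperatorModulusPolicy": "NULLING"}
divide_policy = effective_policy(config, "OperatorDividePolicy")
modulus_policy = effective_policy(config, "OperatorModulusPolicy")
ln_policy = effective_policy(config, "OperatorLnPolicy")
```

In this example division inherits SIGNALING from the global setting, modulus keeps its own NULLING value, and Ln falls back to its QUIET default because it does not support SIGNALING.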

OperatorDividePolicy

Division operator (a / b) policy. The supported values are DEFAULT, QUIET, NULLING, and SIGNALING. The default value is QUIET.

OperatorModulusPolicy

Modulus operator (a % b; mod(a, b)) policy. The supported values are DEFAULT, NULLING, and SIGNALING. The default value is NULLING.

OperatorLnPolicy

Ln operator (Ln(a); Log(a)) policy. The supported values are DEFAULT, QUIET, and NULLING. The default value is QUIET.

OperatorLogPolicy

Log operator (Log(a, b)) policy. The supported values are DEFAULT, QUIET, and NULLING. The default value is QUIET.

OperatorLog10Policy

Log10 operator (Log10(a)) policy. The supported values are DEFAULT, QUIET, and NULLING. The default value is QUIET.

OperatorLog2Policy

Log2 operator (Log2(a)) policy. The supported values are DEFAULT, QUIET, and NULLING. The default value is QUIET.

OperatorSqrtPolicy

Sqrt operator (Sqrt(a)) policy. The supported values are DEFAULT, QUIET, NULLING, and SIGNALING. The default value is QUIET.

OperatorPowPolicy

Power operator (pow(a, b); power(a, b)) policy. The supported values are DEFAULT, QUIET, NULLING, and SIGNALING. The default value is QUIET.

Optimizer cost constants

The cost variables described in this section are measured on an arbitrary scale. Only their relative values matter, so scaling them all up or down by the same factor does not change the optimizer's choices. By default, these cost variables are based on the cost of sequential row group fetches; that is, OptimizerSeqRowGroupsCost is conventionally set to 1.0 and the other cost variables are set with reference to that. However, you can use a different scale if you prefer, such as actual execution times in milliseconds on a particular machine.

Unfortunately, there is no well-defined method for determining ideal values for the cost variables. They are best treated as averages over the entire mix of queries that a particular installation will receive. This means that changing them on the basis of just a few experiments is very risky.
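The claim that only relative values matter can be checked with a toy Python model. The plan_cost formula and the two candidate plans are invented for illustration and are not Raijin's actual cost model; only the parameter names and default values come from this section.

```python
def plan_cost(row_groups, rows, operators_per_row, costs):
    """Toy linear cost model (an assumption, not Raijin's formula)."""
    return (row_groups * costs["OptimizerSeqRowGroupsCost"]
            + rows * costs["OptimizerCpuTupleCost"]
            + rows * operators_per_row * costs["OptimizerCpuOperatorCost"])


def cheapest(plans, costs):
    """Return the name of the plan with the lowest estimated cost."""
    return min(plans, key=lambda name: plan_cost(*plans[name], costs))


defaults = {
    "OptimizerSeqRowGroupsCost": 1.0,
    "OptimizerCpuOperatorCost": 0.0025,
    "OptimizerCpuTupleCost": 0.01,
}
# The same constants expressed on a scale 250 times larger.
scaled = {name: value * 250.0 for name, value in defaults.items()}

# Hypothetical plans: (row groups fetched, rows processed, operators per row).
plans = {"plan_a": (100, 50_000, 2), "plan_b": (400, 10_000, 1)}
choice_default = cheapest(plans, defaults)
choice_scaled = cheapest(plans, scaled)
```

Both sets of constants select the same plan, because multiplying every cost by the same factor multiplies every plan estimate by that factor and leaves their order unchanged.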

OptimizerSeqRowGroupsCost

Sets the optimizer’s estimate of the cost of a table row group fetch that is part of a series of sequential fetches. The default value is 1.0.

OptimizerCpuOperatorCost

Sets the optimizer’s estimate of the cost of processing each operator or function executed during a query. The default value is 0.0025.

OptimizerCpuTupleCost

Sets the optimizer’s estimate of the cost of processing each row during a query. The default value is 0.01.

Other optimizer options

OptimizerConstraintExclusion

Controls the query optimizer’s use of table constraints to optimize queries. By default, it is set to TRUE.

When this parameter allows it for a particular table, the optimizer compares query conditions with the table CHECK constraints, and omits scanning tables for which the conditions contradict the constraints.

The optimizer also checks for self-contradictory query conditions.

For example:

CREATE TABLE tbl (a bool, ...);
...
SELECT * FROM tbl WHERE a AND not a;

Here, the a AND not a expression is self-contradictory and fails the WHERE condition for every row of the tbl table. With constraint exclusion enabled, this SELECT does not scan tbl at all, which improves performance.

Turning this option on can add planning overhead that may be noticeable even on simple queries. If this causes problems, consider turning the option off entirely.

OrderbyWithLimitOptimization

Enables or disables optimization of queries that combine ORDER BY with LIMIT:

SELECT * FROM t1 ORDER BY <field list> LIMIT n [OFFSET m]

The default value is TRUE.

Without this optimization, the whole dataset is sorted first, which involves caching parts of the data on disc. The optimized algorithm works entirely in memory and needs only enough memory to hold LIMIT+OFFSET rows, with no on-disc cache. This significantly improves speed, even compared with cases where the unoptimized algorithm does not need the disc cache.

The behavior of this optimization also depends on the SortMemoryLimit parameter. If the LIMIT+OFFSET sum is large, there may be insufficient memory to perform the ORDER BY operation. In this case, increase the value of the SortMemoryLimit parameter or turn this optimization off.
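The general idea of keeping only LIMIT+OFFSET rows in memory can be sketched in Python with a bounded heap selection. This illustrates the technique in general and is not Raijin's implementation; the order_by_limit helper and the sample rows are hypothetical.

```python
import heapq


def order_by_limit(rows, key, limit, offset=0):
    """Return rows[offset:offset + limit] in sorted order without a full sort."""
    # nsmallest keeps at most limit + offset rows in memory at a time.
    top = heapq.nsmallest(limit + offset, rows, key=key)
    return top[offset:]


def sort_key(row):
    return (row["score"], row["id"])


rows = [{"id": i, "score": (i * 37) % 101} for i in range(1000)]
page = order_by_limit(rows, sort_key, limit=3, offset=2)
# Reference result from sorting the whole dataset first.
expected = sorted(rows, key=sort_key)[2:5]
```

Both approaches return the same rows, but the bounded selection never holds more than LIMIT+OFFSET candidates, which is why a large LIMIT+OFFSET sum can still run into a memory limit.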

Parser settings

BackslashEscaping

Enables or disables backslash (C-style) escaping in string constants and delimited identifiers. By default, it is set to TRUE.

For syntax details, see the Backslash (C-style) escaping section.

Settable configuration settings

Several parameters can be specified via the SET command. The changes apply only to the current session (connection) and are not visible to other users. Currently, only the Conversion settings, IdColumnName, and FloatPrecision are supported.

Configuring via command line

The -c (--config) command-line option lets you specify a custom configuration file location.

The -p (--parameter) option lets you specify any configuration parameter directly on the command line. For example:

raijin-server -p GlobalLogLevel=WARNING -p Port=2501

This command starts the server on port 2501 and logs only messages with severity WARNING or higher.