Configuration

There are two ways to configure TypeDB:

Command line arguments

Use command line arguments with the command typedb server to configure the server upon its launch. The command line arguments will override any relevant settings from the config file.

All arguments must:

  • start with the double dash prefix --,

  • be separated from their value (if any) by:

    • equals sign (=),

    • or whitespace.

Some arguments are exclusive to the command line interface:

  • --help: print out the help menu and exit.

  • --version: print out the version of the server and exit.

  • --config /path/to/external/typedb-config.yml: use the specified configuration file.

Additional arguments are derived from the configuration .yaml file. Any value from the configuration file can be overridden by a command line argument, as shown in the Configuration file options via command line arguments section.

Getting help

For all available commands via the command line use the --help argument to get a reference:

typedb server --help

This will list out how to use specific command line arguments as well as options derived from the configuration file.

Configuration file

TypeDB accepts a configuration (or config) file with a specific YAML format. See the sample configuration below.

This file usually has four top level sections:

For any change in the configuration file to take effect, restart the TypeDB server.

The default location of the config file

TypeDB ships with a default configuration file. The location of this file varies based on how TypeDB has been installed.

If downloaded manually, find the configuration file in the server/conf directory inside the unzipped folder.

If installed using Homebrew, check the following directory (replace the {version-number} placeholder with the exact version installed):

/usr/local/Cellar/typedb/{version-number}/libexec/server/conf/config.yml

If installed using APT:

/opt/typedb/core/server/conf/config.yml

Sample configuration

Here is a sample configuration file for TypeDB:

server:
  address: 0.0.0.0:1729
storage:
  data: server/data
  database-cache:
    # configure storage-layer data and index cache per database
    # it is recommended to keep these at equal sizes
    data: 500mb
    index: 500mb
log:
  output:
    stdout:   # note: this is a user-defined name
      type: stdout
    file:     # note: this is a user-defined name
      type: file
      base-dir: server/logs
      file-size-limit: 50mb
      archive-grouping: month
      archive-age-limit: 1 year
      archives-size-limit: 1gb
  logger:
    default:  # note: the default logger must be defined
      level: warn
      output: [ stdout, file ]
    typedb:   # note: this is a user-defined name
      filter: com.vaticle.typedb.core
      level: info
      output: [ stdout, file ]
    storage:
      filter: com.vaticle.typedb.core.database
      level: info
      output: [ stdout, file ]
  debugger:
    reasoner: # note: this is a user-defined name
      enable: false
      type: reasoner-tracer
      output: file
vaticle-factory:
  enable: false

Server

The server section of the configuration file contains network and RPC options.

  • address: the address to listen for TypeDB Clients connections via gRPC.

    Use the 0.0.0.0 IP address to listen to all connections and localhost for connections from the local machine.

Storage

The storage section of the configuration file contains the storage layer options.

  • data: the location to write user data to. Defaults to within the server distribution under server/data.

  • database-cache: per-database configuration for storage-level caching

    • data: cache for often-used data.

    • index: cache for data indexes.

For production use, it is recommended that the server.data is set to a path outside of the $TYPEDB_HOME (directory with TypeDB server files). This helps to make the process of upgrading TypeDB easier.

If the index cache is too small relative to the dataset, we may find severely degraded performance. We recommend allocating at least 2% of a database size equivalent to the index cache. For example, with 100 GB of on-disk data in a database, allocate at least 2 GB of index cache. Allocating more can improve performance.

Additionally, we recommend the sum of data and index caches equal to about 20% of the total memory of the server.

Log

The log section of the configuration file contains the logging options.

There are three subsections:

Output

output subsection defines destinations to write logs to.

  • User-defined output channel name

    • type — it’s either file or stdout.

    • base-dir (directory prior to version 2.21.0) — filepath, relative to the server binary. Only available for type: file.

    • file-size-limit (file-size-cap prior to version 2.21.0)  — maximum size of a log file. If the log file reaches the limit, a new file in the same directory will be started. This is similar to the maxsize config option in logrotate. Only available for type: file.

    • archive-grouping — configures the rollover and naming policy of archives produced by the logger.
      Possible value variants are as follows:

      • minute or minutes

      • hour or hours

      • day or days

      • week or weeks

      • month or months

      • year or years

  • archive-age-limit — configures how long archive files are kept.
    Old archives are only deleted when new ones are produced.
    If the value is set to zero, then the age is unlimited (old logs are not deleted). Otherwise, the value should be a positive integer, followed by a whitespace and one of the following values for units:

    • minute or minutes

    • hour or hours

    • day or days

    • week or weeks

    • month or months

    • year or years

  • archives-size-limit (archives-size-cap prior to version 2.21.0) — maximum size of all log files. If the total size of all log files in the directory reaches the limit, the oldest one gets removed. Only available for type: file.
    If the value is set to zero, then the total size is unlimited (older logs are not deleted to preserve total size limitation of the log archives). Otherwise, the value should be a positive integer, followed by one of the following values for units:

    • kb

    • mb

    • gb

Logger

logger subsection configures logging for modules in TypeDB, along with a log level and output targets (referencing outputs by name defined under the outputs section).

  • output — destination of the log output. Input format is a list of output channels, each of which must be defined in the output subsection.

  • level — verbosity level.
    One of the following values can be used:

    • warn

    • info

    • debug

On debug level the server will periodically log database storage properties.

Debugger

debugger subsection configures TypeDB-specific debuggers.

Possible values for the type field are the following:

  • reasoner-tracer

  • reasoner-perf-counters

Configuration file options via command line arguments

Use command line arguments to override any option in the configuration file.

For example, the configuration file sets the server address as the following:

server:
  address: 0.0.0.0:1729

If we want to use port 1730 instead of 1729, we can either update the configuration file or override it from the command line using the following command:

typedb server --server.address 0.0.0.0:1730

Use the same approach to set a completely new section of the configuration that isn’t present in the file yet. For example, to define a new logger subsection to print out all query plans, we could do the following to set the package com.vaticle.typedb.core.traversal to output on a more verbose level:

typedb server  \
  --server.address 0.0.0.0:1730  \
  --log.logger.traversal.filter com.vaticle.typedb.core.traversal  \
  --log.logger.traversal.level debug \
  --log.logger.traversal.output "[ file, stdout ]"

Cluster configuration

Every server in a cluster has its own config file that contains a list of known servers in the cluster. A server in a cluster will not accept connections from servers that are not on the list.

Changes to the server configuration require a server restart to take effect.

Host machine requirements

The minimum host machine configuration for running a single TypeDB database is 4 (v)CPUs, 10 GB memory, with SSD.

The recommended starting configuration is 8 (v)CPUs, 16 GB memory, and SSD. Bulk loading is scaled effectively by adding more CPU cores.

The following is the breakdown of TypeDB memory requirements:

  • the JVM memory: is configurable when booting the server with JAVAOPTS="-Xmx4g" typedb server. This gives the JVM 4 GB of memory. Defaults to 25% of system memory on most machines.

  • storage-layer baseline consumption: approximately 2 GB.

  • storage-layer caches: this is about 2x cache size per database. If the data and index caches sum up to 1 GB, the memory requirement is 2 GB in working memory.

  • memory per CPU: approximately 0.5 GB additional per (v)CPU under full load.

We can estimate the amount of memory the server will need to run a single database with the following equation:

required memory = JVM memory + 2 GB + (2 × configured db-caches in GB) + (0.5 GB × CPUs)

For example, on a 4 CPU machine, with the default 1 GB of per-database storage caches, and the JVM using 4 GB of RAM, the default requirement for memory would be: 4 GB + 2 GB + (2 × 1 GB) + (0.5 GB × 4) = 10 GB.

Each additional database will consume an additional amount at least equal to the cache requirements (in this example, an additional 2 GB of memory for each database).

Open file limit

To support large data volumes, it is important to check the open file limit the operating system imposes. Some Unix distributions default to 1024 open file descriptors. This can be checked with the following command:

ulimit -n

We recommend this is increased to at least 50 000.