High availability
Overview
High availability features are available in TypeDB Enterprise and TypeDB Cloud only.
TypeDB Enterprise & TypeDB Cloud support the following high availability features:
- Server clustering
- Data and schema replication with primary and secondary replicas
- Primary replica failover using the Raft consensus algorithm
- Synchronous replication using the Raft consensus algorithm
- Client-side load balancing
Replication
TypeDB replication is performed at the database level: each database has its own replication group. A database can have multiple replicas, but at most one per server.
There are two types of replicas: the primary replica and secondary ones (the leader and followers, in Raft terms). A database has exactly one primary replica at any given time, and the primary replicas of different databases may reside on the same server or on different ones. The location of a primary replica cannot be set manually; it is determined by the internal election process defined by the Raft algorithm. In a stable network, leader elections should happen relatively rarely.
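To make these placement rules concrete, here is a minimal sketch in plain Python (the server and database names are hypothetical) of a three-server cluster hosting two databases; each database has exactly one primary, and the primaries of different databases may sit on different servers:

```python
# Hypothetical layout of a three-server cluster hosting two databases.
# Each database has exactly one primary replica, and a server holds at
# most one replica of any given database.
cluster = {
    "orders": {
        "server-1": "primary",
        "server-2": "secondary",
        "server-3": "secondary",
    },
    "users": {
        "server-1": "secondary",
        "server-2": "primary",    # primaries of different databases
        "server-3": "secondary",  # may reside on different servers
    },
}

def primary_of(database: str) -> str:
    """Return the server currently holding the database's primary replica."""
    replicas = cluster[database]
    return next(server for server, role in replicas.items() if role == "primary")

assert primary_of("orders") == "server-1"
assert primary_of("users") == "server-2"
```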
By default, all write and read operations are performed on the primary replica. Reads from secondary replicas carry no guarantee of up-to-date data, but reads can be very costly (especially with the inference option), so spreading them out can be worthwhile. The "read any replica" transaction option lets a client read from the primary or a secondary replica with automatic load balancing: the client chooses a random replica to balance the load between replicas (see the example in the Load balancing section below).
TypeDB Enterprise & TypeDB Cloud use ZeroMQ for replication and gRPC for replication management.
Primary replica failover
All replicas start as secondaries. The only way for a replica to become the primary is to win an election, which requires the votes of a majority of the replicas in the cluster.
Any secondary that has not heard from the primary for a set timeout period, plus or minus a random margin, will start an election. Each replica participates in at most one election at a time.
The candidate calls for an election with a special message that lets all other replicas check whether the candidate meets the requirements, e.g., that it holds the latest replication log records and has the information from the previous election. A replica casts its vote for the candidate only if all requirements are met.
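As an illustration of the vote-granting rule described above, here is a minimal sketch in plain Python; it follows the Raft paper rather than TypeDB's internal implementation, and all names in it are assumptions made for the example:

```python
import random
from dataclasses import dataclass
from typing import Optional

# Randomized election timeout: a fixed base plus a random margin, so that
# secondaries rarely time out (and start competing elections) simultaneously.
BASE_TIMEOUT_MS = 150

def election_timeout_ms() -> int:
    return BASE_TIMEOUT_MS + random.randint(0, 150)

@dataclass
class VoteRequest:            # the candidate's "call for election" message
    term: int                 # election term the candidate is running in
    last_log_index: int       # position of the candidate's newest log record
    last_log_term: int        # term of that newest record

@dataclass
class ReplicaState:
    current_term: int         # information retained from the previous election
    voted_for: Optional[str]  # at most one vote per term
    last_log_index: int
    last_log_term: int

def grant_vote(state: ReplicaState, candidate_id: str, req: VoteRequest) -> bool:
    """Vote for the candidate only if all requirements are met."""
    if req.term < state.current_term:
        return False          # the candidate is behind a newer election
    if req.term == state.current_term and state.voted_for not in (None, candidate_id):
        return False          # this replica already voted in the current term
    # Raft's up-to-dateness check: compare the last log terms first,
    # then the log lengths.
    log_ok = (req.last_log_term, req.last_log_index) >= (
        state.last_log_term, state.last_log_index)
    if log_ok:
        state.current_term = req.term
        state.voted_for = candidate_id
    return log_ok
```

A candidate that collects votes from a majority of the replicas becomes the new primary of the database.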
Failover is an internal process and should not be visible to users and applications: TypeDB Clients deal with it transparently, apart from a possible query response error or transaction closure caused by the failover/election event.
For more information on the Raft consensus algorithm, please read the Raft algorithm documentation.
Data persistence
TypeDB uses RocksDB for data persistence.
RocksDB is a durable storage layer with an internal write-ahead log (WAL). The data is stored as files on a filesystem. The WAL is used to restore data that did not reach the main storage of a server due to a sudden crash or power loss.
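To illustrate the recovery idea (a conceptual sketch in plain Python, not RocksDB's actual implementation): every write is appended to the log and synced before it is applied, so after a crash the log can be replayed to rebuild whatever never reached the main store:

```python
import json
import os

LOG_PATH = "wal.log"  # hypothetical log location for this sketch

def write(store: dict, key: str, value: str) -> None:
    """Append the record to the WAL first, then apply it to the store."""
    with open(LOG_PATH, "a") as log:
        log.write(json.dumps({"key": key, "value": value}) + "\n")
        log.flush()
        os.fsync(log.fileno())  # make the record durable before applying it
    store[key] = value

def recover() -> dict:
    """Rebuild the store after a sudden crash by replaying the WAL."""
    store: dict = {}
    if os.path.exists(LOG_PATH):
        with open(LOG_PATH) as log:
            for line in log:
                record = json.loads(line)
                store[record["key"]] = record["value"]
    return store
```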
There are two main kinds of data being persisted:
- Database schema and data instances.
- The Raft replication log.
Both are persisted in directories set in the configuration file; these can be separate directories or the same one. The paths to the storage directories can also be overridden by command-line arguments.
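As a sketch of that precedence (configuration file values overridden by command-line arguments), here is a minimal illustration in Python; the storage.data and storage.replication key names are assumptions made for the example, not TypeDB's documented configuration schema:

```python
import argparse

# Values as they might come from the configuration file; the two
# directories may be separate, as here, or the same one.
config = {
    "storage.data": "/var/lib/typedb/data",
    "storage.replication": "/var/lib/typedb/replication",
}

parser = argparse.ArgumentParser()
parser.add_argument("--storage.data", dest="storage_data")
parser.add_argument("--storage.replication", dest="storage_replication")
args = parser.parse_args()

# Command-line arguments take precedence over the configuration file.
if args.storage_data:
    config["storage.data"] = args.storage_data
if args.storage_replication:
    config["storage.replication"] = args.storage_replication
```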
Load balancing
Load balancing can be done by any TypeDB Client for read transactions with the read_any_replica option set to True.
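As a sketch of how this option might be set from an application, assuming the 2.x Python driver's cluster API (TypeDB.cluster_client, TypeDBCredential, and TypeDBOptions.cluster(); exact names vary between driver versions, so check your driver's reference):

```python
from typedb.client import (TypeDB, TypeDBCredential, TypeDBOptions,
                           SessionType, TransactionType)

# Assumed replica addresses and credentials for this sketch.
addresses = ["server-1:1729", "server-2:1729", "server-3:1729"]
credential = TypeDBCredential("admin", "password", tls_enabled=True)

with TypeDB.cluster_client(addresses, credential) as client:
    with client.session("mydb", SessionType.DATA) as session:
        options = TypeDBOptions.cluster()
        options.read_any_replica = True  # allow reads from any replica
        with session.transaction(TransactionType.READ, options) as tx:
            pass  # read queries here run against a randomly chosen replica
```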
When instantiated, a client is usually set up with a list of all replica addresses. Additionally, it can use auto-discovery to find missing replicas: the client obtains the list of known servers in the cluster from any server it was able to connect to.
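A minimal sketch of that bootstrap-and-discover pattern in plain Python (fetch_known_servers stands in for whatever cluster-membership call a given driver uses; it is an assumption of this sketch):

```python
import random

def fetch_known_servers(address: str) -> list[str]:
    """Placeholder for a driver's cluster-membership call (hypothetical)."""
    raise NotImplementedError

def discover_replicas(seed_addresses: list[str]) -> list[str]:
    """Try each configured address; merge in any servers it knows about."""
    known = set(seed_addresses)
    for address in seed_addresses:
        try:
            known.update(fetch_known_servers(address))
            break  # one reachable server is enough to learn the full list
        except Exception:
            continue  # this seed is unreachable; try the next one
    return sorted(known)

def pick_read_replica(replicas: list[str]) -> str:
    """Client-side load balancing: pick a random replica for a read."""
    return random.choice(replicas)
```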