The Symbolic Reasoning Engine of TypeDB


The conceptual model, type-theoretic language, and strong type system of TypeDB come together to enable symbolic reasoning. In TypeDB, reasoning takes the form of rule inference, which allows the database to generate new facts based on existing data and user-defined rules. Resolution of rules is managed by the rule-inference engine. The data created by rules is generated at query time and held in memory. This means that it always reflects the most recent state of the schema and data when the query is run, so we don’t have to worry about stale or inconsistent data, and disk space is saved. Once we close the transaction, the rule-inference cache is cleared and the resources are released.
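
The lifecycle described above can be illustrated with a minimal Python sketch. It is not TypeDB’s implementation or API: the Store class, the transaction helper, and the example path and timestamps are all hypothetical, chosen only to show inferred values being computed at query time, cached for the duration of a transaction, and discarded when it closes.

```python
from contextlib import contextmanager
from datetime import datetime


class Store:
    """Hypothetical persisted data: resource path -> stored modification timestamps."""

    def __init__(self):
        self.modified = {}


@contextmanager
def transaction(store):
    """Yield a query function whose inferred results live only in memory."""
    cache = {}  # inferred facts are kept here and never written to the store

    def last_modified(path):
        # Inference happens on demand, against the data as it is right now.
        if path not in cache:
            cache[path] = max(store.modified[path])
        return cache[path]

    try:
        yield last_modified
    finally:
        cache.clear()  # closing the transaction releases the inferred data


store = Store()
store.modified["/example/notes.md"] = [datetime(2023, 12, 1, 10, 0, 0)]

with transaction(store) as query:
    first = query("/example/notes.md")

# New data is written between transactions; nothing inferred needs updating.
store.modified["/example/notes.md"].append(datetime(2023, 12, 9, 14, 36, 26))

with transaction(store) as query:
    second = query("/example/notes.md")

print(first.isoformat(), second.isoformat())
```

Because nothing inferred is ever persisted, the second transaction sees the newer timestamp without any extra bookkeeping.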

Rule inference has a number of powerful use cases, ranging from creating convenient abstractions for patterns used across multiple queries, to capturing complex business logic from the data domain. In this article, we’ll examine some of the ways we can use rule inference to simplify and enhance database operations. We’ll begin with an example of a single rule, then see how we can combine multiple rules in powerful ways.

This article uses an example data model and queries for a simple filesystem based on a discretionary access control (DAC) permission system. In the DAC framework, all objects (for instance files, directories, and user groups) have owners, and permissions on an object are granted to other users by its owner. The schema, dataset, and queries can all be found on GitHub.

Simple rules

In the following query, we retrieve the most recent modification timestamp for each resource without using rule inference.

match
$resource isa resource, has modified-timestamp $last-modified;
not {
    $resource has modified-timestamp $other-modified;
    $other-modified > $last-modified;
};
fetch
$resource: id;
$last-modified;

{
    "resource": {
        "id": [ { "value": "/vaticle/engineering/projects/typedb-3.0/traversal-engine.rs", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
        "type": { "label": "file", "root": "entity" }
    },
    "last-modified": { "value": "2023-12-09T14:36:26.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } }
}
{
    "resource": {
        "id": [ { "value": "/vaticle/engineering/projects/typedb-cloud-beta/user-guide.md", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
        "type": { "label": "file", "root": "entity" }
    },
    "last-modified": { "value": "2023-12-06T07:15:09.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } }
}
{
    "resource": {
        "id": [ { "value": "/vaticle/engineering/projects/typedb-3.0/technical-specification.md", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
        "type": { "label": "file", "root": "entity" }
    },
    "last-modified": { "value": "2023-11-24T14:06:36.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } }
}
{
    "resource": {
        "id": [ { "value": "/vaticle/engineering/projects/typedb-3.0/release-notes.md", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
        "type": { "label": "file", "root": "entity" }
    },
    "last-modified": { "value": "2023-12-17T22:31:07.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } }
}
{
    "resource": {
        "id": [ { "value": "/vaticle/engineering/projects/typedb-cloud-beta/user-manager.rs", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
        "type": { "label": "file", "root": "entity" }
    },
    "last-modified": { "value": "2023-12-12T03:45:35.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } }
}

It’s not a complex query, but checking for later modification timestamps is a bit cumbersome. We’ll likely need to query last modification timestamps as part of many larger queries, so this query pattern would have to be repeated throughout our codebase. Rather than searching through the modification timestamps each time, it would be better to store the most recent timestamp in a way that lets us access it more conveniently. We could separately insert a special last-modified timestamp on every resource, but then we’d need to manually update it whenever we add a new modified-timestamp, which could lead to inconsistent data if we’re not careful. This situation is a perfect use case for rule inference, which can ensure our data stays consistent. To begin with, we define the new attribute with a Define query.

define

last-modified sub attribute, value datetime;
resource owns last-modified;

Next, we write a rule using the pattern from our original query and add it to our database, also with a Define query. Like types, rules are a core part of the schema.

define

rule resource-last-modified:
    when {
        # The original query pattern
        $resource isa resource, has modified-timestamp $last-modified;
        not {
            $resource has modified-timestamp $other-modified;
            $other-modified > $last-modified;
        };
        # Copy the value of $last-modified
        ?timestamp = $last-modified;
    } then {
        # Generate a new attribute with the same value
        $resource has last-modified ?timestamp;
    };

Rules use the same pattern syntax as queries, so they can make use of polymorphic patterns. In this rule, we use inheritance polymorphism to assign last-modified timestamps to all resources via their supertype resource. Rules also undergo semantic validation to ensure the integrity of generated data. With the rule defined, we can replace our original query for the most recent modified-timestamp with a simple query for last-modified. The results are identical to those before, except that the type field of $last-modified has changed from modified-timestamp to last-modified.

match
$resource isa resource, has last-modified $last-modified;
fetch
$resource: id;
$last-modified;

{
    "resource": {
        "id": [ { "value": "/vaticle/engineering/projects/typedb-3.0/traversal-engine.rs", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
        "type": { "label": "file", "root": "entity" }
    },
    "last-modified": { "value": "2023-12-09T14:36:26.000", "value_type": "datetime", "type": { "label": "last-modified", "root": "attribute" } }
}
{
    "resource": {
        "id": [ { "value": "/vaticle/engineering/projects/typedb-cloud-beta/user-guide.md", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
        "type": { "label": "file", "root": "entity" }
    },
    "last-modified": { "value": "2023-12-06T07:15:09.000", "value_type": "datetime", "type": { "label": "last-modified", "root": "attribute" } }
}
{
    "resource": {
        "id": [ { "value": "/vaticle/engineering/projects/typedb-3.0/technical-specification.md", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
        "type": { "label": "file", "root": "entity" }
    },
    "last-modified": { "value": "2023-11-24T14:06:36.000", "value_type": "datetime", "type": { "label": "last-modified", "root": "attribute" } }
}
{
    "resource": {
        "id": [ { "value": "/vaticle/engineering/projects/typedb-3.0/release-notes.md", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
        "type": { "label": "file", "root": "entity" }
    },
    "last-modified": { "value": "2023-12-17T22:31:07.000", "value_type": "datetime", "type": { "label": "last-modified", "root": "attribute" } }
}
{
    "resource": {
        "id": [ { "value": "/vaticle/engineering/projects/typedb-cloud-beta/user-manager.rs", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
        "type": { "label": "file", "root": "entity" }
    },
    "last-modified": { "value": "2023-12-12T03:45:35.000", "value_type": "datetime", "type": { "label": "last-modified", "root": "attribute" } }
}

The new attribute type can be used in any other query too, so we don’t have to repeat the full pattern from the original query anywhere outside of the rule. This also means that if we want to change how the last modification timestamp is determined, we only have to change it in the rule, which serves as a single source of truth.
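
To make the logic of the resource-last-modified rule concrete, here is a rough Python analogue. It is only a sketch of what the rule concludes, not of how TypeDB evaluates it, and the function name, resource paths, and timestamps are made up for illustration.

```python
from datetime import datetime


def infer_last_modified(modified):
    """Return inferred (resource, last-modified) pairs.

    modified maps each resource to the set of its stored modification timestamps.
    """
    inferred = set()
    for resource, stamps in modified.items():
        for candidate in stamps:
            # Mirrors the negation in the rule: keep the candidate only if
            # no other timestamp on the same resource is later.
            if not any(other > candidate for other in stamps):
                inferred.add((resource, candidate))
    return inferred


# Made-up data for illustration only.
modified = {
    "/example/a.rs": {datetime(2023, 11, 2, 9, 0, 0), datetime(2023, 12, 9, 14, 36, 26)},
    "/example/b.md": {datetime(2023, 12, 6, 7, 15, 9)},
}

last_modified = dict(infer_last_modified(modified))
print(last_modified)
```

As with the rule, the derivation lives in one place: callers only ever ask for the last-modified value and never repeat the comparison themselves.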

Rule branching

When used together, multiple rules can emulate very powerful logic, consistent with the way we naturally think about data. In the following rule, we check for resources that don’t have any modification timestamps, and generate last-modified attributes equal to the creation timestamps instead. This generates the same conclusion as the previous rule, so the two rules represent parallel branches of logic. As their conditions are mutually exclusive, only one last-modified attribute will be generated per resource.

define

rule implicit-resource-last-modified:
    when {
        $resource isa resource, has created-timestamp $created;
        not { $resource has modified-timestamp $modified; };
        ?timestamp = $created;
    } then {
        $resource has last-modified ?timestamp;
    };

When we query for last modification timestamps, we can continue to use the same query as before. We do not have to specify which rules to use, as the rule-inference engine will find all applicable rules in the schema and apply them in parallel.

match
$resource isa resource, has last-modified $last-modified;
fetch
$resource: id;
$last-modified;
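
The branching behaviour can be sketched in Python as two functions that reach the same conclusion from mutually exclusive conditions. This is an illustrative model only: the dictionary layout, the RULES list, and the example timestamps are assumptions, and the real engine resolves rules very differently.

```python
from datetime import datetime


def resource_last_modified(resource):
    """Branch 1: the latest of the resource's modification timestamps."""
    stamps = resource["modified"]
    return {t for t in stamps if not any(other > t for other in stamps)}


def implicit_resource_last_modified(resource):
    """Branch 2: the creation timestamp, if there are no modification timestamps."""
    if resource["modified"]:
        return set()
    return {resource["created"]}


# The caller never chooses a rule: every rule with this conclusion is tried.
RULES = [resource_last_modified, implicit_resource_last_modified]


def last_modified(resource):
    facts = set()
    for rule in RULES:
        facts |= rule(resource)
    return facts


# Made-up resources for illustration only.
edited = {
    "created": datetime(2023, 10, 1, 8, 0, 0),
    "modified": {datetime(2023, 11, 2, 9, 0, 0), datetime(2023, 12, 9, 14, 36, 26)},
}
untouched = {"created": datetime(2023, 10, 5, 12, 30, 0), "modified": set()}

edited_result = last_modified(edited)
untouched_result = last_modified(untouched)
print(edited_result, untouched_result)
```

Each resource satisfies exactly one branch, so each result set contains a single timestamp.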

Rule chaining

In addition to using branches of logic, rules can also be combined to form chains of logic. In the next rule, we generate modification timestamps for repositories equal to the creation timestamps of the commits on those repositories.

define

rule repository-modified-timestamps:
    when {
        $repository isa repository;
        (repository: $repository) isa commit, has created-timestamp $commit-created;
        ?timestamp = $commit-created;
    } then {
        $repository has modified-timestamp ?timestamp;
    };

The rule-inference engine uses a backward-chaining method to resolve rules. This rule generates modification timestamps, and both of the previous rules, resource-last-modified and implicit-resource-last-modified, match on modification timestamps in their conditions. As a result, the modified-timestamp instances that it generates can be used to generate last-modified instances! Again, it is not necessary to specify which rules to use. The rule-inference engine will trigger any applicable rules in the correct order. If we query for the last modified timestamps of repositories, we now get the timestamps of the most recent commits on them.

match
$repository isa repository, has last-modified $last-modified;
fetch
$repository: id;
$last-modified;

{
    "repository": {
        "id": [ { "value": "typedb-cloud", "value_type": "string", "type": { "label": "name", "root": "attribute" } } ],
        "type": { "label": "repository", "root": "entity" }
    },
    "last-modified": { "value": "2023-12-13T01:05:03.000", "value_type": "datetime", "type": { "label": "last-modified", "root": "attribute" } }
}
{
    "repository": {
        "id": [ { "value": "typedb", "value_type": "string", "type": { "label": "name", "root": "attribute" } } ],
        "type": { "label": "repository", "root": "entity" }
    },
    "last-modified": { "value": "2023-08-19T02:08:58.000", "value_type": "datetime", "type": { "label": "last-modified", "root": "attribute" } }
}
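
The chain can be pictured as a goal that is resolved by first resolving a subgoal, which is the general idea behind backward chaining. The Python sketch below is a simplified illustration under stated assumptions, not TypeDB’s algorithm: the function names are hypothetical, the earlier commit timestamps are invented, and the implicit-resource-last-modified branch is omitted for brevity. Only the latest commit timestamps are taken from the results above.

```python
from datetime import datetime

# (repository, commit creation timestamp); the earlier commits are made up.
commits = [
    ("typedb-cloud", datetime(2023, 11, 30, 16, 20, 0)),
    ("typedb-cloud", datetime(2023, 12, 13, 1, 5, 3)),
    ("typedb", datetime(2023, 7, 4, 11, 45, 0)),
    ("typedb", datetime(2023, 8, 19, 2, 8, 58)),
]
stored_modified = {}  # assumed: no modification timestamps are stored on repositories


def modified_timestamps(resource):
    """Subgoal: stored timestamps plus those concluded from commits."""
    stamps = set(stored_modified.get(resource, ()))
    # Counterpart of the repository-modified-timestamps rule.
    stamps |= {created for repo, created in commits if repo == resource}
    return stamps


def last_modified(resource):
    """Goal: resolved by first resolving the modified-timestamp subgoal."""
    stamps = modified_timestamps(resource)
    return max(stamps) if stamps else None


results = {name: last_modified(name) for name in ("typedb-cloud", "typedb")}
print(results)
```

The query only names the goal; the dependency on commit timestamps is discovered by following the rules backward from it.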

Rule recursion

Finally, we’ll examine a special case of rule chaining that allows for recursive logic. For this example, we’ll move away from modification timestamps and focus on another part of the data model. Let’s consider the file type-theory.tex, and query the directories it is in.

match
$paper isa file, has path "/vaticle/research/papers/type-theory.tex";
$directory isa directory, has path $path;
(directory: $directory, directory-member: $paper) isa directory-membership;
fetch
$path;

{ "path": { "value": "/vaticle/research/papers", "value_type": "string", "type": { "label": "path", "root": "attribute" } } }

We only get one result, though logically the file is also in the directories /vaticle/research and /vaticle. We can see this by querying further up the directory structure.

match
$papers isa directory, has path "/vaticle/research/papers";
$directory isa directory, has path $path;
(directory: $directory, directory-member: $papers) isa directory-membership;
fetch
$path;

{ "path": { "value": "/vaticle/research", "value_type": "string", "type": { "label": "path", "root": "attribute" } } }

match
$research isa directory, has path "/vaticle/research";
$directory isa directory, has path $path;
(directory: $directory, directory-member: $research) isa directory-membership;
fetch
$path;

{ "path": { "value": "/vaticle", "value_type": "string", "type": { "label": "path", "root": "attribute" } } }

In application code, this would be an excellent use case for recursion, which would allow us to write a function that easily retrieves all of a file’s parent directories at once. We can do the same with rule chaining, by defining a rule that can trigger itself.

define

indirect-directory-membership sub directory-membership;

rule transitive-directory-memberships:
    when {
        (directory: $directory-1, directory-member: $directory-2) isa directory-membership;
        (directory: $directory-2, directory-member: $object) isa directory-membership;
    } then {
        (directory: $directory-1, directory-member: $object) isa indirect-directory-membership;
    };

When this rule is triggered, the rule-inference engine will search for instances of directory-membership between $directory-1 and $directory-2, and between $directory-2 and $object. As this rule can generate instances of indirect-directory-membership (which is a subtype of directory-membership) between those roleplayers, the rule will be triggered again! If we run the query again after adding this rule to the schema, we can see that the expected directory tree is correctly returned.

match
$paper isa file, has path "/vaticle/research/papers/type-theory.tex";
$directory isa directory, has path $path;
(directory: $directory, directory-member: $paper) isa directory-membership;
fetch
$path;

{ "path": { "value": "/vaticle/research/papers", "value_type": "string", "type": { "label": "path", "root": "attribute" } } }
{ "path": { "value": "/vaticle/research", "value_type": "string", "type": { "label": "path", "root": "attribute" } } }
{ "path": { "value": "/vaticle", "value_type": "string", "type": { "label": "path", "root": "attribute" } } }

While the rule doesn’t define an explicit base case, it terminates because the rule-inference engine will only generate each given data instance once. This means that rule inference is always guaranteed to terminate except in a few niche cases, such as using a recursive rule involving arithmetic to generate numeric attributes in an unbounded sequence. These cases do not arise in practical database design, but are a consequence of TypeQL’s Turing completeness.
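
The termination argument can be seen in a small Python sketch of the transitive-directory-memberships rule. Note that this sketch swaps in naive forward fixpoint iteration, which is not the backward-chaining method the engine uses; it is chosen only because it makes the "each fact is generated once" property easy to see. The function name is hypothetical, and the directory structure is the one from the example above.

```python
def transitive_memberships(direct):
    """Return the direct (directory, member) pairs plus all inferred ones."""
    facts = set(direct)
    while True:
        # Apply the rule to every pair of known memberships that share
        # a middle directory, keeping only conclusions not already known.
        new = {
            (outer, member)
            for (outer, middle) in facts
            for (inner, member) in facts
            if middle == inner
        } - facts
        if not new:
            # Nothing new can be generated, so the recursion has bottomed out.
            return facts
        facts |= new


paper = "/vaticle/research/papers/type-theory.tex"
direct = {
    ("/vaticle", "/vaticle/research"),
    ("/vaticle/research", "/vaticle/research/papers"),
    ("/vaticle/research/papers", paper),
}

memberships = transitive_memberships(direct)
parents = {directory for directory, member in memberships if member == paper}
print(sorted(parents))
```

Since the set of possible (directory, member) pairs over a finite dataset is itself finite and no pair is ever added twice, the loop must stop. A rule that keeps producing fresh values, such as ever-larger numbers, has no such bound.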

Summary

These rules demonstrate the basic ways in which rules can be used, but they only scratch the surface of what is possible with rule inference. Some additional rules are included in the sample code for the interested reader.
