TypeDB Fundamentals
The Type-Theoretic Language of TypeDB
TypeQL is the type-theoretic query language of TypeDB. One of the central design principles of the language is:
This allows for highly expressive querying power beyond the capabilities of other modern database paradigms. In particular, it allows for declarative polymorphic querying, which is not possible in non-polymorphic databases.
In another article, we explored TypeDB’s conceptual data model, and saw how it allows us to define type hierarchies and interfaces between them. In this article, we’ll see how we can leverage the model to write declarative polymorphic queries that do not need to be modified when the schema is extended. There are three types of polymorphism we’ll use in our queries:
- Inheritance polymorphism: Allows us to declaratively query data instances of all types in a hierarchy.
- Interface polymorphism: Allows us to declaratively query data instances of all types that implement an interface.
- Parametric polymorphism: Allows us to write declarative queries that are completely independent of the schema.
We’ll examine them individually, but these types of polymorphism can also be combined in queries to produce highly complex descriptions of data. The declarative nature of the queries shown will then be illustrated with an example of a schema extension, after which the queries do not need to be modified to return instances of the newly defined types.
This article uses an example data model for a simple filesystem based on a discretionary access control (DAC) permission system. In the DAC framework, all objects (for instance files, directories, and user groups) have owners, and permissions on an object are granted to other users by its owner. The model is described using TypeQL, which is designed to read close to natural language, so it should be understandable even without knowledge of the relevant keywords. For the queries shown, long result sets are trimmed for conciseness. The full dataset and queries are available on GitHub for viewing the complete result set of any query.
Querying with inheritance polymorphism
In this query, we ask for a list of users, retrieving the email (a string) and active status (a boolean) for each one.
match
$user isa user;
fetch
$user: email, active;
{
"user": {
"active": [ { "value": true, "value_type": "boolean", "type": { "label": "active", "root": "attribute" } } ],
"email": [ { "value": "cedric@vaticle.com", "value_type": "string", "type": { "label": "email", "root": "attribute" } } ],
"type": { "label": "admin", "root": "entity" }
}
}
{
"user": {
"active": [ { "value": true, "value_type": "boolean", "type": { "label": "active", "root": "attribute" } } ],
"email": [ { "value": "reginald@vaticle.com", "value_type": "string", "type": { "label": "email", "root": "attribute" } } ],
"type": { "label": "user", "root": "entity" }
}
}
{
"user": {
"active": [ { "value": true, "value_type": "boolean", "type": { "label": "active", "root": "attribute" } } ],
"email": [ { "value": "tommy@vaticle.com", "value_type": "string", "type": { "label": "email", "root": "attribute" } } ],
"type": { "label": "user", "root": "entity" }
}
}
{
"user": {
"active": [ { "value": false, "value_type": "boolean", "type": { "label": "active", "root": "attribute" } } ],
"email": [ { "value": "jimmy@vaticle.com", "value_type": "string", "type": { "label": "email", "root": "attribute" } } ],
"type": { "label": "user", "root": "entity" }
}
}
{
"user": {
"active": [ { "value": true, "value_type": "boolean", "type": { "label": "active", "root": "attribute" } } ],
"email": [ { "value": "kima@vaticle.com", "value_type": "string", "type": { "label": "email", "root": "attribute" } } ],
"type": { "label": "user", "root": "entity" }
}
}
{
"user": {
"active": [ { "value": true, "value_type": "boolean", "type": { "label": "active", "root": "attribute" } } ],
"email": [ { "value": "rhonda@vaticle.com", "value_type": "string", "type": { "label": "email", "root": "attribute" } } ],
"type": { "label": "user", "root": "entity" }
}
}
Each result object represents a user, containing lists of any email and active attributes they have in addition to a field for the user’s type. Examining the type field for the results indicates that one of the users, Cedric, is an admin rather than a user. Despite only asking for users, TypeDB has returned an admin because admin is a subtype of user. This query leverages inheritance polymorphism: the ability to query a type and retrieve instances of that type and all of its subtypes. In this next query, we use inheritance polymorphism to get a list of all resources.
match
$resource isa resource;
fetch
$resource: id, event-timestamp;
{
"resource": {
"event-timestamp": [
{ "value": "2023-08-08T11:57:54.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-08-13T13:16:10.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-10-10T13:22:37.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-10-10T13:39:19.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-11-29T09:17:47.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-06-14T22:44:22.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } }
],
"id": [ { "value": "/vaticle/research/prototypes/nlp-query-generator.py", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "file", "root": "entity" }
}
}
{
"resource": {
"event-timestamp": [
{ "value": "2023-06-22T11:36:20.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-06-29T12:16:33.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-07-26T12:33:37.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-01-31T19:39:32.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } }
],
"id": [ { "value": "/vaticle/research/prototypes/root-cause-analyzer.py", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "file", "root": "entity" }
}
}
{
"resource": {
"event-timestamp": [
{ "value": "2023-02-15T09:13:55.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-03-25T12:11:15.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-06-02T06:17:54.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-12-04T23:07:09.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-01-28T09:16:24.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } }
],
"id": [ { "value": "/vaticle/engineering/tools/performance-profiler.rs", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "file", "root": "entity" }
}
}
{
"resource": {
"event-timestamp": [
{ "value": "2023-11-23T07:05:18.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-10-01T01:46:23.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } }
],
"id": [ { "value": "/vaticle/engineering/projects/typedb-3.0", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "directory", "root": "entity" }
}
}
{
"resource": {
"event-timestamp": [
{ "value": "2023-10-20T13:24:19.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-10-22T12:22:51.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-12-01T03:58:22.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-12-28T02:00:04.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-03-13T14:25:07.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } }
],
"id": [ { "value": "/vaticle/engineering/projects/typedb-cloud-beta", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "directory", "root": "entity" }
}
}
In this query, we retrieve instances of the subtypes of resource: files and directories. In fact, we will not get any results that are specifically of type resource, as the type is abstract. Inheritance polymorphism also affects the attributes returned. The IDs retrieved are paths because path is a subtype of id, and the event timestamps retrieved are creation and modification timestamps because created-timestamp and modified-timestamp are subtypes of event-timestamp. Here, the polymorphic constraints on the entity and its attributes have been combined to produce the result set, without having to use any special syntax to do so!
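To contrast with the polymorphic behaviour described in this section, the following is a minimal sketch of the first query with subtypes excluded. It relies on the isa! keyword, which this article uses in the section on parametric polymorphism, applied here to a type label rather than a type variable. The query is an illustration and is not part of the original set of queries.

```typeql
# Sketch: match only instances whose exact type is user.
# isa! does not consider inheritance, so instances of subtypes
# such as admin are not matched.
match
$user isa! user;
fetch
$user: email, active;
```

Against the sample data shown above, this would be expected to omit Cedric, whose exact type is admin. The trade-off is that the query would also miss instances of any user subtype added to the schema later, which is why the plain isa form is the better fit for the question "list all users".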
Querying with interface polymorphism
In TypeQL, an isa statement is used to constrain the type of a variable. In this query, we ask for data instances that have creation timestamp attributes, without specifying the type of the instances we’re looking for. The data instances to retrieve are denoted by the generic $x variable, which has no isa constraint applied. As a result, the only constraint is that it must have a creation timestamp, specified by the has statement.
match
$x has created-timestamp $created;
fetch
$x: id, created-timestamp;
{
"x": {
"created-timestamp": [ { "value": "2023-02-03T12:35:44.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } } ],
"id": [ { "value": "/vaticle/engineering/projects", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "directory", "root": "entity" }
}
}
{
"x": {
"created-timestamp": [ { "value": "2023-01-31T19:39:32.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } } ],
"id": [ { "value": "/vaticle/research/prototypes/root-cause-analyzer.py", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "file", "root": "entity" }
}
}
{
"x": {
"created-timestamp": [ { "value": "2023-03-11T02:30:39.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } } ],
"id": [ { "value": "reginald@vaticle.com", "value_type": "string", "type": { "label": "email", "root": "attribute" } } ],
"type": { "label": "user", "root": "entity" }
}
}
{
"x": {
"created-timestamp": [ { "value": "2023-09-07T17:47:57.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } } ],
"id": [ { "value": "engineers", "value_type": "string", "type": { "label": "name", "root": "attribute" } } ],
"type": { "label": "user-group", "root": "entity" }
}
}
{
"x": {
"created-timestamp": [ { "value": "2023-01-01T00:00:00.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } } ],
"id": [ { "value": "cedric@vaticle.com", "value_type": "string", "type": { "label": "email", "root": "attribute" } } ],
"type": { "label": "admin", "root": "entity" }
}
}
In this selection of results, we can see that five different types have been returned for $x: directory, file, user, user-group, and admin. This is because instances of all of those types have a creation timestamp. This query leverages interface polymorphism: the ability to query an interface and retrieve instances of the types that implement it. We can also see inheritance polymorphism at play in this query, as we have queried the id type and retrieved attributes of several of its subtypes. In the next query, we leverage interface polymorphism on a relation’s roles instead of an attribute ownership.
match
(resource: $resource, resource-owner: $owner) isa resource-ownership;
fetch
$resource: id;
$owner: id;
{
"owner": {
"id": [ { "value": "rhonda@vaticle.com", "value_type": "string", "type": { "label": "email", "root": "attribute" } } ],
"type": { "label": "user", "root": "entity" }
},
"resource": {
"id": [ { "value": "/vaticle/research/prototypes/root-cause-analyzer.py", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "file", "root": "entity" }
}
}
{
"owner": {
"id": [ { "value": "kima@vaticle.com", "value_type": "string", "type": { "label": "email", "root": "attribute" } } ],
"type": { "label": "user", "root": "entity" }
},
"resource": {
"id": [ { "value": "/vaticle/engineering/projects/typedb-cloud-beta", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "directory", "root": "entity" }
}
}
{
"owner": {
"id": [ { "value": "engineers", "value_type": "string", "type": { "label": "name", "root": "attribute" } } ],
"type": { "label": "user-group", "root": "entity" }
},
"resource": {
"id": [ { "value": "/vaticle/engineering", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "directory", "root": "entity" }
}
}
{
"owner": {
"id": [ { "value": "cedric@vaticle.com", "value_type": "string", "type": { "label": "email", "root": "attribute" } } ],
"type": { "label": "admin", "root": "entity" }
},
"resource": {
"id": [ { "value": "/vaticle", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "directory", "root": "entity" }
}
}
{
"owner": {
"id": [ { "value": "cedric@vaticle.com", "value_type": "string", "type": { "label": "email", "root": "attribute" } } ],
"type": { "label": "admin", "root": "entity" }
},
"resource": {
"id": [ { "value": "/vaticle/engineering/projects/typedb-3.0/traversal-engine.rs", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "file", "root": "entity" }
}
}
This query involves two interfaces: the resource and resource-owner roles of the resource-ownership relation type. As there are two instantiable (non-abstract) types that implement resource, and three that implement resource-owner, there are six possible pairs of types that can be returned as answers:
resource | resource-owner |
---|---|
file | user |
file | admin |
file | user-group |
directory | user |
directory | admin |
directory | user-group |
In this selection of results, we can see five different pairs. Because of TypeDB’s declarative polymorphic querying, we can retrieve results without enumerating the possible return types in the query.
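The types that implement an interface can also be listed directly from the schema. The following is a sketch, assuming that a plays statement can be matched against a type variable and that the role label is scoped by its relation type, as in the define query later in this article. The variable name $type is chosen here for illustration.

```typeql
# Sketch: list the types that implement the resource-owner role interface.
match
$type plays resource-ownership:resource-owner;
fetch
$type;
```

With the example schema, this would be expected to include user, admin, and user-group, the three types in the right-hand column of the table above, possibly along with any abstract supertypes that play the role. The polymorphic query itself never needs this list, which is what keeps it valid when further implementing types are added.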
Querying with parametric polymorphism
The final type of polymorphism is parametric polymorphism. In this query, we have variablized the types themselves, using the type variables $owner-type and $attribute-type. By making these types into parameters, we can perform a parametric query over them, and return their values in the response. This particular query searches for multivalued attributes of any type, that is, cases where one owner has two or more distinct attributes of the same exact type, and returns the exact type of those attributes and their owners. It uses isa! statements, which are used to specify the exact types of variables without considering inheritance.
match
$owner has $attribute-1;
$owner has $attribute-2;
$owner isa! $owner-type;
$attribute-1 isa! $attribute-type;
$attribute-2 isa! $attribute-type;
not { $attribute-1 is $attribute-2; };
fetch
$owner-type;
$attribute-type;
{
"attribute-type": { "label": "modified-timestamp", "root": "attribute", "value_type": "datetime" },
"owner-type": { "label": "directory", "root": "entity" }
}
{
"attribute-type": { "label": "modified-timestamp", "root": "attribute", "value_type": "datetime" },
"owner-type": { "label": "file", "root": "entity" }
}
We can see from the results that the only multivalued attributes are of type modified-timestamp, and that they are owned by instances of file and directory. The query does not involve a single explicitly specified type, which means it can be executed against any database and will always return the types of multivalued attributes and their owners. This property is common to all parametric queries. Parametric queries are not possible in most database paradigms, as their query languages do not allow for variablization of types: tables and columns in relational databases, collections and fields in document databases, and labels and properties in graph databases.
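Parametric polymorphism can be combined with inheritance polymorphism by constraining a type variable with a sub statement. The sketch below is a variant of the query above, not one taken from the original set of queries, restricted to owners whose type falls within the resource hierarchy.

```typeql
# Sketch: multivalued attributes, restricted to owners in the resource hierarchy.
match
# The type variable ranges over resource and its subtypes.
$owner-type sub resource;
$owner isa! $owner-type;
$owner has $attribute-1;
$owner has $attribute-2;
$attribute-1 isa! $attribute-type;
$attribute-2 isa! $attribute-type;
# Require two distinct attribute instances of the same exact type.
not { $attribute-1 is $attribute-2; };
fetch
$owner-type;
$attribute-type;
```

Since file and directory are both subtypes of resource, this would be expected to return the same two results as the unrestricted query on the sample data. Unlike the original, it names one type explicitly, so it is no longer fully independent of the schema.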
Extending the schema
Now that we’ve seen how TypeQL enables construction of declarative polymorphic queries, we’ll see how this allows us to easily extend data models without refactoring queries. We’ll begin by adding a new repository resource type to the schema, and a commit relation relating a repository and an author. Repositories are identified by their name attribute, and commits by a new hash attribute.
define
repository sub resource,
owns name as id,
plays commit:repository;
commit sub relation,
relates repository,
relates author,
owns hash,
owns created-timestamp;
user plays commit:author;
hash sub id;
If we insert some data and then re-run two of the previous queries, we can see that the results of those queries have updated to reflect the newly defined types, and now include results relating to repositories and commits.
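The inserted data is not shown in this article. As a rough sketch only, an insert for one repository and one commit might look as follows. The attribute values are taken from the trimmed results shown in this section, but the pairing of this commit with this repository and this author is an assumption made for illustration.

```typeql
# Sketch: insert one repository and one commit relating it to an existing user.
# Which user authored the commit, and in which repository, is hypothetical.
match
$author isa user, has email "kima@vaticle.com";
insert
$repository isa repository,
    has name "typedb",
    has created-timestamp 2023-05-07T11:30:32;
(repository: $repository, author: $author) isa commit,
    has hash "52b1142dabd766ec2e467bc74913de41",
    has created-timestamp 2023-06-30T04:37:33;
```

Re-running the two earlier queries unchanged then gives the following trimmed results.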
match
$resource isa resource;
fetch
$resource: id, event-timestamp;
{
"resource": {
"event-timestamp": [ { "value": "2023-05-20T19:12:11.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } } ],
"id": [ { "value": "/vaticle/engineering/projects/typedb-3.0/query-parser.rs", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "file", "root": "entity" }
}
}
{
"resource": {
"event-timestamp": [
{ "value": "2023-11-23T07:05:18.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-10-01T01:46:23.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } }
],
"id": [ { "value": "/vaticle/engineering/projects/typedb-3.0", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "directory", "root": "entity" }
}
}
{
"resource": {
"event-timestamp": [
{ "value": "2023-10-20T13:24:19.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-10-22T12:22:51.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-12-01T03:58:22.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-12-28T02:00:04.000", "value_type": "datetime", "type": { "label": "modified-timestamp", "root": "attribute" } },
{ "value": "2023-03-13T14:25:07.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } }
],
"id": [ { "value": "/vaticle/engineering/projects/typedb-cloud-beta", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "directory", "root": "entity" }
}
}
{
"resource": {
"event-timestamp": [ { "value": "2023-05-07T11:30:32.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } } ],
"id": [ { "value": "typedb", "value_type": "string", "type": { "label": "name", "root": "attribute" } } ],
"type": { "label": "repository", "root": "entity" }
}
}
{
"resource": {
"event-timestamp": [ { "value": "2023-06-05T14:13:23.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } } ],
"id": [ { "value": "typedb-cloud", "value_type": "string", "type": { "label": "name", "root": "attribute" } } ],
"type": { "label": "repository", "root": "entity" }
}
}
match
$x has created-timestamp $created;
fetch
$x: id, created-timestamp;
{
"x": {
"created-timestamp": [ { "value": "2023-06-05T14:13:23.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } } ],
"id": [ { "value": "typedb-cloud", "value_type": "string", "type": { "label": "name", "root": "attribute" } } ],
"type": { "label": "repository", "root": "entity" }
}
}
{
"x": {
"created-timestamp": [ { "value": "2023-06-14T22:44:22.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } } ],
"id": [ { "value": "/vaticle/research/prototypes/nlp-query-generator.py", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "file", "root": "entity" }
}
}
{
"x": {
"created-timestamp": [ { "value": "2023-02-15T07:47:57.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } } ],
"id": [ { "value": "/vaticle", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "directory", "root": "entity" }
}
}
{
"x": {
"created-timestamp": [ { "value": "2023-06-30T04:37:33.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } } ],
"id": [ { "value": "52b1142dabd766ec2e467bc74913de41", "value_type": "string", "type": { "label": "hash", "root": "attribute" } } ],
"type": { "label": "commit", "root": "relation" }
}
}
{
"x": {
"created-timestamp": [ { "value": "2023-06-28T17:00:14.000", "value_type": "datetime", "type": { "label": "created-timestamp", "root": "attribute" } } ],
"id": [ { "value": "/vaticle/engineering/projects/typedb-3.0/technical-specification.md", "value_type": "string", "type": { "label": "path", "root": "attribute" } } ],
"type": { "label": "file", "root": "entity" }
}
}
The queries pick up the new types without modification because polymorphic queries written in TypeQL are fully declarative. As long as we write our queries in a way that accurately expresses the high-level question being asked, we do not have to update them when the schema is extended.
Summary
TypeDB allows us to write and resolve declarative polymorphic queries. When we leverage inheritance polymorphism, we do not have to enumerate the subtypes of the supertype being queried. Similarly for interface polymorphism, we do not have to enumerate the types that implement the interface being queried. And for parametric polymorphism, we do not have to enumerate the complete list of types in the schema. Because the types are never explicitly listed, queries do not have to be modified to include newly added types or exclude previously removed types. This is very different to other database paradigms where such enumeration would normally be necessary.