TypeDB Fundamentals
Constraint language for advanced data modeling
In this article we discuss the constraint language of TypeDB’s functional database programming model, including constraints that can be placed on subtype terms, type sizes, and type dependencies.
At a glance: syntax essentials
In a nutshell, the constraint language of TypeDB 3.0 will include the following new syntactic constructs:
- @values(v1, v2, ...) restricts a given value type to the specific values v1, v2, … . For example, the definition color value string @values("red", "blue", "green"); constrains the color attribute to only have the values "red", "blue", or "green".
- @card(n..m) can be used when relating role players or owning attributes: it ensures that the cardinality of a relation’s roles or an owner’s attributes lies in the range n, ..., m. Note that the upper bound m may be omitted.
- @cascade is used for marking relation types as dependent on their role players and enabling cascading deletes. There is also a query-level version of the annotation, which allows overriding the schema behavior.
- @independent is used for marking attribute types as independent of their owners and preventing cascading deletes (i.e. we keep attributes in the database even when their owners get deleted).
- The update clause is particularly useful when updating data of cardinality at most 1, as we’ve seen in the example above.
- @distinct ensures distinctness of elements for role-player or attribute lists. For example, acyclic_path relates node[] @distinct ensures no acyclic_path contains a node twice.
- In addition to @key and @unique, we provide an annotation @subkey(LABEL), which makes an owned attribute part of a LABEL-named joint key across (potentially) multiple attributes.
Now, let’s see some examples!
Enums via value restrictions
The new @values annotation constrains attribute types to only have instances with certain values. For example, consider the following schema-level definition:
define status sub attribute, value string @values("SUCCESS", "FAILURE");
This defines a status attribute type with string value type, but constrains all of its attribute instances to be either the string "SUCCESS" or the string "FAILURE". Thus, when we try running the following query
insert $server isa server, has status "HAPPY";
TypeDB will return an error, since "HAPPY" is not among the specified values "SUCCESS", "FAILURE".
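The check that rejects "HAPPY" can be sketched in a few lines. This is a toy Python model of the semantics described above, not TypeDB's implementation; the names check_values, ValueRestrictionError, and STATUS_VALUES are hypothetical.

```python
class ValueRestrictionError(ValueError):
    """Raised when a value falls outside the permitted set (hypothetical name)."""


def check_values(attribute_type, value, allowed):
    """Return `value` if it is in `allowed`; otherwise raise ValueRestrictionError."""
    if value not in allowed:
        raise ValueRestrictionError(
            f"{value!r} is not a permitted value for {attribute_type}; "
            f"expected one of {sorted(allowed)}"
        )
    return value


# Mirrors @values("SUCCESS", "FAILURE") on the status attribute type.
STATUS_VALUES = frozenset({"SUCCESS", "FAILURE"})

check_values("status", "SUCCESS", STATUS_VALUES)  # accepted

try:
    check_values("status", "HAPPY", STATUS_VALUES)
except ValueRestrictionError as err:
    print(f"rejected: {err}")
```

The point of the sketch is that the restriction belongs to the attribute type, so every insert of a status value passes through the same check.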
Cardinality
Cardinality constrains the number of elements in our types: the annotation @card(n..m) says that a type will have between n and m elements (m ≥ 1).
In TypeDB’s data model, cardinality constraints apply either to the number of players of a role in a specific relation, or to the number of attributes of a specific owner. The upper bound can be left open, as in @card(0..), to indicate that a type can have “arbitrarily many” elements.
Let’s see the three cases where cardinality can apply.
- The definition define marriage relates spouse @card(2..2); means that each marriage instance will have exactly 2 objects playing the role of spouse. If we were to omit the annotation then, by default, define a relates b; will be interpreted with cardinality @card(1..1). Also: as a universal rule, any relation instance must have at least one role player (even when using @card(0..m); orphans will be auto-deleted)!
- The definition define person owns middle_name @card(0..); means that a person instance can have any number of middle_names. By default, define a owns b; will be interpreted with cardinality @card(0..1).
- The definition define person plays marriage:spouse @card(0..1); means that each person can be in at most one marriage relation. By default, define a plays b will be interpreted with cardinality @card(0..).
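To make the range notation and the three defaults concrete, here is a minimal Python sketch that parses an annotation of the form @card(n..m), with the upper bound optionally omitted, and checks a count against it. It models only the semantics stated above; Card, parse_card, and DEFAULTS are hypothetical names, and the parser is not TypeQL's grammar.

```python
import re
from typing import NamedTuple, Optional


class Card(NamedTuple):
    """A cardinality range; `upper` is None when the upper bound is omitted."""

    lower: int
    upper: Optional[int]

    def allows(self, count: int) -> bool:
        """Return True if `count` lies within the range."""
        return count >= self.lower and (self.upper is None or count <= self.upper)


def parse_card(text: str) -> Card:
    """Parse '@card(n..m)' or '@card(n..)' into a Card."""
    match = re.fullmatch(r"@card\((\d+)\.\.(\d*)\)", text)
    if match is None:
        raise ValueError(f"not a cardinality annotation: {text!r}")
    lower = int(match.group(1))
    upper = int(match.group(2)) if match.group(2) else None
    if upper is not None and upper < lower:
        raise ValueError(f"upper bound {upper} is below lower bound {lower}")
    return Card(lower, upper)


# Defaults as listed above, applied when no annotation is given.
DEFAULTS = {
    "relates": parse_card("@card(1..1)"),
    "owns": parse_card("@card(0..1)"),
    "plays": parse_card("@card(0..)"),
}

print(parse_card("@card(2..2)").allows(2))
print(DEFAULTS["owns"].allows(2))
```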
Violating cardinality
When inserting and deleting data, cardinality constraints come into play as follows.
- During an insert, violating the upper cardinality bound of either played roles or owned attributes will throw an error (e.g., trying to add a 3rd spouse to an existing marriage will throw an error).
- During a delete, violating the lower cardinality bound of played roles or owned attributes will throw an error (e.g., trying to delete one of the two spouses of an existing marriage will throw an error). For relations, this behavior is overridden by the @cascade annotation, as we will learn shortly.
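The two rules above can be sketched as a toy Python model in which inserts are checked against the upper bound and deletes against the lower bound. This illustrates the described behavior only; the class Role and the exception CardinalityError are hypothetical and do not correspond to any TypeDB API.

```python
class CardinalityError(Exception):
    """Raised when an operation would break a cardinality bound (hypothetical name)."""


class Role:
    """One role of one relation instance, bounded like @card(lower..upper)."""

    def __init__(self, name, lower, upper=None):
        self.name = name
        self.lower = lower
        self.upper = upper  # None means no upper bound
        self.players = []

    def insert(self, player):
        # Inserts are checked against the upper bound.
        if self.upper is not None and len(self.players) >= self.upper:
            raise CardinalityError(f"{self.name}: at most {self.upper} players allowed")
        self.players.append(player)

    def delete(self, player):
        # Deletes are checked against the lower bound.
        if player not in self.players:
            raise KeyError(player)
        if len(self.players) - 1 < self.lower:
            raise CardinalityError(f"{self.name}: at least {self.lower} players required")
        self.players.remove(player)


# marriage relates spouse @card(2..2)
spouse = Role("spouse", lower=2, upper=2)
spouse.insert("alice")
spouse.insert("bob")

for operation, player in ((spouse.insert, "carol"), (spouse.delete, "alice")):
    try:
        operation(player)
    except CardinalityError as err:
        print(f"rejected: {err}")
```

Both rejected operations leave the existing two spouses untouched.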
Cardinality and distinctness for lists
Cardinality constraints also apply to lists (recall: lists can be used exactly when modeling relation roles and attribute ownerships). This applies to two cases:
- action_flow relates item[] @card(1..10); means that each action_flow relation instance will have between 1 and 10 ordered items. By default, define a relates b[] will be interpreted with cardinality @card(1..).
- define person owns middle_name[] @card(0..5); means that each person can have up to 5 ordered middle_names. By default, define a owns b[] will be interpreted with cardinality @card(0..).
In both cases, we may further add the constraint @distinct, which constrains lists to not have duplicates.
As mentioned earlier, “empty lists” in TypeDB are the same as no lists at all (i.e. variables in patterns cannot match empty lists). With that in mind, insert and delete clauses work just as in the previous section.
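The list constraints from this section combine a length check with an optional duplicate check. A minimal Python sketch, assuming list elements are hashable; check_list is a hypothetical helper and not part of TypeDB.

```python
def check_list(items, lower, upper=None, distinct=False):
    """Return a list of violation messages; an empty result means the list is valid."""
    problems = []
    if len(items) < lower:
        problems.append(f"expected at least {lower} elements, got {len(items)}")
    if upper is not None and len(items) > upper:
        problems.append(f"expected at most {upper} elements, got {len(items)}")
    if distinct and len(set(items)) != len(items):
        problems.append("list contains duplicates")
    return problems


# acyclic_path relates node[] @distinct, with the default list cardinality of 1 or more
print(check_list(["a", "b", "c"], lower=1, distinct=True))
print(check_list(["a", "b", "a"], lower=1, distinct=True))

# person owns middle_name[] @card(0..5)
print(check_list(["x"] * 6, lower=0, upper=5))
```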
Cascades, independence, and updates
The cardinality behavior outlined above can be modified with the following set of further annotations.
Cascade at Schema-level
Relation types can be annotated with @cascade to control what happens when their role players are deleted. For example, the definition marriage sub relation @cascade; would modify the delete behavior described earlier: instead of throwing an error when a spouse is deleted from an existing marriage, TypeDB will delete the entire marriage.
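The difference between the default and the cascading behavior can be sketched as follows. This is a toy Python model under the semantics described above; delete_player and CardinalityError are hypothetical names, and relations are represented simply as a dictionary from a relation id to its list of role players.

```python
class CardinalityError(Exception):
    """Raised when a delete would break a lower cardinality bound (hypothetical name)."""


def delete_player(relations, relation_id, player, lower, cascade=False):
    """Remove `player` from a relation, cascading or failing if the lower bound breaks."""
    players = relations[relation_id]
    if player not in players:
        raise KeyError(player)
    if len(players) - 1 >= lower:
        players.remove(player)
    elif cascade:
        # With cascade, the whole relation instance is deleted instead.
        del relations[relation_id]
    else:
        raise CardinalityError(f"{relation_id}: at least {lower} players required")


relations = {"m1": ["alice", "bob"], "m2": ["carol", "dave"]}

try:
    delete_player(relations, "m1", "alice", lower=2)
except CardinalityError as err:
    print(f"rejected: {err}")

delete_player(relations, "m2", "carol", lower=2, cascade=True)
print(sorted(relations))
```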
Independent attributes
The @cascade annotation does not apply to attribute ownership. This makes sense: when we delete an attribute of an entity, this rarely implies the entity should be deleted! But, by default, we will cascade in the opposite direction: i.e., if an attribute has no more owners, then TypeDB will remove it.
This default behavior can be overridden using the annotation @independent. For example, defining name sub attribute @independent, value string; will ensure that even if no more owners of a given name remain in the database, names will still remain as independent elements of the name type.
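As a rough illustration of this clean-up rule, the following Python sketch keeps an attribute if it still has owners or if its type is marked independent. The function remove_orphans and the data layout (a dictionary from (type, value) pairs to sets of owner ids) are assumptions made for this example, not TypeDB internals.

```python
def remove_orphans(ownerships, independent_types):
    """Return the attributes that survive once ownerless attributes are removed."""
    return {
        attribute: owners
        for attribute, owners in ownerships.items()
        if owners or attribute[0] in independent_types
    }


ownerships = {
    ("name", "Alice"): set(),         # no owners left, but name is @independent
    ("email", "a@example.org"): set(),  # no owners left, removed by default
    ("name", "Bob"): {"person-1"},    # still owned
}

surviving = remove_orphans(ownerships, independent_types={"name"})
print(sorted(surviving))
```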
Updating card(1..1) data
Cardinality interacts with the update clause: any played role or owned attribute with an upper cardinality bound of 1 can be updated. For example, given
define
ownership sub relation,
relates owner @card(1..1),
relates owned @card(0..);
then we could write a query of the form
match
$o isa ownership;
$king isa person, has name "King";
update
$o links (owner: $king);
As a result, $king will become the owner in every ownership.
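The effect of this update can be sketched in Python: because the owner role has an upper bound of 1, there is exactly one player to replace in each relation. This is a toy model only; update_role, UpdateError, and UPPER_BOUNDS are hypothetical names, and each relation is represented as a dictionary from role name to list of players.

```python
class UpdateError(Exception):
    """Raised when a role cannot be updated unambiguously (hypothetical name)."""


def update_role(relation, role, new_player, upper_bounds):
    """Replace the single player of `role`; only allowed when its upper bound is 1."""
    if upper_bounds[role] != 1:
        raise UpdateError(f"role {role!r} does not have an upper bound of 1")
    relation[role] = [new_player]


# relates owner @card(1..1), relates owned @card(0..)
UPPER_BOUNDS = {"owner": 1, "owned": None}

ownerships = [
    {"owner": ["alice"], "owned": ["car"]},
    {"owner": ["bob"], "owned": ["house", "boat"]},
]

# Corresponds to running the update once for every matched ownership.
for ownership in ownerships:
    update_role(ownership, "owner", "King", UPPER_BOUNDS)

print(ownerships)
```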
Updating card(m..) data and lists
Let’s try running the following query instead:
match
$o isa ownership;
$peasant isa person, has name "Peasant";
$freedom isa public_property, has description "Freedom of thought";
update
$o links (owner: $peasant, owned: $freedom);
Here, an error would be returned, since this update clause is equivalent to
update
$o links (owner: $peasant);
$o links (owned: $freedom);
The last line contradicts the rule that update can only be used in card(1..1) situations (i.e. when we know what to update). This is not the case for the owned role: the ownership $o could in theory link to many owned objects. In this case, we would not know which one of those owned objects is to be replaced by $freedom.
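The ambiguity can be made explicit with a small Python sketch that validates every role in an update before changing anything. Rejecting the whole clause up front, rather than applying the owner change first, is an assumption of this sketch; apply_update, UpdateError, and UPPER_BOUNDS are hypothetical names.

```python
class UpdateError(Exception):
    """Raised when an update targets a role with no unique player (hypothetical name)."""


def apply_update(relation, assignments, upper_bounds):
    """Replace role players, but only if every targeted role has an upper bound of 1."""
    ambiguous = [role for role in assignments if upper_bounds[role] != 1]
    if ambiguous:
        raise UpdateError(f"cannot tell which player to replace for roles: {ambiguous}")
    for role, player in assignments.items():
        relation[role] = [player]


# relates owner @card(1..1), relates owned @card(0..)
UPPER_BOUNDS = {"owner": 1, "owned": None}

ownership = {"owner": ["king"], "owned": ["castle", "crown"]}

try:
    apply_update(ownership, {"owner": "peasant", "owned": "freedom"}, UPPER_BOUNDS)
except UpdateError as err:
    print(f"rejected: {err}")

print(ownership)
```

Since the owned role may hold several players, the sketch has no way to decide whether "castle" or "crown" should give way, which is exactly the situation the error guards against.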
Importantly, the situation changes when updating list data. In this case, update can always be used, since there is a unique list to be updated. We have seen an example of this earlier!
The future
TypeDB’s new constraint language dramatically enhances the expressivity of its underlying typed data model, and addresses many important use-cases for our users. Now, we would like to hear from you: which constraints do you think will be most useful to you? And: which other constraints should we add in the future?