TypeDB Fundamentals

Streamlining clauses for schema and data migration



This article is part of our TypeDB 3.0 preview series.

When working with highly structured data, “schema migration” and “data migration” are well-known pain points for database engineers. In TypeDB 3.0 we introduce a range of TypeQL-native syntax that provides powerful tooling to address these pain points.

Extending the clause system

To begin, let us recall the clause system of TypeDB. Clauses represent operations (like retrievals or updates) that the user intends to perform “in one go”.

Clauses can contain variables, and essentially all clauses take inputs and produce outputs in the form of “concept maps”, which are mappings from variable identifiers to concepts (i.e. data instances, values, or types).

The single most powerful clause is the match clause, which is responsible for retrieving concepts from the database (and may even include conditions that express computations and comparisons of the concepts).

Besides the match clause, TypeQL 3.0 features the following schema- and data-level clauses. As you may notice, this already shows several novelties over pre-3.0 syntax, with put/redefine/update all being newly introduced. In the rest of the article, we will discuss the role of all of the below clauses!

Schema-level | Data-level     | Description
define       | insert/put (1) | Creates concepts and behaviors (2)
undefine     | delete         | Removes concepts and concept behaviors
redefine     | update         | Updates existing concepts and behaviors (3)

Table 1. Schema and data language clauses in TypeQL

(1) The latter clause, put, is a conditional version of the former clause, insert; see our pipeline fundamentals for more information.
(2) “Concepts and behaviors” refers, at the schema level, to schema types, subtyping, dependencies, and constraints, and, at the data level, to data and data inter-dependencies.
(3) This will fail if concepts to be redefined/updated do not exist!

Subjects, verbs, objects in TypeQL statements

Users of TypeDB define how data is categorized using type inheritance hierarchies, and how it interacts through dependencies using typed interfaces. This gives a “typed” perspective of classical RDF subject-verb-object (SVO) triples. Let’s explain this! For example,

(company)—(named)—(“Apple”)

could translate into a TypeQL statement:

$company has name "Apple"

In TypeDB, the verb “name” becomes an interface type that now controls who exactly “owns names”. The type system checks whether it can cast the data instance $company into a name owner, while "Apple", the object, must be data in the type of names owned by $company. In this way, TypeDB provides a clean way of “typing” core concepts of the semantic data model.
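
As a rough sketch of what sits behind that statement, the schema below declares a name attribute type and lets company own it, after which the insert type-checks. The attribute declaration form (attribute name, value string) and the choice of string as the value type are assumptions that do not come from this article, and the preview syntax may differ.

```typeql
# Assumed schema: 'company' is an entity type that implements the
# ownership interface of 'name' (declaration forms are illustrative).
define entity company;
define attribute name, value string;
define company owns name;

# Subject ($company), verb (has name), object ("Apple").
insert $company isa company, has name "Apple";
```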

When manipulating the database with subjects, verbs, and objects, ambiguity can arise as to what we are manipulating: the subject? the verb? the object? This is why in TypeDB 3.0 we depart from strictly SVO-ordered statements, and also employ statements that focus on the object or verb instead. We call these statements O/V statements. The following provides an idea of where in the above clauses these new statement forms will be used:

  • define: Uses SVO.
  • undefine: Uses O/V (with special new keyword: from). As a first brief example, we could write undefine owns name from person.
  • redefine: Uses SVO.
  • insert/put: Uses SVO.
  • delete: Uses O/V (with special new keyword: of). As a first brief example, we could write delete name of $x;.
  • update: Uses SVO.

Now, let’s see in more detail how (SVO and O/V) statements and clauses fit together.

Fitting statements to clauses

We will discuss key TypeQL statements in order. These include:

  1. Subtype statements using sub, and data-level typing using isa.
  2. Relation dependencies (via role interface types) using relates, and data-level dependencies using links.
  3. Role subtypes using relates ... as ... (no data-level counterpart).
  4. Combined attribute dependencies (via ownership interfaces) and ownership implementation using owns, as well as data-level dependencies using has.
  5. Ownership implementation overrides using owns ... as ... (no data-level counterpart).
  6. Role implementations using plays (no data-level counterpart).
  7. Role implementation overrides using plays ... as ... (no data-level counterpart).
  8. Various annotation statements at schema-level (which do not have data-level counterparts).
  9. Type labels!

Observe that, generally, interface implementations and their overrides are statements that do not have data-level counterparts, meaning they will not be relevant to data-level clauses from Table 1.

1. Subtypes and typing (sub/isa)

Typing affects both the data-level and the schema-level. For the former, note that data is always created in a specified type (its canonical type). For the latter, note that types are organized into a subtyping hierarchy. This means data may live in multiple types simultaneously (by casting it from types into super-types).

  • define: We may define, e.g., define entity person for entity types without a supertype (this is a synonym for define person sub entity but clarifies that no user-defined supertype is intended). Otherwise, we define person sub being, which declares person as a subtype of being.
  • undefine: We may undefine types, e.g., by writing undefine entity person or simply undefine person. Note: no sub is needed! We are undefining the type person itself.
  • redefine: We may write redefine person sub intelligent_being, which redefines the parent (supertype) of person.
  • insert/put: We may write insert $a isa person, which inserts a new data instance with canonical type person.
  • delete: We may delete by writing delete person $a, or simply delete $a, without specifying the type. Note: if $a is an attribute, then this requires its type to be @independent; otherwise attributes cannot be globally deleted in this way (to avoid accidental deletion for multiple owners).
  • update: For now, we cannot update the canonical types of data with an update clause, as the operation comes with subtle difficulties. Instead, users may change types with a copy-and-delete pipeline (e.g., match $old_x isa A...; insert $new_x isa A'; followed by match $old_x's data; insert $new_x's data; and delete $old_x;).
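
The clauses above can be put side by side in a minimal sketch. It only combines the statement forms listed in this section, in the preview syntax of this article; the types being and intelligent_being are assumed to be plain entity types, and each line is meant as a separate query.

```typeql
# Schema level (SVO order).
define entity being;
define entity intelligent_being;
define person sub being;
redefine person sub intelligent_being;   # person now sits under intelligent_being

# Data level.
insert $a isa person;                    # the canonical type of $a is person
match $a isa person; delete $a;          # deletes every instance bound to $a

# Schema level again: no 'sub' is needed when removing the type itself.
undefine person;
```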

2. Relation dependencies (relates/links)

Relations are dependent data, meaning in many cases we will delete a relation object if the objects it links are deleted (for example, an edge should be deleted if the nodes it links are!). The type system of TypeDB controls which data a relation instance can link via so-called role interface types, which other types can implement (see #6). Role interfaces and their members, role players, can be defined/inserted/manipulated as follows.

  • define: We may write, e.g., define marriage relates spouse, which states that the relation type marriage can link role players through the role interface spouse.
  • undefine: We may, similarly, undefine relates spouse from marriage, which removes the role interface from the relation type marriage. Note, this uses the new from keyword! Recall also that this is an O/V-ordered statement (i.e. the object/verb comes first here: relates spouse).
  • redefine: We cannot redefine role interfaces of a relation in a meaningful way: whether or not a relation depends on a role interface is a boolean, which cannot be meaningfully redefined. But see #9 for how to change the label of a role interface.
  • insert/put: We may insert, e.g., insert $m links (spouse: $a).
  • delete: Recall, we delete using O/V order: for example, we can write delete (spouse: $a) of $m. This uses the new of keyword!
  • update: We can update role players by writing update $m links (spouse: $b). Warning: this replaces all existing role players of the given role type! In our example, $b would replace all existing spouses linked by $m.
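
A minimal sketch of the relation lifecycle, using only the statement forms from the list above. The declaration define relation marriage is an assumption made by analogy with define entity person, and the match patterns are illustrative; each group is a separate query.

```typeql
# Schema: a relation type with one role interface.
define relation marriage;                # assumed form, by analogy with 'define entity person'
define marriage relates spouse;

# Data: add a role player to an existing marriage (SVO order).
match $a isa person; $m isa marriage;
insert $m links (spouse: $a);

# Data: replace ALL existing spouse role players of $m with $b.
match $b isa person; $m isa marriage;
update $m links (spouse: $b);

# Data: remove one role player (O/V order, with 'of').
match $m isa marriage; $m links (spouse: $a);
delete (spouse: $a) of $m;

# Schema: remove the role interface again (O/V order, with 'from').
undefine relates spouse from marriage;
```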

3. Role sub-interface (as)

When a relation type subtypes another relation type, we may have its role interfaces subtype the role interfaces of its relation supertype. This (schema-only) behaviour can be addressed as follows.

  1. define: We may define role sub-interface behaviour using the keyword as, e.g. by writing define hetero_marriage relates wife as spouse.
  2. undefine: If we want to completely free a sub-interface from its parents, we can write undefine as spouse from hetero_marriage relates wife.
  3. redefine: If we want to sub-interface a different role of the parent relation, then we can write redefine hetero_marriage relates wife as primary_spouse (assuming marriage was defined with a role interface primary_spouse).
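
Read as a sequence, these three steps might look as follows. This is a sketch in the article's preview syntax; it assumes that marriage already relates both spouse and primary_spouse, and the final undefine applies the from form to the retargeted parent role, which is an extrapolation from step 2.

```typeql
# Assumed context: marriage relates spouse; marriage relates primary_spouse.
define hetero_marriage sub marriage;

# wife becomes a sub-interface of spouse.
define hetero_marriage relates wife as spouse;

# Retarget wife to a different parent role.
redefine hetero_marriage relates wife as primary_spouse;

# Free wife from its parent role altogether (O/V order).
undefine as primary_spouse from hetero_marriage relates wife;
```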

4. Attribute dependencies and ownership implementation (owns)

Similarly to relations depending on the objects they link, attributes depend on the objects they are “owned by”; in other words, by default, if an attribute’s (sole) owner gets deleted then we delete the attribute as well! The type system controls which types of attributes can be owned by which types of objects through ownership interface types that other types can implement.

  1. define: We may define person owns name which means that person implements the ownership interface of name; simply put, person instances are now valid owners of name instances!
  2. undefine: To remove an ownership implementation, write undefine owns name from person. Note again: this uses an O/V statement!
  3. redefine: We cannot redefine interface implementations since these are boolean truth values, which cannot be meaningfully redefined. But see #9 for how to change the label of an attribute type.
  4. insert/put: Inserting attribute instances works as before: insert $person has name "Marry". This makes $person the owner object of a name attribute "Marry".
  5. delete: Deleting now uses O/V order: delete $name of $person;. The statement may optionally also carry a type, as in delete name $name of $person, in which case we may even suppress the instance altogether and simply write delete name of $person.
  6. update: Updating a name can now be done using update $person has name "Merry". Note the type is absolutely needed in this case!
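
Put together, the ownership lifecycle could be sketched like this. It reuses the statements from the list above in the article's preview syntax; the match patterns are illustrative and would bind every person, and each group is a separate query.

```typeql
# Schema: person instances become valid owners of name instances.
define person owns name;

# Data: attach a name (SVO order).
match $person isa person;
insert $person has name "Marry";

# Data: replace the name; the attribute type 'name' must be given.
match $person isa person;
update $person has name "Merry";

# Data: remove the name without binding the attribute instance (O/V order).
match $person isa person;
delete name of $person;

# Schema: remove the ownership implementation (O/V order).
undefine owns name from person;
```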

5. Ownership implementation overrides (as)

When subinterfaces are available, overrides of interface implementations in TypeDB are a kind of constraint: they constrain a subtype to only ever use the subinterface, while the parent type implements the superinterface as well. Ok, this was quite abstract—let’s make it more concrete!

  1. define: We can define an ownership override using define child owns first_name as name. This assumes child sub person and person owns name. As a result, child objects will only be able to be owners of first_names, i.e. we cannot insert $some_child has name "John Doe".
  2. undefine: We may remove an ownership interface override by writing undefine as name from child owns first_name. As a result, child objects could now be designated owners of both name and first_name instances (note, though, that super-attributes are required to be abstract).
  3. redefine: We cannot redefine interface overrides, as they are boolean truth values.
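
A small sketch of the override from steps 1 and 2, in the article's preview syntax. The subtype declaration first_name sub name is an assumption, and the abstractness requirement on super-attributes mentioned above is left out for brevity.

```typeql
# Assumed context.
define first_name sub name;
define child sub person;
define person owns name;

# Override: child may only own first_name, not plain name.
define child owns first_name as name;

# With the override in place, this insert would be rejected:
#   insert $some_child isa child, has name "John Doe";

# Remove the override again (O/V order); child inherits 'owns name' once more.
undefine as name from child owns first_name;
```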

6. Role implementation (plays)

Similarly to ownership interface implementations, object types (i.e. entity and relation types) may implement role interfaces.

  1. define: We can write define person plays marriage:spouse to ensure that person objects are valid for the spouse role.
  2. undefine: Using O/V order, we may write undefine plays marriage:spouse from person.
  3. redefine: We cannot redefine interface implementations since these are boolean truth values, which cannot be meaningfully redefined. But see #9 for how to change the label of a role interface.
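
To see what a role implementation buys us, here is a minimal sketch in the article's preview syntax. It assumes marriage relates spouse is already defined; the insert in the middle is only accepted while person plays marriage:spouse holds.

```typeql
# Schema: person objects become valid spouse role players.
define person plays marriage:spouse;

# Data: linking a person as spouse now type-checks.
match $a isa person; $m isa marriage;
insert $m links (spouse: $a);

# Schema: remove the role implementation (O/V order).
undefine plays marriage:spouse from person;
```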

7. Role implementation overrides (plays)

Again in analogy with ownership implementation overrides, role implementations may be overridden: the effect is the same, in that overrides constrain a subtype to not make use of the interface inherited from its parent type.

  1. define: We can use define hetero_man plays hetero_marriage:husband as marriage:spouse to define a role implementation override. This assumes hetero_man sub person and person plays marriage:spouse.
  2. undefine: As before, we can use undefine as marriage:spouse from hetero_man plays hetero_marriage:husband to remove the override. As a result, hetero_man instances may now also be inserted as marriage:spouse role players.
  3. redefine: We cannot redefine interface overrides, as they are boolean truth values.
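
The full context for this override can be sketched as follows, in the article's preview syntax. The role husband and its declaration as a sub-interface of spouse are assumptions needed to make the example self-consistent.

```typeql
# Assumed context.
define hetero_man sub person;
define person plays marriage:spouse;
define hetero_marriage sub marriage;
define hetero_marriage relates husband as spouse;

# Override: hetero_man may play husband, but no longer plain spouse.
define hetero_man plays hetero_marriage:husband as marriage:spouse;

# Remove the override (O/V order); hetero_man can play marriage:spouse again.
undefine as marriage:spouse from hetero_man plays hetero_marriage:husband;
```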

8. Annotations

Besides overrides, TypeQL 3.0 brings a large variety of new annotations. Importantly, all these annotations follow the same principles. Firstly, each statement can have at most one annotation of a given kind (i.e. we disallow, e.g., person owns name @card(1..2) @card(1..3)). Secondly, annotations can be addressed using the following definition language.

  1. define: We define annotations as part of the usual definition of statements, e.g., define person owns name @card(1..2).
  2. undefine: We remove annotations with O/V statements of the form undefine @card(1..2) from person owns name.
  3. redefine: Most annotations can be meaningfully redefined, e.g., redefine person owns name @card(1..3) replaces the previous annotation of @card(1..2).
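
As a sequence, the annotation lifecycle might look like this. It is a sketch in the article's preview syntax; the final undefine names the redefined bounds, which assumes the annotation must be quoted as it currently stands.

```typeql
# Attach a cardinality annotation to the ownership.
define person owns name @card(1..2);

# Replace it; only one @card per statement is allowed.
redefine person owns name @card(1..3);

# Remove the annotation, leaving 'person owns name' in place (O/V order).
undefine @card(1..3) from person owns name;
```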

9. Labels and aliases

Labels are a new feature of TypeDB 3.0. They provide aliases for types, which allow a more flexible approach to querying, schema, and data migration. Every type has a primary label, which is the label it was originally introduced with. A non-primary label is also called an alias.

  1. define: Additional (non-primary!) labels are defined using define person alias personne. Now our database is multi-lingual! Note: person can be either the primary label or any alias of the type to which we want to add an additional alias.
  2. undefine: Aliases can be undefined using undefine alias personne from person. Again this uses O/V order. And again, person can be either the primary label or any alias of the type from which we want to remove the additional alias.
  3. redefine: The primary label of a type can be redefined using redefine old_type label new_type. Aliases cannot be redefined, as they either exist or not.
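
A short sketch of how labels and aliases could be used during a rename, in the article's preview syntax. Using an alias inside a match pattern is an assumption based on the statement that aliases make querying more flexible, and the new label human is purely illustrative.

```typeql
# Add an alias to the type whose primary label is 'person'.
define person alias personne;

# Assumption: an alias can be used wherever a label is expected.
match $p isa personne;

# Remove the alias again (O/V order).
undefine alias personne from person;

# Change the primary label of the type.
redefine person label human;
```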

Powerful migrations

The design of TypeDB’s migration language using update and redefine statements closely follows the fundamental principles of its type system. Taken together, this allows for powerful migration operations in extremely simple syntax. For example,

match $T sub! person;
redefine $T sub humanoid;

moves all subtypes of person under a new supertype, humanoid. And

match person owns $T;
define humanoid owns $T;

copies all attribute ownerships from person to humanoid, so that humanoid owns every attribute type that person owns. That’s simple and intuitive!

