Pattern matching essentials

A pattern is a set of statements.

Every statement ends with a semicolon and consists of the following:

  • variables,

  • keywords,

  • types,

  • values.

TypeQL statements are the smallest meaningful building blocks of a query. The typical statement structure is described below.

Statement structure
  • A TypeQL variable is prefixed with a dollar sign $ (for concept variables) or a question mark ? (for value variables).

    Most statements start with a variable (marked with V in the statement structure diagram) that references a concept or a value, one per solution found. For more information on variables, see the Variables section.

  • A variable is followed by a comma-separated list of constraints (p1, p2, p3) describing the concepts the variable refers to.

  • The end of the statement is marked with a semicolon (;).
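
As a minimal sketch of this structure, the statement below matches a person entity that has a full-name attribute with the value Masako Holley and any email. The variable labels $p and $e are illustrative assumptions; the types person, full-name, and email are the ones used in the examples of this section.

```typeql
# V is the variable $p; p1, p2, p3 are the three comma-separated constraints.
$p isa person, has full-name "Masako Holley", has email $e;
```

Here `isa person` is the first constraint, `has full-name "Masako Holley"` the second, and `has email $e` the third; the semicolon closes the statement.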

Solving a pattern as a system of equations

A pattern can be thought of as a system of equations where every equation is a single statement.

TypeDB solves the system to find all solutions. Every solution includes a concept (a type or an instance of data) for every concept variable and a value for every value variable.

The result of a match clause with any pattern is the set of solutions found, so the size of the set equals the number of solutions. Hence, it can be:

  • Zero — No solutions found (nothing matched the pattern). Empty set returned.

  • One — Exactly one solution found and returned in a set.

  • Many — Multiple solutions found, including all possible permutations. All of them returned in a set.

For example, in the statement illustrated above, TypeDB will find all solutions that include a person entity that has a full-name attribute with the value Masako Holley and has any email.

What if no person entity has this particular name in a database? Then a set of zero solutions (answers) will be returned, regardless of the emails.

What if exactly one person has that full-name attribute but owns two email attributes? Then TypeDB finds two solutions (answers), each containing that one person entity and one of the two emails.

A match clause returns all possible solutions (combinations of variable values that make the pattern true), including all possible permutations. Understanding how a match clause matches patterns is vital, especially when it is part of an insert or a delete query, because the insert or delete clause is executed once for every result returned by the match clause.

For example, suppose we match all instances of two types without any other constraints. If one of the types has two instances and the other has ten, the match clause returns every combination, including all permutations: 2 × 10 = 20 results in total. The first instance of the first type is paired with each of the ten instances of the second type, and so is the second instance.
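
A sketch of such an unconstrained match, assuming for illustration that the two types are person and file (both appear elsewhere in this section) and that the database holds two person instances and ten file instances:

```typeql
match
  # No statement relates $p to $f, so every person is paired with every file.
  $p isa person;
  $f isa file;
```

Under those assumptions the match returns 20 answers. If an insert or delete clause followed this match, it would be executed 20 times, once per answer.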

Variables

There are two types of variables:

  • Concept variable — used by default in most queries. It references a concept (an instance of data or a type).

  • Value variable — used only in computation, like arithmetic operations and other built-in functions requiring direct operation on values.

A concept variable starts with a dollar sign $ followed by a variable label (for example, $x) and references exactly one concept (a type or an instance of data) in every answer (solution) found for a pattern.

A value variable starts with a question mark ? followed by a variable label (for example, ?x) and references exactly one value in every answer (solution) found for a pattern. See Value variables below.
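
The two kinds of variables can be sketched side by side. The labels $t, $f, $s, and ?bytes are illustrative assumptions; file and size-kb are types used in the examples of this section.

```typeql
match
  $t sub file;                 # $t: concept variable bound to a type
  $f isa $t, has size-kb $s;   # $f, $s: concept variables bound to instances of data
  ?bytes = $s * 1024;          # ?bytes: value variable bound to a computed value
```

Every answer to this pattern contains one concept for each of $t, $f, and $s, and one value for ?bytes.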

Comparison operators

The following operators are supported for comparing attribute values: ==, !=, >, >=, <, and <=.

In TypeDB version 2.18.0, the = sign as a comparison operator was deprecated, as it is now being used to assign values to value variables.

We recommend using == for comparison instead.

The old syntax, for example:

$p = $u;

will be supported for backwards compatibility for a limited time (only when a concept variable is to the left of the = sign).

It will be removed from the TypeQL syntax in later versions of TypeDB.
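
As a brief sketch of the recommended operators, the pattern below compares attribute values using >= and !=. The variable labels and the threshold 1024 are illustrative assumptions.

```typeql
match
  $f isa file, has size-kb $s;
  $s >= 1024;                 # keep only files of at least 1024 KB
  $p isa person, has full-name $n;
  $n != "Masako Holley";      # value comparison with !=; use == to test for equality
```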

Value variables

In addition to the default concept variables, which reference concepts inside a TypeDB database, there is a special kind of variable used only for computation: the value variable.

Instead of the dollar sign (e.g., $p), value variables use a question mark (e.g., ?x) before the variable label.

Instead of concepts, value variables represent exact values in a pattern.

To assign a value to a value variable, use the = sign with the value variable on the left and an expression on the right. Value variables are never materialized permanently and exist only within the scope of a particular query or rule.

To persist the value of a value variable, store it in an attribute whose type has a matching value type. A value variable can have one of the following value types (the same as the value types of attributes):

  • long

  • double

  • boolean

  • string

  • datetime

An expression describes how the value of a value variable is computed. It can contain any combination of the following elements:

  • a constant set in the query (e.g., ?x = 4),

  • the value of a concept variable (which must be bound to an attribute to have a value) or of a value variable bound in the query,

  • an arithmetic operation,

  • a built-in function.

An example of a query with an expression:
match
  $s isa size-kb;
  ?x = round($s/2) + 1;

The query above finds all instances of the size-kb attribute type and binds each of them to the concept variable $s. For the value variable ?x, it divides the value of the attribute in $s by the constant 2, rounds the quotient, and adds 1. Hence, every result of this query consists of $s and ?x:

  • $s is an attribute of the size-kb type,

  • ?x is the result of the computation described in the query, which depends on the attribute’s value.

Computation

Arithmetic operations

The following operators can be used in arithmetic operations on value variables, values of attributes held in concept variables, or constants:

  1. (): parentheses.

  2. ^: exponentiation (power).

  3. *: multiplication.

  4. /: division.

  5. %: modulo. Returns the remainder of a division.

  6. +: addition.

  7. -: subtraction.

The list above is sorted by the order in which the operations are applied in an expression.

Example:
$f isa file, has size-kb $s;
?mb = $s/1024;
?mb > 1;

The pattern above finds instances of the file type that own a size-kb attribute whose value, divided by 1024 (to convert kilobytes to megabytes), is greater than 1.
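
The order of operations listed above can be seen in a sketch like the following. The value variable labels ?a, ?b, and ?c are illustrative assumptions.

```typeql
match
  $f isa file, has size-kb $s;
  ?a = $s + 2 * 3;     # multiplication is applied before addition: $s + 6
  ?b = ($s + 2) * 3;   # parentheses are applied first: ($s + 2), then * 3
  ?c = 2 * $s ^ 2;     # exponentiation is applied before multiplication: 2 * ($s ^ 2)
```

For an attribute value of 4, this ordering gives ?a = 10 and ?b = 18, while ?c is twice the square of 4.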

Built-in functions

Built-in functions are invoked with parentheses containing their arguments, separated by commas. The following built-in functions are available in TypeDB:

  • min: the minimum of the arguments.

  • max: the maximum of the arguments.

  • floor: the floor function (rounding down).

  • ceil: the ceiling function (rounding up).

  • round: the default rounding function (halves are rounded up).

  • abs: the absolute value (modulus) function.
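
A sketch that calls each of the functions above on the value of a size-kb attribute. The value variable labels and the constants are illustrative assumptions.

```typeql
match
  $f isa file, has size-kb $s;
  ?mb-down = floor($s / 1024);   # megabytes, rounded down
  ?mb-up = ceil($s / 1024);      # megabytes, rounded up
  ?mb-near = round($s / 1024);   # megabytes, rounded to the nearest integer
  ?capped = min($s, 1024);       # the smaller of the two arguments
  ?at-least = max($s, 1);        # the larger of the two arguments
  ?distance = abs($s - 1024);    # absolute difference from 1024
```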

Combining statements

By combining statements, we can express more complex patterns and match the corresponding data.

  • Statement: A single basic building block, as explained above.

  • Conjunction (logical AND): A set of statements that must all be true to satisfy a match.
    Statements separated by semicolons ; form a conjunction by default.

  • Disjunction (logical OR): A set of statements of which at least one must be true to satisfy a match.
    We form disjunctions by enclosing the alternative statements in curly braces {} and joining the blocks with the keyword or.

  • Negation (logical NOT): A statement that explicitly defines conditions that must not be met.
    We form negations by placing the conditions that must not be met inside the curly braces of a not {}; block.
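
The three ways of combining statements can be sketched in a single short pattern. The variable labels and the size thresholds are illustrative assumptions; the types are the ones used in the examples of this section.

```typeql
match
  # Conjunction: both statements must be true.
  $p isa person, has full-name $n;
  $f isa file, has size-kb $s;
  # Disjunction: at least one of the two blocks must be true.
  { $s < 100; } or { $s > 1024; };
  # Negation: the enclosed condition must not be met.
  not { $n == "Masako Holley"; };
```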

See the complex example below.

Complex example

The pattern is a conjunction of five parts:

  1. Conjunction 1 declares variables for two person instances, their full-names, an action, and a file with the path README.md, and specifies their types.

  2. Disjunction specifies that the action of interest is either modify_file or view_file.

  3. Negation 1 specifies that person $p1 must not have a full-name with the value Masako Holley.

  4. Negation 2 specifies that person $p2 must not have a full-name with the value Masako Holley.

  5. Conjunction 2 specifies that the file has an access with the action specified earlier, and that both person instances have a permission to that access.

In short, the above example finds pairs of people who both have permission to access the same file with the path README.md. The pattern additionally requires the action to be either modify_file or view_file, and neither person to have the full-name Masako Holley.
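
A pattern with the five parts described above might look like the following sketch. Only $p1, $p2, the types person, file, full-name, path, access, and permission, and the literal values come from the description; the name attribute of the action, the role names object, action, subject, and access, and all other variable labels are assumptions made for illustration and may differ from the actual schema.

```typeql
match
  # Conjunction 1: two persons, their full-names, an action, and the file.
  $p1 isa person, has full-name $n1;
  $p2 isa person, has full-name $n2;
  $a isa action, has name $an;          # "name" is an assumed attribute type
  $f isa file, has path "README.md";
  # Disjunction: the action is either modify_file or view_file.
  { $an == "modify_file"; } or { $an == "view_file"; };
  # Negation 1
  not { $n1 == "Masako Holley"; };
  # Negation 2
  not { $n2 == "Masako Holley"; };
  # Conjunction 2: the access to the file and both permissions (role names assumed).
  $ac (object: $f, action: $a) isa access;
  $pe1 (subject: $p1, access: $ac) isa permission;
  $pe2 (subject: $p2, access: $ac) isa permission;
```

As sketched, nothing prevents $p1 and $p2 from matching the same person, and each pair is returned in both orders; adding not { $p1 is $p2; }; would exclude the first case.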