TypeDB Fundamentals

Handling data series and serialized query results with lists



This article is part of our TypeDB 3.0 preview series. Sign up to our newsletter to stay up-to-date with future updates and webinars on the topic!

The ability to work with lists is a key part of functional programming, and it naturally fits into TypeDB’s functional database programming model. In fact, the introduction of lists to TypeDB’s type system addresses two database-specific issues at once:

  1. Order is a ubiquitous feature of data: series of measurements, ledgers of records, paths of nodes, and compositions of relations are all common examples of ordered data. In TypeDB 3.0, lists provide an ordered counterpart to the unordered set semantics of bare types, allowing us to encode data series.
  2. We often want to refer to, or iterate through, the results of some query from within another query; having ordered access to these results matters when data is sorted or we want to compare results in-order. In TypeDB 3.0, we can aggregate queries in lists, thereby serializing their results, and work with them from within another query.

At a glance: List syntax essentials

List variables are always introduced together with their list type, and all list types are consistently indicated using the suffix []. For example: bool[] is the type of lists of booleans, names[] the type of lists of names, and $t[] is the type of lists of $ts where $t is a variablized type! Lists come with a range of new syntactic expressions, including:

  • Constructing lists from elements as [a, ..., z].
  • Accessing members at an index [i] and forming ranged sublists [i:j].
  • Binding a variable to be a member of a list using the keyword in.
  • Aggregating streams into lists using the list reduction.
  • Computing a list’s length with length($list).
  • Concatenating lists with the operator +.

Now, let’s see some examples!

Lists of roleplayers

We’ve already met lists of objects when we worked out how to query for multi-leg flights. Multi-leg flights were precisely lists of flights where each flight connected to the next one. We may abstract such a construction at the schema level by defining types that relate lists of (roleplaying) objects. As a simple example, let us define an action_flow relation, which collects lists of actions for our company.

define 
  action_flow sub relation, relates item[];
  customer_action sub entity, plays action_flow:item;

Note that, as remarked earlier, all list types are indicated using the suffix []. So our action_flow directly relates a list type, namely the type of lists of items, where item itself is called a role type of action_flow. In the second line, we define that each customer_action can play the role action_flow:item. This means that the item lists of an action_flow can contain customer_action objects.

Inserting lists

Let’s see how this works in action by inserting some data:

insert
   $visit isa customer_action, has info "customer visits website";
   $register isa customer_action, has info "customer registers";
   $subscribe isa customer_action, has info "customer subscribes";
   $pay isa customer_action, has info "customer pays us!";
   (item[]: [$visit, $register, $subscribe]) isa action_flow; 

The above inserts a new action flow, which links together the ordered sequence of three action flow items: $visit, $register, and $subscribe. Note that to create the required list, we use the list constructor syntax [e1,e2,...] which constructs a list from given elements e1, e2, ....

Reading and formatting lists

Now that we have list data, we can read it from our database. In the simplest form this could be achieved by:

match 
  $flow isa action_flow, links (item[]: $flow_items);
fetch
  $flow as "flow_title": title;
  $flow_items: info as "item_info";

This matches all $flows and the ordered $flow_items they link together, and outputs one JSON document per flow. Each document gives the flow's title (under the key "flow_title") and, for each object in the flow's item list, that object's info attribute (under the key "item_info"). The formatted results may look like this:

{
  "flow_title": "New User actions",
  "flow_items": [ 
    { "item_info" : <info> },
    { "item_info" : <info> },
    ...
  ]
}, ...

where <info> would be JSON-formatted info attributes.

List membership

To extract the members of a list, you can use the in keyword, which expresses the pattern “the left-hand side variable is a member of the right-hand side list”. For example, if we wanted to retrieve only those flows that include the subscription stage, we could query:

match 
  $flow isa action_flow, links (item[]: $flow_items);
  $subscription in $flow_items, has info "customer subscribes"; 
fetch 
  $flow_items as "subscription_flow_items": info as "item_info";

Patterns fail for empty lists

Importantly, a solution to a pattern will never assign an empty list to a variable. In other words, in our example above the variable $flow_items will only ever match non-empty lists.

Indeed, in our database model, empty lists are equivalent to absent lists. This ensures the “ordered set” semantics of lists nicely corresponds to the “set” semantics of non-list types. Similarly, we do not allow inserting empty lists.
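
To make the “empty equals absent” rule concrete, here is a minimal Python sketch. It is a toy in-memory model, not TypeDB's implementation or API: the store dictionary and the helper names insert_list and match_lists are hypothetical and exist only for illustration.

```python
# Toy model (assumption): a dict mapping an owner id to its list of items.
def insert_list(store, owner, items):
    """Store a list for an owner, rejecting empty lists as described above."""
    if not items:
        raise ValueError("empty lists cannot be inserted")
    store[owner] = list(items)


def match_lists(store, owners):
    """Yield (owner, items) answers; owners without a list yield no answer."""
    for owner in owners:
        items = store.get(owner)
        if items:  # an absent list and an empty list are treated alike
            yield owner, items


store = {}
insert_list(store, "flow1", ["visit", "register"])
print(list(match_lists(store, ["flow1", "flow2"])))
```

Only flow1 produces an answer here, because flow2 has no list; the pattern simply fails for it instead of binding an empty list.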

Indexed access and list length

To access a specific member of a list $list, we can use indexed access $list[EXPR], where EXPR is a valid integer expression. If EXPR is out of the list's bounds, the pattern will fail to match any results. We can use length($list) to dynamically match the length of a list.

The following query retrieves the items and their positions for a specific action_flow with title "my_flow":

match 
  $flow isa action_flow, links (item[]: $flow_items), has title "my_flow";
  $item = $flow_items[$position], has info $info;
fetch
  $info as "item info";
  $position as "item position";
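
The matching behavior of indexed access can be sketched in Python as follows. This is an illustrative model only: the function name match_positions is hypothetical, and treating negative indices as out of bounds is an assumption, since this article does not say how they are handled.

```python
def match_positions(items, position=None):
    """Yield (position, item) answers for the pattern $item = $items[$position].

    With position=None the position is unbound, so every valid index matches.
    With a bound position, an out-of-bounds index yields no answers at all.
    """
    if position is None:
        yield from enumerate(items)
    elif 0 <= position < len(items):  # negative indices rejected: an assumption
        yield position, items[position]


flow_items = ["customer visits website", "customer registers"]
print(list(match_positions(flow_items)))     # one answer per position
print(list(match_positions(flow_items, 5)))  # out of bounds: no answers
```

The key point is that an out-of-bounds index is not an error: the pattern just produces no results.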

Updating lists

How can we modify existing lists of roleplayers? Basically, all the usual list operations are at your disposal as mentioned in our earlier overview, including, e.g., concatenation, indexed access, ranged sublists, list length, etc.

For example, to extend all action flows that end with a subscription by one more “payment” step, we could run the following query.

match
  $flow_to_subscription isa action_flow, links (item[]: $items);
  $last_index = length($items) - 1;
  $items[$last_index] has info "customer subscribes";
  $pay isa customer_action, has info "customer pays us!";
  $extended_items = $items + [$pay];
insert
  $flow_to_subscription links (item[]: $extended_items) @replace;

The query is largely self-explanatory. It uses the annotation @replace, which can be applied to statements of the form $x has "sth" or $x links "sth" when there can be at most one such "sth".

For example, in the above case, there can be at most one item[] list linked by the action flow $flow_to_subscription. Since there can be at most one, we know exactly what to replace! The introduction of @replace makes working with role player lists (and, similarly, with attribute lists or struct values) very convenient, as our example illustrates.
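
The effect of this update can be modeled in a few lines of Python. This is a sketch under simplifying assumptions: flows are represented as a plain dictionary from a flow id to a list of info strings, and the function name extend_subscription_flows is hypothetical.

```python
def extend_subscription_flows(flows, pay):
    """Append `pay` to every flow whose last item is the subscription step.

    flows: dict mapping a flow id to its ordered list of action infos.
    Returns a new dict; matching flows get a replaced (extended) list.
    """
    updated = {}
    for flow_id, items in flows.items():
        last_index = len(items) - 1  # mirrors $last_index = length($items) - 1
        if items and items[last_index] == "customer subscribes":
            updated[flow_id] = items + [pay]  # new list replaces the old one
        else:
            updated[flow_id] = items          # non-matching flows are untouched
    return updated


flows = {
    "new_user": ["customer visits website", "customer registers", "customer subscribes"],
    "browse_only": ["customer visits website"],
}
print(extend_subscription_flows(flows, "customer pays us!"))
```

As with @replace, each matching flow ends up with exactly one list: the extended one takes the place of the original.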

Lists of attributes

Parallel to relations relating lists of role players, we can have objects (i.e. entities or relations) owning lists of attributes. Let’s jump straight into an example: consider the following schema, which defines that person objects can own lists of middle_name attributes.

define
  person owns middle_name[];
  middle_name sub attribute, value string;

Again, the attribute type middle_name is itself not a list type; its members will be single (string-valued) attributes as usual. But person objects may have lists of middle_names. Inserting, accessing, manipulating, and updating of such lists works similarly to the case of relations relating lists of role players, which we have just seen.

Modeling and querying with lists

When working with attribute or roleplayer lists the following will be useful to keep in mind.

Where and where not to model with lists

In our data model (i.e. the part of the schema comprising entity, relation, and attribute type definitions), relates and owns are the only two statements that can refer to list types.

However, besides our data model, lists may freely be used in query patterns and functions. We show many of these cases in the companion articles.

A note on list variables

List variables appear only in statements in which their list type is indicated. For example, when writing $x has $a, the variable $a will always match attributes and not lists thereof. If $t-lists $a are intended, we may write $x has $t[] $a instead!

This remark similarly applies to roleplayer lists: $x links ($a) will only match a single roleplayer $a and never a list of roleplayers. As in the case of attributes, we may write $x links ($t[]: $a) if lists are meant to match the pattern.

Syntactic sugar for list members

Let us also mention that even when person owns middle_name[] we may, in fact, still query

match $person isa person, has middle_name $name;

Indeed, this is considered to be syntactic sugar for:

match
  $person isa person, has middle_name[] $_list;
  $name in $_list;

where $_list is a new implicit variable.

This similarly applies to links: when action_flow relates item[] we can still query for action_flow links (item: $item); — this is simply syntactic sugar for action_flow links (item[]: $_list); $item in $_list;.
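
The desugaring can be pictured with a small Python sketch. It is an analogy only: people is a hypothetical dictionary from a person to their list of middle names, and the two function names are invented for illustration.

```python
def match_has_list(people):
    """List form: bind the whole middle-name list once per person."""
    for person, middle_names in people.items():
        if middle_names:  # persons without a (non-empty) list do not match
            yield person, middle_names


def match_has_member(people):
    """Member form: one answer per (person, list member) pair."""
    for person, middle_names in match_has_list(people):
        for name in middle_names:  # corresponds to "$name in $_list"
            yield person, name


people = {"ada": ["Augusta", "King"], "bob": []}
print(list(match_has_member(people)))
```

The member form is built entirely on top of the list form, which is what makes it syntactic sugar: it adds no new matching power, only a shorter way to write the pattern.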

Reducing query streams into a single list

As mentioned in the beginning, part of the power of lists derives from their interaction with query result streams. As a first simple example, consider the following query.

with fun available_car_model_list($l: location) -> car_model[] :
  match
    (location: $l, zone: $z) isa zone_association; 
    (car_model: $m, zone: $z) isa zone_availability;
  return list($m);
match
  $req (customer: $cust) isa rental_inquiry, has status "open"; 
  $cust has location $loc;
  $models = available_car_model_list($loc);
insert
  (customer: $cust, models[]: $models) isa offer_to_be_mailed;
  $req has status "processed" @replace;

This query stores results of another query as an ordered list in our database (arguably, we are not making much use of the order here but we could: e.g. ordering models by how suitable they are for the customer!).

Let’s go through the query’s steps in more detail!

Reducing streams to lists

The query starts with a query-level function definition: the function available_car_model_list matches car models that are available in a certain zone (to which the given location $l belongs).

Importantly, the function’s return statement uses yet another reduction: list. This reduction aggregates all results for the variable $m found in the stream defined by the function’s body into a single list.

Note: lists can, in general, contain duplicates! We can avoid this behavior if required by first filtering for $m using the stream modifier filter. The filtering modifier de-duplicates our stream, as we discuss in the pipeline fundamentals article.
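
The difference between reducing a stream directly and de-duplicating it first can be sketched in Python. This is an analogy, not TypeQL: the generator deduplicate stands in for the de-duplicating stream modifier, and both function names are hypothetical.

```python
def reduce_to_list(stream):
    """Collect every element of a result stream into one list, keeping duplicates."""
    return list(stream)


def deduplicate(stream):
    """Pass through only the first occurrence of each element, preserving order."""
    seen = set()
    for element in stream:
        if element not in seen:
            seen.add(element)
            yield element


# A car model available in two zones appears twice in the stream.
models = ["sedan", "suv", "sedan"]
print(reduce_to_list(iter(models)))               # duplicates kept
print(reduce_to_list(deduplicate(iter(models))))  # duplicates removed
```

In the car rental example, a model that is available in several zones of the same location would appear once per zone unless the stream is de-duplicated before the reduction.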

Putting list reductions to use

Equipped with our available_car_model_list function, in the subsequent match clause we then search our database for open rental inquiries by our customers, retrieving their location $loc and assigning the list returned by available_car_model_list($loc) to the variable $models.

In the final insert clause, we record that the model list selected for the customer should be an offer_to_be_mailed and record that we process the request.

Lists are a highly versatile tool. In their capacity of recording multiple values at once, they are closely related to another new TypeDB 3.0 concept: custom-structured value types, which we discuss next.

Summary

Lists make it possible to hold many data points in a single variable. In this article, we’ve only outlined a few basic use cases, but, in general, lists are an extremely powerful tool. We are, therefore, excited to see how you will be using lists in your TypeDB application!
