TypeDB Blog
What is an entity-relation database?
This blog dives into some of the math of TypeDB and our novel category of database. When we built our database, we set out to craft a generalist database that can natively express and translate between diverse data and models.
We looked at the existing database models – largely relational, document, and graph – and realized that something new was needed: a model that follows modern programming paradigms.
For context, there is a long-established subfield of classical database theory that has worked towards precisely this goal of “generalism”: the field of conceptual modeling and, in particular, of entity-relation (ER) modeling.
The language model
The idea of a “polymorphic entity-relation-attribute” (PERA) model (which underlies most conceptual modeling approaches) came from the observation that language can be usefully structured into three simple categories.
Let’s take a simple but common scenario:
“On an eCommerce platform, a buyer orders a listed product from a seller. Guest user Dinosaur37 sent $100 to the seller KetchupTroll12 for a used laptop.”
Nouns – entities
The first category of language is common nouns, such as “users” or “products”. These describe independent domains (or types!) of objects. Specific objects, such as a named person, are referred to by proper nouns. A conceptual modeler would call the object named by a proper noun an entity, and speak of an entity type when referring to a well-described category of individual entities.
Verbs – relations and roles
Next up, we have verbs, which connect nouns through some relationship, as captured typically by “subject-verb-object” statements. For example, we could say that “a buyer purchased a product from a seller”.
A conceptual modeler would speak of a relation between these objects. A relation type is then an abstract description of a category of specific relations (such as the category of “orders”).
Adjectives/adverbs – attributes
In the third category, we have adjectives and adverbs. A “used laptop” is an example from our scenario.
In conceptual modeling, these qualities of specific entities and relations are called attributes. An attribute type then references the overall category, such as “laptop condition”.
The above three categories are summarized in the table below.
Natural Language | Conceptual modeling |
---|---|
noun (user) | entity (user Dinosaur37) |
verb (user orders from user) | relation and role players (user [buyer] orders from user [seller]) |
adjective/adverb (used laptop) | attribute (product condition = used) |
But how does this relate to our earlier mention of types? The above table gives a first hint by introducing “variables” as part of our example descriptions in the conceptual modeling column.
Connecting this to type theory
The examples in the previous table share structural similarities with types and type dependencies. We can now show the same concepts from a pure type-theoretic perspective.
Entity and relation types are types containing objects, meaning “abstract” concepts that don’t have built-in structure. Entity types are independent types (like “user”), while relation types are dependent types (like “buyer orders product from seller”).
Attribute types are types containing values, meaning quantifiable concepts with built-in structure (such as dates or numbers). Attribute types are traditionally considered to be dependent types (like “product condition”).
The next table summarizes the relationship between conceptual modeling and type theory.
Conceptual modeling | Type theory |
---|---|
entity type (containing the user) | type without dependencies containing objects |
relation type (containing the order) | type with dependencies containing objects |
attribute type (containing the laptop condition) | type with dependencies containing values |
What does that look like in practice?
Taking our earlier scenario, we can model it simply in TypeQL. The snippet below matches the seller, the buyer, and the product, then inserts the order relation that connects them.
It’s designed to be easy to read, and it doesn’t rely on joining lots of different tables – a common gripe among programmers using SQL.
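```typeql
match
  # Look up the seller, buyer, and product by their attributes
  $seller isa user, has username "KetchupTroll12";
  $buyer isa user, has username "Dinosaur37";
  $product isa product, has name "Macbook Air";
  # Optional: the seller's listing of the product, with its condition
  listing ($seller, $product), has condition "Used";
insert
  # Create the order relation linking the three role players
  order (seller: $seller, buyer: $buyer, product: $product), has payment-amount 100.00dec;
```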
Note for pedantic developers: In practice, the listing condition isn’t necessary to write this query, but we added it to show the “condition” attribute.
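The query above presupposes a schema that declares these types. As a rough sketch only (this is not the schema from our platform example, just one that would fit the query), the definitions might look like the following in TypeQL 3.x syntax. The role names on “listing” and all value types are assumptions inferred from the query: the quoted literals suggest string attributes, and the “dec” suffix suggests a decimal payment amount.

```typeql
# Hypothetical schema sketch: role names on "listing" and all value types are assumptions.
define
  # Attribute types: dependent types containing values
  attribute username, value string;
  attribute name, value string;
  attribute condition, value string;
  attribute payment-amount, value decimal;

  # Entity types: independent types containing objects
  entity user,
    owns username,
    plays listing:seller,
    plays order:seller,
    plays order:buyer;
  entity product,
    owns name,
    plays listing:product,
    plays order:product;

  # Relation types: types containing objects that depend on their role players
  relation listing,
    relates seller,
    relates product,
    owns condition;
  relation order,
    relates seller,
    relates buyer,
    relates product,
    owns payment-amount;
```

Read against the type theory table, each “relates” clause declares a dependency of a relation type on its role players, and each “owns” clause declares which types an attribute type can depend on.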
Our goal was to create a new class of database that is easier for programmers to work with, iterate on, and extend. In other words, a more maintainable database.
Summary
The classical entity-relation model was directly inspired by natural language. Incorporating type theory gives us another perspective on this. As a result of this shift, it becomes straightforward to include other features of a modern type system, particularly polymorphism.
This has significant ramifications for developing in different arenas, from cyberthreat intelligence to knowledge graphs and RAG & AI. We’ve seen a broad spectrum of use cases which have found deep value in the language capabilities.
In future posts, we will dig deeper and show how a strongly-typed model provides a powerful foundation for a new category of general-purpose database.