Graph databases, complex data, and the case for a structured hypergraph

TypeDB is a hypergraph database with a native type system, combining n-ary relationships with typed, inheritance-aware schemas.

Cal Hemsley


Mapping data to a system has always involved some form of trade-off. Property graphs gave us a new way to model complex systems that don’t lend themselves naturally to the tabular structure of the relational model.

Put simply, when you store a relationship in a property graph, you connect two nodes with an edge. Company A knows Company B. Bank X advises Client Y. This lets us map an ecosystem and explore how its concepts relate to one another in a more natural way.

For some domains, this structure works reasonably well. The challenge is that many meaningful relationships aren’t between two things. They’re between three, or four, or they involve context that belongs to the relationship itself, not to either participant.

Even seemingly simple relationships can turn out to be more complex, and we find ourselves making new compromises to keep developing the system.

The property graph model

The property graph model has this binary constraint baked into its foundations: every edge connects exactly two nodes. Formally, a property graph is a set of nodes and a set of directed, labelled edges, where each edge has exactly one source and one target, with key-value properties attached to both nodes and edges. This was a deliberate design choice, and everything else is built on top of it.
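To make the definition above concrete, here is a minimal Python sketch of a property graph. It is only an illustration of the data structure, not how any particular graph database is implemented; the class names, the `connect` helper, and the `since` property are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """A directed, labelled edge with exactly one source and one target."""
    source: str
    label: str
    target: str
    properties: dict = field(default_factory=dict)


@dataclass
class PropertyGraph:
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def connect(self, source, label, target, **properties):
        """Add one binary edge; the signature has no slot for a third participant."""
        self.nodes.update((source, target))
        edge = Edge(source, label, target, properties)
        self.edges.append(edge)
        return edge


graph = PropertyGraph()
graph.connect("Company A", "KNOWS", "Company B")
graph.connect("Bank X", "ADVISES", "Client Y", since=2021)  # illustrative property
```

The constraint shows up in the shape of `Edge` itself: whatever else is stored, a relationship always has one `source` and one `target`.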

Binary edges are simple to optimise for and fast to traverse, and for a lot of problems that’s exactly the right call. Traversing long chains of relationships can be expensive, and a uniform two-ended edge leaves a great deal of room for optimisation.

But for domains where relationships naturally involve more than two participants, it means every data model is a compromise before you’ve written a single query.

What is a hypergraph, and how does it differ from a property graph?

A hypergraph generalises the graph model by removing the two-node constraint. In a standard graph, an edge connects exactly two nodes. In a hypergraph, an edge (called a hyperedge) can connect any number of nodes. Formally, where a graph edge is a pair {A, B}, a hyperedge is a set {A, B, C, D…} of arbitrary size.

For the mathematicians, the hypergraph is the superset that contains the graph: the quadrilateral to the graph’s rhombus. A property graph is a special case of a hypergraph, where every hyperedge happens to have exactly two participants.
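The special-case relationship can be sketched in a few lines of Python. This assumes hyperedges are modelled as plain `frozenset` values and ignores labels, direction, and properties; the helper name `is_ordinary_graph` is hypothetical.

```python
def is_ordinary_graph(hyperedges):
    """True when every hyperedge has exactly two participants."""
    return all(len(edge) == 2 for edge in hyperedges)


# Every edge is a pair, so this hypergraph is also an ordinary graph.
binary_only = [frozenset({"A", "B"}), frozenset({"B", "C"})]

# Adding a single four-way hyperedge takes us outside the graph model.
with_nary = binary_only + [frozenset({"A", "B", "C", "D"})]

print(is_ordinary_graph(binary_only))  # True
print(is_ordinary_graph(with_nary))    # False
```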

TypeDB trades off the simplicity of a binary edge for the expressive power of n-ary relations.

Image source – Courtesy of Communications Physics

In practice, this means a hypergraph database like TypeDB can represent an acquisition (acquirer, target, mandated bank, lead advisor, legal counsel) as a single relationship with five participants, rather than as five nodes stitched together by separate binary edges. The structure matches the domain.

What a type system adds to a hypergraph

A hypergraph alone solves the n-ary problem but introduces a new one: without constraints, the model is too permissive. Any node can participate in any relationship in any role. The database stores what you give it, but it doesn’t understand what anything means, which makes it hard to enforce correctness, hard to query consistently, and hard to evolve over time.

This is what TypeDB adds: a strongly-typed schema that sits on top of the hypergraph and gives it semantic structure. In TypeDB, every entity, relationship, and attribute has a type. Relationships define named roles, and only entities of the correct type can play each role. Types form inheritance hierarchies — a LeadAdvisor is an Advisor is a BankEmployee — and the database understands these hierarchies natively.

The practical consequence is significant. In a schema-less or weakly-typed hypergraph, you can model an acquisition, but nothing stops you from inserting inconsistent data, and every query has to defensively account for every possible variation. In a structured hypergraph, the schema encodes the rules of the domain: what roles exist, what types can play them, and how types relate to each other. The database then enforces and reasons over those rules automatically.

This means the schema isn’t just acting as documentation. It encodes the conceptual model of the domain (the taxonomy, the role constraints, the inheritance structure) in a place where the database can actually use it. Queries become more natural. The model becomes more resilient to change. And new questions can be asked without rebuilding the foundation.
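The idea of typed roles with inheritance can be sketched in Python. This is not TypeDB’s API or its schema language: Python classes stand in for types, and `ACQUISITION_ROLES`, `make_relation`, and the entity names are hypothetical, chosen only to show a role check that respects a type hierarchy.

```python
class Entity:
    def __init__(self, name):
        self.name = name


class Company(Entity): pass
class BankEmployee(Entity): pass
class Advisor(BankEmployee): pass
class LeadAdvisor(Advisor): pass


# Hypothetical schema: each role names the type that is allowed to play it.
ACQUISITION_ROLES = {"acquirer": Company, "target": Company, "advisor": Advisor}


def make_relation(roles, **players):
    """Return the relation as a dict, or raise TypeError on an invalid role player."""
    for role, player in players.items():
        if role not in roles:
            raise TypeError(f"unknown role: {role}")
        # isinstance respects inheritance, so a LeadAdvisor may play "advisor".
        if not isinstance(player, roles[role]):
            raise TypeError(f"{type(player).__name__} cannot play {role}")
    return dict(players)


acme, globex = Company("Acme"), Company("Globex")
ok = make_relation(ACQUISITION_ROLES, acquirer=acme, target=globex,
                   advisor=LeadAdvisor("Dana"))

try:
    # A Company is not an Advisor, so this insert is refused.
    make_relation(ACQUISITION_ROLES, acquirer=acme, target=globex,
                  advisor=Company("Initech"))
    rejected = False
except TypeError:
    rejected = True
```

In this sketch a plain `BankEmployee` would also be refused as an advisor, because the check follows the hierarchy in one direction only: subtypes can stand in for their supertypes, not the reverse.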

Example: modelling complex financial relationships in TypeDB

Take a deal between two companies, each with their own advisor. In TypeDB, that’s a single four-way relationship:

```typeql
deal (target: $target, buyer: $buyer, target-advisor: $advisor, buyer-advisor: $advisor);
```

One line, four typed participants. Notice that $advisor appears in both advisor roles: because the same variable fills both, the pattern only matches deals where a single advisor acts for both sides. A conflict of interest isn’t something you need to detect with a complex query; the structure of the data model surfaces it directly.

In a property graph, the same scenario requires intermediate nodes to represent the deal, separate edges for each advisory relationship, and a query that reconstructs the role structure the schema never encoded. The question becomes harder to ask the moment it gets more complex, and in financial services, the questions always get more complex.
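The contrast can be illustrated with a toy Python sketch. It is not a real TypeDB or property graph query; the deal data, the `deal-0` style node identifiers, and the `rebuild` helper are made up for illustration. The n-ary form keeps all four role players in one record, while the binary form has to regroup edges around an intermediate node before the same question can be asked.

```python
# n-ary form: one record holds all four role players of a deal.
deals = [
    {"target": "Acme", "buyer": "Globex",
     "target-advisor": "Bank X", "buyer-advisor": "Bank X"},
    {"target": "Initech", "buyer": "Umbrella",
     "target-advisor": "Bank Y", "buyer-advisor": "Bank Z"},
]
conflicted = [d for d in deals if d["target-advisor"] == d["buyer-advisor"]]

# Binary form: each deal becomes an intermediate node plus one edge per role.
edges = [
    (f"deal-{i}", role, player)
    for i, d in enumerate(deals)
    for role, player in d.items()
]


def rebuild(edges):
    """Group binary edges by their intermediate deal node to recover the roles."""
    rebuilt = {}
    for deal_node, role, player in edges:
        rebuilt.setdefault(deal_node, {})[role] = player
    return rebuilt


conflicted_from_edges = [
    node for node, roles in rebuild(edges).items()
    if roles["target-advisor"] == roles["buyer-advisor"]
]
```

Both forms find the same conflicted deal, but the binary form needs the extra `rebuild` step, and nothing in the edge list itself guarantees that every deal node carries all four roles.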

Where property graphs still win

A structured hypergraph is not the right starting point for every problem, and it’s worth being direct about why.

The property graph ecosystem is mature in a way TypeDB’s isn’t yet. Neo4j has been in production for fifteen years. The tooling, the operational knowledge, the hiring pool, the community: these are real advantages that don’t disappear because the underlying model is more expressive. For teams operating in well-understood territory with existing graph expertise, that weight counts for something.

There’s also a more fundamental trade-off around schema. Property graphs are largely schema-optional, which means you can start with a rough model and let it evolve organically as your understanding of the domain develops. TypeDB requires you to define structure upfront, though it is designed for evolution and refactoring. That’s a feature when the domain is well understood, because it enforces correctness and enables reasoning. But it’s a genuine cost when you’re still figuring out what the domain is. Exploratory work, early-stage products, fast-moving data: these are cases where the flexibility of a schema-light graph can be genuinely valuable, and where the discipline TypeDB demands can slow you down initially, before it speeds you up as the model grows in complexity.

The structured hypergraph earns its place when the domain is known, the relationships are complex, and correctness matters. When those conditions hold, as they tend to in financial services, the trade-offs shift decisively.

A new category of graph database

TypeDB is a different abstraction entirely from a classical property graph: one in which the model is expressive enough to reflect the domain accurately, and structured enough that the database understands what it’s storing.

That combination – the structured hypergraph – is what makes it possible to ask questions the data was never explicitly prepared for.
