Data migration
Migrating data from other sources into TypeDB is an important step. It is considered that we have already developed a schema for the TypeDB database according to the data model. Accurate and comprehensive schema design will simplify not only data migration but also the design of data queries later on.
This tutorial is based on the Phone calls example in TypeDB Examples repository. Advanced users may find it faster to follow the readme files for Java, Python, Node.js versions directly. |
To load data into a TypeDB database we need to perform insert queries.
The following example showcases how to load sample data from
CSV
, JSON
, and XML
formats using TypeDB Drivers for
Java,
Python, and
Node.js.
The same approach could be used to load data from other sources (e.g., from REST API, or other DB engines using SQL, GraphQL) or using other TypeDB Clients.
Schema & data
The application described on this page is built to migrate and use the phone_calls
database schema and dataset,
which are both provided below.
Schema
The schema of the phone_calls
database is defined by the following file:
See the visualization below.

The schema in the file above consists of the following types.
Entity types:
-
person
-
owns
first-name
attribute type -
owns
last-name
attribute type -
owns
phone-number
attribute type -
owns
city
attribute type -
owns
age
attribute type -
owns
started-at
attribute type -
owns
duration
attribute type
-
-
company
-
owns
name
attribute type
-
Relation types:
-
call
-
role
caller
played byperson
type -
role
callee
played byperson
type
-
-
contract
-
role
customer
played byperson
type -
role
provider
played bycompany
type
-
Attribute types:
-
first-name
ofstring
value type -
last-name
ofstring
value type -
phone-number
ofstring
value type -
city
ofstring
value type -
age
oflong
value type -
started-at
ofdatetime
value type -
duration
oflong
value type -
name
ofstring
value type
Use TypeDB Studio or any other TypeDB Client to create a new database
named phone_calls
in a TypeDB server and import the schema
file.
The step-by-step instructions how to create a database and load a schema file with TypeDB Studio can be found in the Quickstart guide. Make sure to use the schema file provided above instead of the IAM schema code/file from the Quickstart guide.
Application
In this example we will conduct migration by building an application to read through dataset files above and insert data into the TypeDB database created.
The Phone Calls example application showcases:
-
importing dataset into a TypeDB database from:
-
CSV
-
JSON
-
XML
-
-
performing TypeQL queries to the database with imported data.
Pre-requirements
For any of the following language specific variant the following software is required:
|
-
Java
-
Python
-
Node.js
For Java sample application the following software needs to be installed:
-
Bazel
(v5.1.1)
For Python sample application the following software needs to be installed:
-
Python
(v.3.10.5+), -
pip
package manager.
For Node.js sample application the following software needs to be installed:
-
Node.js
(v16.16.0 LTS), -
npm
package manager.
Java
Clone the repository
Clone the TypeDB Examples repository to the local machine.
Start migration
Build the example with the bazel build //telecom/phone_calls/java/...
command.
Use one of the following commands, depending on the chosen input format:
-
CSV
-
JSON
-
XML
bazel run //telecom/phone_calls/java:csv-migration
bazel run //telecom/phone_calls/java:json-migration
bazel run //telecom/phone_calls/java:xml-migration
Application logic
Data migration code is located inside {format}Migration.java
file, where {format} is
CSV,
JSON, or
XML.
The {format}Migration
class calls the initialiseInputs()
method, which returns a collection of Input
called
inputs
. We then use each element in this collection to load each file from dataset into TypeDB database with
connectAndMigrate(Collection<Input> inputs)
method.
The Input
subclass contains details required for reading data from dataset files of a chosen format:
path to the file and the template function to produce a TypeQL insert query out of every element of the
file. Thus, for every file, there is a separate template for forming a TypeQL insert query using the data from the
file.
The connectAndMigrate
initializes TypeDB Java Driver, opens a session to the TypeDB database, and calls
loadDataIntoTypeDB
for every element in inputs
.
The loadDataIntoTypeDB
parses the data from a file using a path from the inputs
element, and for every item
parsed from the dataset file it calls input.template
function (which is specific for this file, defined by its path).
Template returns a TypeQL insert query in a form of a string, and it then is sent as a part of a write transaction
to the TypeDB database.
Insert query examples
To insert an instance of a company
type we have the following template:
inputs.add(new Input("data/companies") {
@Override
public String template(Json company) {
return "insert $company isa company, has name " + company.at("name") + ";";
}
input.getDataPath()
returns data/companies
.
Given the company
item is:
{ name: "Telecom" }
input.template(company)
returns:
insert $company isa company, has name "Telecom";
To insert an instance of a person
type we have the following template:
inputs.add(new Input("data/people") {
@Override
public String template(Json person) {
// insert person
String typeQLInsertQuery = "insert $person isa person, has phone-number " + person.at("phone_number");
if (! person.at("first_name").isNull()) {
typeQLInsertQuery += ", has first-name " + person.at("first_name");
typeQLInsertQuery += ", has last-name " + person.at("last_name");
typeQLInsertQuery += ", has city " + person.at("city");
typeQLInsertQuery += ", has age " + person.at("age").asInteger();
}
typeQLInsertQuery += ";";
return typeQLInsertQuery;
}
Given the person
item is:
{ firs-name: "Jackie", last-name: "Joe", city: "Jimo", age: 77, phone_number: "+00 091 xxx"}
input.template(person)
returns:
insert $person isa person, has phone-number "+44 091 xxx", has first-name "Jackie", has last-name "Joe", has city "Jimo", has age 77;
if the person item doesn’t have first-name
:
{ phone_number: "+44 091 xxx" }
input.template(person)
returns:
insert $person isa person, has phone-number "+44 091 xxx";
But those were entities. To insert a relation we need to use match
clause to find related instances first.
For example, to insert an instance of a contract
type we have the following template:
inputs.add(new Input("data/contracts") {
@Override
public String template(Json contract) {
// match company
String typeQLInsertQuery = "match $company isa company, has name " + contract.at("company_name") + ";";
// match person
typeQLInsertQuery += " $customer isa person, has phone-number " + contract.at("person_id") + ";";
// insert contract
typeQLInsertQuery += " insert (provider: $company, customer: $customer) isa contract;";
return typeQLInsertQuery;
}
Given the contract item is:
{ company_name: "Telecom", person_id: "+00 091 xxx" }
input.template(contract)
returns:
match $company isa company, has name "Telecom"; $customer isa person, has phone-number "+00 091 xxx"; insert (provider: $company, customer: $customer) isa contract;
Finally, to insert an instance of a call
type we have the following template:
inputs.add(new Input("data/calls") {
@Override
public String template(Json call) {
// match caller
String typeQLInsertQuery = "match $caller isa person, has phone-number " + call.at("caller_id") + ";";
// match callee
typeQLInsertQuery += " $callee isa person, has phone-number " + call.at("callee_id") + ";";
// insert call
typeQLInsertQuery += " insert $call(caller: $caller, callee: $callee) isa call;" +
" $call has started-at " + call.at("started_at").asString() + ";" +
" $call has duration " + call.at("duration").asInteger() + ";";
return typ
Given the call item is:
{ caller_id: "+44 091 xxx", callee_id: "+00 091 xxx", started_at: 2018-08-10T07:57:51, duration: 148 }
input.template(call)
returns:
match $caller isa person, has phone-number "+44 091 xxx"; $callee isa person, has phone-number "+00 091 xxx"; insert $call(caller: $caller, callee: $callee) isa call; $call has started-at 2018-08-10T07:57:51; $call has duration 148;
Python
To migrate the sample dataset into the TypeDB database phone_calls
follow the steps below. Alternatively,
clone the TypeDB repository and follow the readme file
in the telecom/phone_calls/python
directory.
Copy the dataset
Create a new directory on the local machine.
Choose one of the input formats, download the files, and place them in the new directory inside a subdirectory named
data
.
Copy the source codes
Copy the following source files from the telecom/phone_calls
directory in the
TypeDB Examples repository into the new directory:
-
migrate_csv.py
-
migrate_json.py
-
migrate_xml.py
-
queries.py
-
requirements.txt
Start migration
Use pip install -r requirements.txt
to install required dependencies.
Use one of the following commands, depending on the chosen input format:
-
CSV
-
JSON
-
XML
python migrate_csv.py
python migrate_json.py
python migrate_xml.py
Application logic
Data migration code is located inside migrate_{format}.py
file, where {format} is csv
, json
, or xml
.
The build_phone_call_graph()
function initializes TypeDB Python Driver, opens a session to the TypeDB database, and
for every element of inputs
calls load_data_into_typedb()
function.
The load_data_into_typedb()
function calls for parse_data_to_dictionaries()
to parse through the dataset file and
then iterates through parsed records of the file. For every record it opens a transaction, forms an insert query by
calling the template function, then sends the insert query and commits the transaction.
The rest of the file is:
-
the parsing function
parse_data_to_dictionaries()
, -
the
Inputs
list, -
and the template functions.
The parsing function parses data from a dataset file into a list of dictionaries. The implementation of the function is different for every input format.
The Inputs
is inserted as inputs
argument and contains a list of dictionaries with file
field containing
filename (without an extension) and template
field, containing a reference to the template function to process
the file.
The template functions are the most important part, as they are needed to create TypeQL insert queries from information parsed from dataset files.
Insert query examples
To insert an instance of a company
type we have the following template:
def company_template(company):
return 'insert $company isa company, has name "' + company["name"] + '";'
Example:
Given the company
item is:
{ "name": "Telecom" }
Returns:
insert $company isa company, has name "Telecom";
To insert an instance of a person
type we have the following template:
def person_template(person):
# insert person
typeql_insert_query = 'insert $person isa person, has phone-number "' + person["phone_number"] + '"'
if "first_name" in person:
# person is a customer
typeql_insert_query += ", has is-customer true"
typeql_insert_query += ', has first-name "' + person["first_name"] + '"'
typeql_insert_query += ', has last-name "' + person["last_name"] + '"'
typeql_insert_query += ', has city "' + person["city"] + '"'
typeql_insert_query += ", has age " + str(person["age"])
else:
# person is not a customer
typeql_insert_query += ", has is-customer false"
typeql_insert_query += ";"
return typeql_insert_query
Example:
Given the person
item is:
{ "phone_number": "+44 091 xxx" }
Returns:
insert $person isa person, has phone-number "+44 091 xxx", has is-customer false;
or:
Given the person
item is:
{ "first_name": "Jackie", "last_name": "Joe", "city": "Jimo", "age": 77, "phone_number": "+00 091 xxx"}
Returns:
insert $person isa person, has phone-number "+00 091 xxx", has is-customer true, has first-name "Jackie", has last-name "Joe", has city "Jimo", has age 77;
To insert an instance of a contract
type we have the following template:
def contract_template(contract):
# match company
typeql_insert_query = 'match $company isa company, has name "' + contract["company_name"] + '";'
# match person
typeql_insert_query += ' $customer isa person, has phone-number "' + contract["person_id"] + '";'
# insert contract
typeql_insert_query += " insert (provider: $company, customer: $customer) isa contract;"
return typeql_insert_query
Example:
Given the contract
item is:
{ "company_name": "Telecom", "person_id": "+00 091 xxx" }
Returns:
match $company isa company, has name "Telecom"; $customer isa person, has phone-number "+00 091 xxx"; insert (provider: $company, customer: $customer) isa contract;
To insert an instance of a call
type we have the following template:
def call_template(call):
# match caller
typeql_insert_query = 'match $caller isa person, has phone-number "' + call["caller_id"] + '";'
# match callee
typeql_insert_query += ' $callee isa person, has phone-number "' + call["callee_id"] + '";'
# insert call
typeql_insert_query += " insert $call(caller: $caller, callee: $callee) isa call; $call has started-at " + call["started_at"] + "; $call has duration " + str(call["duration"]) + ";"
return typeql_insert_query
Example:
Given the call
item is:
{ "caller_id": "+44 091 xxx", "callee_id": "+00 091 xxx", "started_at": 2018-08-10T07:57:51, "duration": 148 }
Returns:
match $caller isa person, has phone-number "+44 091 xxx"; $callee isa person, has phone-number "+00 091 xxx"; insert $call(caller: $caller, callee: $callee) isa call; $call has started-at 2018-08-10T07:57:51; $call has duration 148;
Node.js
To migrate the sample dataset into the TypeDB database phone_calls
follow the steps below. Alternatively, clone the
TypeDB Examples repository and follow the readme file
in the telecom/phone_calls/nodejs
directory.
Copy the dataset
Create a new directory on the local machine.
Choose one of the input formats, download the files, and place them in the new directory inside a subdirectory
named data
.
Copy the source codes
Copy the following source files from the telecom/phone_calls
directory in the
TypeDB Examples repository into the new directory:
-
migrateCsv.js
-
migrateJson.js
-
migrateXml.js
-
queries.js
-
yarn.lock
Start migration
Use npm install
to install required dependencies.
Use one of the following commands, depending on the chosen input format:
-
CSV
-
JSON
-
XML
npm run migrateCsv
npm run migrateJson
npm run migrateXml
Application logic
Data migration code is located inside migrate{format}.js
file, where {format} is Csv
, Json
, or Xml
.
The buildPhoneCallGraph()
function initializes TypeDB Node.js Driver, opens a session to the TypeDB database, and
for every element of inputs
calls loadDataIntoTypeDB()
function.
The loadDataIntoTypeDB()
function calls for ` parseDataToObjects()` to parse through the dataset file and then
iterates through parsed records of the file. For every record, it opens a transaction, forms an insert query by
calling the template function, then sends the insert query and commits the transaction.
The rest of the file is:
-
the parsing function
parseDataToObjects()
, -
the
Inputs
list, -
and the template functions.
The parsing function parses data from a dataset file into a list of dictionaries. The implementation of the function is different for every input format.
The Inputs
is inserted as the inputs
argument and contains a list of dictionaries with file
field containing
filename (without an extension) and template
field, containing a reference to the template function to process the
file.
The template functions are the most important part, as they are needed to create TypeQL insert queries from information parsed from dataset files.
Insert query examples
To insert an instance of a company
type we have the following template:
function companyTemplate(company) {
return `insert $company isa company, has name "${company.name}";`;
}
Example:
Given the company item is:
{ name: "Telecom" }
Returns:
insert $company isa company, has name "Telecom";
To insert an instance of a person
type we have the following template:
function personTemplate(person) {
const { first_name, last_name, phone_number, city, age } = person;
// insert person
let typeqlInsertQuery = `insert $person isa person, has phone-number "${phone_number}"`;
const isNotCustomer = typeof first_name === "undefined";
if (isNotCustomer) {
// person is not a customer
typeqlInsertQuery += ", has is-customer false";
} else {
// person is a customer
typeqlInsertQuery += `, has is-customer true`;
typeqlInsertQuery += `, has first-name "${first_name}"`;
typeqlInsertQuery += `, has last-name "${last_name}"`;
typeqlInsertQuery += `, has city "${city}"`;
typeqlInsertQuery += `, has age ${age}`;
}
typeqlInsertQuery += ";";
return typeqlInsertQuery;
}
Example:
Given the person
item is:
{ phone_number: "+44 091 xxx" }
Returns:
insert $person isa person, has phone-number "+44 091 xxx";
or:
Given the person
item is:
{ first_name: "Jackie", last_name: "Joe", city: "Jimo", age: 77, phone_number: "+00 091 xxx"}
Returns:
insert $person isa person, has phone-number "+44 091 xxx", has first-name "Jackie", has last-name "Joe", has city "Jimo", has age 77;
To insert an instance of a contract
type we have the following template:
function contractTemplate(contract) {
const { company_name, person_id } = contract;
// match company
let typeqlInsertQuery = `match $company isa company, has name "${company_name}"; `;
// match person
typeqlInsertQuery += `$customer isa person, has phone-number "${person_id}"; `;
// insert contract
typeqlInsertQuery +=
"insert (provider: $company, customer: $customer) isa contract;";
return typeqlInsertQuery;
}
Example:
Given the contract
item is:
{ company_name: "Telecom", person_id: "+00 091 xxx" }
Returns:
match $company isa company, has name "Telecom"; $customer isa person, has phone-number "+00 091 xxx"; insert (provider: $company, customer: $customer) isa contract;
To insert an instance of a call
type we have the following template:
function callTemplate(call) {
const { caller_id, callee_id, started_at, duration } = call;
// match caller
let typeqlInsertQuery = `match $caller isa person, has phone-number "${caller_id}"; `;
// match callee
typeqlInsertQuery += `$callee isa person, has phone-number "${callee_id}"; `;
// insert call
typeqlInsertQuery += `insert $call(caller: $caller, callee: $callee) isa call; $call has started-at ${started_at}; $call has duration ${duration};`;
return typeqlInsertQuery;
}
Example:
Given the call
item is:
{ caller_id: "+44 091 xxx", callee_id: "+00 091 xxx", started_at: 2018-08-10T07:57:51, duration: 148 }
Returns:
match $caller isa person, has phone-number "+44 091 xxx"; $callee isa person, has phone-number "+00 091 xxx"; insert $call(caller: $caller, callee: $callee) isa call; $call has started-at 2018-08-10T07:57:51; $call has duration 148;