Choose the Right Model: Comparing Relational, Document, and Graph Databases

What exactly is a database model? A database model shows the logical structure of a database, including the relationships and constraints that determine how data can be stored and accessed.

Individual databases are designed based on the rules and concepts of the broader data model the designers adopt. Developers often use these models to strategically design databases for larger-scale real-world projects such as:

Order Management Systems (ERPs). A typical database for a company that—for example—sells building materials might be arranged as follows: each customer in the database is assigned one or more addresses, one or more contact phones, and a default salesperson who acts as the liaison between the customer and the company. Purchase orders are stored in a separate table, recording the kinds of purchased materials, quantities, prices, and the customer responsible for the order.
Healthcare-Related Systems (EMRs). A healthcare provider has multiple offices in many different states. Many doctors work for the company, and each doctor takes care of multiple patients. Some doctors just work in one office, while others work in different offices on different days. In this case, a database table keeps track of information about each doctor, such as name, address, contact methods, area of specialization, and fee structure. Another table might keep records of patients, their medical history, visits, and preferred doctors. In a third table, the healthcare provider might store information about appointments—recording the identities of the patient and doctor, the time and location of the appointment, and the cost.
Scientific Research. Genome and related research areas in molecular and cellular biology are perhaps good examples of use cases for a scientific database. These databases might contain gene catalogs for completely sequenced genomes and some partial genomes, genome maps and organism information, and data about sequence similarities among all known genes in all organisms in the database. It also contains information on molecular interaction networks in the cell and chemical compounds and reactions.

To understand the high-level data structure behind modern software, we will examine three types of models:

Relational
Document
Graph

Each model comes with an example company that is using it. As a bonus, scroll to the end to read about the now-outdated network model.

Relational Database Model

The closest analogue to the relational database is the spreadsheet. But while spreadsheets store linkages between individual cells or rows, relational database store links between data.

Orders from an online store are a canonical example of what a relational database model is good for. Customers have orders, orders have amounts, amounts have currencies, and so forth. These relationships can be expressed with an Entity Relationship Diagram that visually describes how each table links to another.

The benefit of this relationship between tables is that when customer data changes, this data only needs to be updated in one place. For example, if a customer obtains a new email address, a single row can be changed and all orders for that customer automatically relate to the new email address.

Relational databases work well for transactional data with a rigid model. Other types of data such as a product catalog which can change frequently and has many combinations of colors and sizes is better suited for a different data model.

Document Database Model

Whereas an ever-changing data model like a product catalog might not be a fit for relational databases, it is a wonderful fit for document databases. Most of these databases use a nested key/value store like JSON for recording information and we will consider key/value and document data models under the same umbrella.

If you're familiar with JSON, you know how flexible it is. These loose constraints extend to document databases. You can start with a few items, or keys, and add to them over time, deeply nesting them if necessary.

Each key points to a document that contains many values. A document could be an actual document, like a 2000 word blog post, or a number, or a link, or anything. Unlike JSON, most document databases will be strictly typed so that you will need to specify if you're working with an integer or float.

The problem with document databases is that you will not get the same guarantees as you would on a relational database. So you may occasionally see duplicated data and you may struggle to evolve the data model once it gets large. Often times, companies will end up using a mix of document and relational databases. Some relational databases even offer great support for JSON so you do not need to choose.

Example Document Database Model: eBay

MongoDB and Redis are well-known document databases. eBay is a typical example of an e-commerce platform that relies heavily upon the document model. The image below provides a rough example of how the eBay database system works:

Graph Database Model

Graph databases represent each object as connected to each other. This is useful for data like the financial connections between companies, social media, genomics, and other types of data that are well suited for a graph structure. One of the most well-known graph databases is neo4j, which was developed in the 2000s.

To create a graph database, you will have to import your data. Let’s say you have items A, B, and C. If A is connected to B, B is connected to C, and C is connected to A you could represent the data like this.

A, B
B, C
C, A

The graph created from this dataset would be completely connected. We could use the graph database to ask questions like, “How can I get from A to C?” The database would return an answer like “go from A to B, then go from B to C." Note that if we asked “How can I get from C to A?” the database would simply return “go from C to A.”

This is an abstract example. To imagine a real query on a graph database, think of the newsfeeds of various social media apps. The application must request the connections to you, the connections to your friends, posts your friends have liked, and likes on your own posts.

Example Graph Database: Facebook

Facebook has - at last glance - implemented its own graph database called Tao. The diagram above shows just how complicated the connections of a simple post can be. Now imagine an entire newsfeed.

Tao supports a couple of interesting query modes. Point queries are essential IF statements: whether or not one object is connected. Range queries can be used to find things like recent posts. And count queries are like LIST: they get all the items related to the object at hand. Pretty handy if you need to parse the social graph!

Choosing the Right Model

In the article, we explored how to use the relational, document, and graph database models. We looked at how each one worked and what factors to consider when deciding which option works best for your project. Since each has benefits and drawbacks, you'll need to think carefully when deciding on which direction you want to go before you start building.

From the Archives: Network Database Models

The network database model allows multiple records to be linked to the same owner file. The model can be thought of as an inverted tree, where the branches are member information and the owner of those members is the root. This makes for a lot of design flexibility because there are multiple linkages between nodes. Given that, we would consider most nodes in the database to have a many-to-many relationship with each other.

The network database model is no longer widely used, having been replaced by the relational model in the 1970s.

Some database systems that use the network database model include IMAGE for HP 3000, Integrated Data Store, IDMS (integrated database management system), and the Oracle CODASYL DBMS for OpenVMS.

Example Network Database Model: US Army

The US Army and Boeing implemented HP 3000s and still use them today, even though the end of life was 2010. Oracle's network databases are also still used and supported. Review the diagram below and compare it to the diagram of a graph based database. What's different?

Oracle 10g Network Database.