If you are like me, you absolutely hate to write the database code needed to upgrade existing customer database installations to the newest database schema.
After looking around in the Internet, I came to the conclusion that there are no free and good solutions (actually, I did not find any commercial ones either) that I can use to support these upgrades. So I had to write my own
Before delving any deeper into the architecture of the solution, I have to mention that the upgrade framework I'm going to show you does not include support for triggers, views, or any fancy constraints (like foreign key constraints that are not really supported in SQLite anyway). So, if you really want to use it to your advantage, I'd suggest you either drop them from your schema, or extend the framework to support them.
The framework does support simple tables and indexes pretty well, which I think will be enough to make this framework useful to most developers.
So, without further ado, let’s start with some definitions.
- Database Schema - The meta-data that describes the structure of a given database without considering the actual data it contains. For example, a database schema describes the fields of a table or the columns of an index.
- Database Upgrade Module - A module that knows how to upgrade a database that has an X database schema and convert it to a database that has a Y database schema.
- Upgrade Path - A list of modules that, when applied successively, will upgrade a database from the source schema to the target schema. For example, if we have an upgrade module that knows to upgrade from schema X to schema Y, and another upgrade module that knows how to upgrade from schema Y to schema Z, then we can apply them successively in order to upgrade from schema X to schema Z.
- Optimal Upgrade Path - All the upgrade modules in the system form a kind of an upgrade graph where every node is an upgrade module, and the arcs connect one upgrade module to another if the destination schema of one upgrade module is the source schema of another upgrade module. In order to shorten the upgrade time, we always want to find the shortest path in the upgrade graph. In my framework, I use a simple BFS algorithm for this purpose.
The following picture illustrates the definitions above:
In this picture, we see that originally, the database went through three separate schema upgrades (T, R, and W) before changing to the final Z schema. At this point, the developers wanted to make the transition from schema X to schema Z faster for existing clients, and so they chose to write a separate upgrade module that knows how to upgrade from schema X to schema Z directly.
The upgrade path from X to Z using the x->z upgrade module (marked with orange color) is the optimal upgrade path because it involves the minimal number of upgrade modules.
The upgrade framework supports two vital operations in order to make it useful:
- Loading all upgrade modules into memory and creating the upgrade graph in memory. In this phase, every upgrade module is queried for its ‘FROM’ schema and its ‘TO’ schema. Every schema is assigned a separate identifier, and a graph is constructed with these identifiers as its nodes.
- Doing the DB upgrade. This is done by executing the following actions:
- Loading and parsing the database schema of the database file we want to upgrade.
- Find the optimal upgrade path (the list of upgrade modules) that is needed in order to upgrade from the database schema of the database file to the schema that was specified as the destination schema.
- Clone the original database file into a temporary file (this will prevent changes to the original database in case the upgrade fails).
- Run every upgrade module in the upgrade path. For every upgrade module, verify that the database schema was indeed changed to the ‘TO’ schema defined in the upgrade module after it finished running.
- Once all upgrade modules finish running, vacuum the database in order to defragment it.
- Rename the temporary database file to have the name of the original database file and delete the original database file.
The following diagrams illustrate the key players in the framework architecture:
The basic building block of the upgrade process is the
IDbUpgrader interface which represents a single upgrade module that knows how to upgrade a database file that has a ‘From’ schema to a database file that has a ‘To’ schema. Every upgrade module also has a
Name property that is used during the process for progress notifications.
public interface IDbUpgrader
event DbUpgradeProgressEventHandler DbUpgradeProgress;
void Upgrade(string dbPath);
Every upgrade module supports the
Upgrade method and a
Cancel method that are called from the upgrade manager module. It also supports progress notifications via the
All automatic upgrade code is located in the
BasicDbUpgrader class that implements the
IDbUpgrader class and provides various hooks that can be used by derived classes in order to override some (or all) of the decisions made by the basic upgrade module code.
The default behavior of the basic database upgrade module class is to allow all actions that are needed in order to transform the database file schema to its ‘To’ schema. In order to support fine-tuning where needed, the basic upgrade module provides the following hooks:
AllowIndexAddition – returning
false will instruct the upgrade module not to add the index whose name is specified in the call. This allows the concrete class to add the index later when its
DoSpecificUpgrades method is called.
AllowIndexDeletion – returning
false will instruct the upgrade module not to delete the index whose name is specified in the call. This allows the concrete class to delete the index later when its
DoSpecificUpgrades method is called.
AllowTableAutoUpgrade – returning
false will instruct the upgrade module not to do automatic upgrade to a table. This allows the concrete class to handle the upgrade issue later when its
DoSpecificUpgrades method is called. The basic upgrade module will try to choose the least costly way to upgrade a table –
ALTER TABLE if possible, or a complete re-write of the existing table otherwise.
AllowTableCreation – returning
false will instruct the upgrade module not to create the table whose name is specified in the call. Note that a new table is created without any data. If you need to add rows to such tables, you can use the
DoSpecificUpgrades hook. Note that when a table is created, all of its indexes are created as well so you won’t get separate
AllowIndexAddition hook calls.
AllowTableDeletion – returning
false will instruct the upgrade module not to delete the table whose name is specified in the call. Note that when a table is deleted, all of its indexes are deleted as well so you won’t get separate
AllowIndexDeletion hook calls.
AllowTableReplacement – The upgrade module deems that a table should be replaced if all of its original columns were replaced by other columns. Replacing a table is the same as dropping it and than creating it again (according to the new schema). All previous information in the table is deleted.
DoSpecificUpgrades – After all automatic upgrade operations are finished, the basic upgrade module will call this method in order to provide the concrete upgrade module class with a hook for doing more elaborate upgrade operations if necessary. For example, sometimes, we don’t want the upgrade module to delete a table because we want to use its data in order to populate some other table. In such a case, we’ll prevent the deletion of that table by returning
false in the
AllowTableDeletion hook, and we’ll put code in the
DoSpecificUpgrades method that will copy the contents of the table and in the end delete it.
Aside from supporting these hooks, the basic upgrade module also provides some helper methods for concrete upgrade module classes that derive from it:
CreateIndex – Requests the creation of the index specified in the arguments of the method.
CreateTable – Requests the creation of the table specified in the arguments of the method.
DropIndex – Requests the deletion of the index specified in the arguments of the method.
DropTable – Requests the deletion of the table specified in the arguments of the method
ReplaceIndex – Requests replacing an existing index by the specified index.
When parsing SQL schema files or the embedded database schema (using the SQLITE_MASTER table), the upgrade framework uses the facilities of the four classes shown below besides the basic upgrade module class:
DbSchema class – encapsulates the structure of the database. Currently, it does not support views or triggers, only tables and indexes.
DbTable class – encapsulates the structure of a database table. It contains the list of columns, the list of primary keys, and the name of the table.
DbColumn class – encapsulates the structure of a single database table column. It contains the name of the column, its type and size descriptors, and the various constraints of the column. As explained earlier, not all constraints are supported, so if you want to work with the upgrade framework, you’ll need to remove those constraints that are incompatible (
CHECK constraints are not supported,
REFERENCE constraints are not supported,
ON CONFLICT clauses are not supported as well).
The upgrade manager class is the central class that is responsible for coordinating all upgrade operations. It supports the following methods:
SetUpgraders – Here, you provide the list of upgrade modules that you want to support. The method will construct an upgrade graph internally. This graph will be used to construct an upgrade path later.
UpgradeDB – Here, you provide the path to the DB file you want to upgrade and the target schema to which you want to upgrade. Note that the target schema must be supported by one of your upgrade modules (otherwise, the upgrade manager will not be able to construct an upgrade path).
Cancel – Called in case the user wants to cancel the upgrade process in the middle. It will call the
Cancel method of all relevant upgrade modules and stop the upgrade process.
During the upgrade process, the upgrade manager class will issue progress events via the
DbUpgradeProgress event. These can be handled by your GUI for displaying the progress of the upgrade to the user.
Using the code
In order to make it easier to use the code, I've attached a sample application that includes the upgrade engine library. The attached solution has the following structure:
- DB - contains the latest SQL schema definition file for the database.
- Libs - contains the SQLite .NET provider DLL.
- DbUpgraders - a class library project that contains sample database upgrade classes.
- SchemaExtractor - a utility application to extract the SQL schema definition of SQLite databases. The SQL schema files are then used as embedded resources in the upgrade utility (see the code in the DbUpgraders library).
- TestApp - a skeletal upgrade utility application that can be used, with a few modifications on your part, in order to perform the necessary database upgrades.
- UpgradeEngine - the database upgrade framework (a class library).
- DbVersions - a folder that contains a historical version of the database for each supported version. This folder is very useful when debugging database upgrades.
I've tried to explain the internal structure and use of the SQLite database upgrade framework as much as I can, but I'm not perfect. If you think my explanations are lacking, please drop me an email with your questions, and I'll do my best to answer them as soon as I can.
- 02 February 2009 - Version 1.0.