Introduction
Maintaining a version history of SQL Server data has many benefits, but the
top three are:
- Undo history
- Auditing
- Soft delete
Record versioning imposes a layer on top of CRUD operations that make them
more complex, especially on Update and Delete. No longer can you simply delete a
record. Instead, you must flag it as deleted ("soft delete"). No longer can you
simply update a record; instead, you must perform a soft delete followed by an
insert.
Reporting is also a challenge. Most reporting frameworks do not understand
the concept of versioned data. They expect each record to be a distinct data
item, not a 'version' of a data item it has already seen.
No versioning solution will circumvent the fundamental challenges to
versioned records, but we can greatly improve on the traditional approach to
auditing data.
The typical solution
Record versioning is normally accomplished by creating separate audit tables
that mirror the schema of the main tables.

Does that look right to you? It has a few bad smells to me.
- I don't like the schema duplication. It adds to the clutter of tables,
makes maintenance more difficult, and in general makes it harder for new
developers to digest. I think we can do better.
- It feels unsafe. The fundamental principal of moving data involves
deleting from the old destination. I think we can do better.
- It's not friendly with ORM. If you use ORM tools or handle your audit
trail with business objects, you are forced to copy each field explicitly,
from the old business object to the new 'audit' business object. Again, I
think we can do better.
- It doesn't scale. If I want to add a BlogComment table, I have to add
another audit table. Now not only do we have schema duplication, but we have
duplicate abstractions of auditing that can grow apart over time. Imagine
years later when the BlogComment_Archive table behaves differently than the
Blog_Archive table. That makes it even harder to comprehend the schema.
- I don't want to write reports against this schema. I have to point to
separate tables when I want historical drill-down and that seems
unnecessary.
The better plan
As it turns out, we indeed can do much, much better! By using some clever
entity inheritance, we can solve the audit problem for all tables in the
database instead of just one at a time.
The solution involves a base audit table from which all auditable tables
inherit. Study the following diagram and then look at the notes below.

- The Audit table contains all the version information. Date stamp, active
state, who updated it. Your audit requirements may include other fields
here.
- The Audit table contains a PermanentRecordId. Because the PK of Blog records
will change as you insert new versions, we need a permanent ID to identify a
single blog entry and group versions of the same blog entry.
- The Blog table shares a 1:1 relationship on primary keys. Audit.Id is
the PK and Blog.Id is the FK. Notice that Blog.Id is not an identity
column.
- The Blog table looks much cleaner without all the audit garbage
distracting. When you look at the Blog table, you immediately understand its
purpose. Likewise, the Audit table is immediately obvious.
CRUD Operations
Let's try out a few CRUD operations to see how this new approach feels.
Create (Insert) and Read (Select)
Entity inheritance generally requires two more more insertions because you
must insert into multiple tables for one complete 'record'.
In this case, insertion now involves two operations. First, you must insert a
record into the base Audit table. Then, you must get the PK ID of that inserted
record for use with the second insertion into the Blog table.
Multiple insertions per operation is one drawback to the entity inheritance
strategy, but it can be encapsulated. For example, the following insertion
sample could be converted into a stored procedure that takes the Blog table
values and the value for Audit.Updated_By.
delete from blog
delete from comment
delete from audit
declare @id bigint
insert into Audit(Updated_By) values ('Ben')
select @id = @@identity
insert into Blog (Id, UserId, Title, Body) values ( @id, 1, 'My first entry', 'My blog body')
select * from Blog join Audit on Blog.Id = Audit.Id where IsActive=1

As you can see, the Audit table kicked right in and did its job. We have a
PermanentRecordId for this blog entry, and all other information is intact.
Notice also that the Id columns are synchronized, and the record is marked as
active.
While it's true that you must perform an INNER JOIN with a WHERE to select anything using
this strategy, the operation is not costly because it is performed on indexed
fields.
Update
In versioned recording, an Update is really a soft delete followed by an
insert.
Soft deletes are performed directly against the audit table. This is one nice
feature: to perform a soft delete, you don't even need to know the record type.
This means you could make a single stored procedure, spSoftDelete(id) that
accepts the ID of the record to soft delete.
declare @BlogRecordId uniqueidentifier
select @BlogRecordId=PermanentRecordId from Audit where Id = @id
update Audit set IsActive=0 where Id=@id
insert into Audit(Updated_By, PermanentRecordId) values ('Ben', @BlogRecordId)
insert into Blog (Id, UserId, Title, Body)
values ( @@identity, 1, 'My updated first entry', 'My blog body is different now')
select * from Blog join Audit on Blog.Id = Audit.Id

Now we have two entries with the same PermanentRecordId. That means they are
different versions of the same logical record. But notice that the old record is
no longer active. We could select MAX(Created), but an active flag is faster,
and as you will see for Undo operations, necessary.
Delete
Deletes retire all versions of a record. This can be performed on the Audit
table alone, making it easy to encapsulate.
update Audit set IsActive=0 where PermanentRecordId=@BlogRecordId
select * from Blog join Audit on Blog.Id = Audit.Id

Alternately, if you have the exact Blog.Id or Audit.Id, you can set IsActive
that way. As long as all your Update operations are done correctly, there should
be only one record where IsActive=1. You could even set a constraint to check
this.
CRUD Extensions
Versioning opens the possibility of extended operations, such as undo/redo. To use these CRUD extensions, add two more versions for a total of 4 blog
entries.
insert into Audit(Updated_By, PermanentRecordId) values ('Ben', @BlogRecordId)
insert into Blog (Id, UserId, Title, Body)
values ( @@identity, 1, 'My 3rd updated first entry', 'My blog body is different now')
update Audit set IsActive=0 where PermanentRecordId=@BlogRecordId
insert into Audit(Updated_By, PermanentRecordId) values ('Ben', @BlogRecordId)
insert into Blog (Id, UserId, Title, Body)
values ( @@identity, 1, 'My 4th updated first entry', 'My blog body is different now')
Undo/Rollback/Revert
An undo operation involves a few steps:
- Find the version directly preceding the active version, if there is one.
- Soft-delete the active version
- Activate the version you found in step 1
In the sample below, we find the version preceding the active version by
relying on primary keys for chronological order. Version #3 will always have a
PK ID smaller than version #4 of the same record. Some programmers do not like using indexed primary keys to
determine chronological order. If you are among these, you may want to
consider using an alternate indexed column to maintain chronological order.
(Note: the SQL Server 'timestamp' data type will not work because
records are updated when their Active status changes, and this changes the
timestamp value).
select top 1 @id=Id from Audit where PermanentRecordId=@BlogRecordId
and Id < (select Id from Audit where PermanentRecordId=@BlogRecordId and IsActive=1) order by Id desc
update Audit set IsActive=0 where PermanentRecordId=@BlogRecordId
update Audit set IsActive=1 where Id = @id
select * from Blog join Audit on Blog.Id = Audit.Id
select top 1 @id=Id from Audit where PermanentRecordId=@BlogRecordId
and Id < (select Id from Audit where PermanentRecordId=@BlogRecordId and IsActive=1) order by Id desc
update Audit set IsActive=0 where PermanentRecordId=@BlogRecordId
update Audit set IsActive=1 where Id = @id
select * from Blog join Audit on Blog.Id = Audit.Id
select top 1 @id=Id from Audit where PermanentRecordId=@BlogRecordId
and Id < (select Id from Audit where PermanentRecordId=@BlogRecordId and IsActive=1) order by Id desc
update Audit set IsActive=0 where PermanentRecordId=@BlogRecordId
update Audit set IsActive=1 where Id = @id
select * from Blog join Audit on Blog.Id = Audit.Id
select top 1 @id=Id from Audit where PermanentRecordId=@BlogRecordId
and Id < (select Id from Audit where PermanentRecordId=@BlogRecordId and IsActive=1) order by Id desc
update Audit set IsActive=0 where PermanentRecordId=@BlogRecordId
update Audit set IsActive=1 where Id = @id
select * from Blog join Audit on Blog.Id = Audit.Id

Redo
Redo is a simple reversal of the undo operation. Find the first version
following the active version, and activate it.
select top 1 @id=Id from Audit where PermanentRecordId=@BlogRecordId
and Id > (select Id from Audit where PermanentRecordId=@BlogRecordId and IsActive=1) order by Id asc
update Audit set IsActive=0 where PermanentRecordId=@BlogRecordId
update Audit set IsActive=1 where Id = @id
select * from Blog join Audit on Blog.Id = Audit.Id

Adding new auditable tables
Adding new tables under our version control system is easy, and the CRUD code
you already saw above is easily adapted, especially if you choose to encapsulate
in stored procedures.
Let's add a table to support versionable, nested blog comments to demonstrate
how similar the CRUD is for a new table. Also, notice that the CRUD demonstrated
above requires zero changes when a new table is added. Outstanding.
At first, supporting multiple records from multiple tables sounds impossibly difficult, but it works with
almost no added effort.

- Comment.Id is an FK to Audit.Id just like Blog.Id
- Comment.PermanentBlogId will store the PermanentId for the blog entry.
You could use an FK directly to Blog.Id, but the downside is that you would
have to update all FK references when the active version of the Blog record
changes. This way, you give up a little referential integrity (that you
could add back with constraints if you wanted to), but you gain simplicity
through decoupled revision changes.
- Comment.ParentId points to Comment.Id to allow for nested
comments
So this is getting interesting! Now, comments can have versions just like
blog entries, but nothing is ever lost. Let's check our CRUD again to be sure
everything works.
Create (Insert)
The only difference here is that we need to reference the PermanentBlogId.
That is a requirement only because Comments are owned by Blogs. Fortunately, we have that from earlier in the
demo script. If you had the Blog.Id,
you could use that to get the PermanentId of the Blog entry.
insert into Audit(Updated_By) values ('Commenter')
select @id = @@identity
insert into Comment (Id, PermanentBlogId, CommentingUserId, Comment)
values ( @id, @BlogRecordId, 2, 'My first comment')
select * from Comment join Audit on Comment.Id = Audit.Id where IsActive=1

If you study this, you will see that this comment version has a different
PermanentRecordId. That is correct because this is a different set of versioned
data. Each comment will get its own PermanentId.
Update
Update is identical excepting different values for the insertion.
declare @CommentRecordId uniqueidentifier
select @CommentRecordId=PermanentRecordId from Audit where Id = @id
update Audit set IsActive=0 where Id=@id
insert into Audit(Updated_By, PermanentRecordId) values ('Commenter', @CommentRecordId)
insert into Comment (Id, PermanentBlogId, CommentingUserId, Comment)
values ( @@identity, @BlogRecordId, 1, 'My updated first comment')
select * from Comment join Audit on Comment.Id = Audit.Id

Select all comments for a blog entry
Okay, so you're convinced now that the versioning works. You can see that the
rollback and restore operations will be performed on the PermanentId values
associated with each comment, and you can see each comment gets its own
PermanentId. It's confusing to imagine that both Blog entries and Comments have
versions!
So, how do you select all the comments for the current version of the blog?
Easy.
insert into Audit(Updated_By) values ('Commenter #2')
select @id = @@identity
insert into Comment (Id, PermanentBlogId, CommentingUserId, Comment)
values ( @id, @BlogRecordId, 2, 'I have a comment too')
select * from Comment join Audit on Comment.Id = Audit.Id where IsActive=1
declare @parentId bigint
select @parentId = @id
insert into Audit(Updated_By) values ('Commenter #3')
select @id = @@identity
insert into Comment (Id, PermanentBlogId, CommentingUserId, Comment, ParentId)
values ( @id, @BlogRecordId, 3, 'I have a comment about the comment', @parentId)
select * from Comment join Audit on Comment.Id = Audit.Id where IsActive=1
select * from Comment join Audit on Comment.Id = Audit.Id where IsActive=1 and
PermanentBlogId = @BlogRecordId order by Created asc

Further study
If you are interested in this approach, I recommend looking into the
following advanced topics:
- Using record versioning with your favorite ORM tool
- Using record versioning with code generated DALs
- Entity/table inheritance
- Hierarchical versions (for example, if you wanted a Blog rollback to also
roll Comments back)
- Writing views for reports
- Encapsulating and abstracting insert/update operations
About Benjamin Allfree
Benjamin Allfree runs
Launchpoint Software Inc., a technology-agnostic company specializing in
unparalleled custom software solutions. Benjamin spends most of his time
researching architecture and new technology.