|
|||||||||||||||||||||
|
|||||||||||||||||||||
|
Announcements
Want a new Job?
Chapters
Services
Feature Zones
|
Note: This is an unedited contribution. If this article is inappropriate,
needs attention or copies someone else's work without reference then please
Report This Article
IntroductionMaintaining a version history of SQL Server data has many benefits, but the top three are:
Record versioning imposes a layer on top of CRUD operations that make them more complex, especially on Update and Delete. No longer can you simply delete a record. Instead, you must flag it as deleted ("soft delete"). No longer can you simply update a record; instead, you must perform a soft delete followed by an insert. Reporting is also a challenge. Most reporting frameworks do not understand the concept of versioned data. They expect each record to be a distinct data item, not a 'version' of a data item it has already seen. No versioning solution will circumvent the fundamental challenges to versioned records, but we can greatly improve on the traditional approach to auditing data. The typical solutionRecord versioning is normally accomplished by creating separate audit tables that mirror the schema of the main tables.
Does that look right to you? It has a few bad smells to me.
The better planAs it turns out, we indeed can do much, much better! By using some clever entity inheritance, we can solve the audit problem for all tables in the database instead of just one at a time. The solution involves a base audit table from which all auditable tables inherit. Study the following diagram and then look at the notes below.
CRUD OperationsLet's try out a few CRUD operations to see how this new approach feels. Create (Insert) and Read (Select) Entity inheritance generally requires two more more insertions because you must insert into multiple tables for one complete 'record'. In this case, insertion now involves two operations. First, you must insert a record into the base Audit table. Then, you must get the PK ID of that inserted record for use with the second insertion into the Blog table. Multiple insertions per operation is one drawback to the entity inheritance strategy, but it can be encapsulated. For example, the following insertion sample could be converted into a stored procedure that takes the Blog table values and the value for Audit.Updated_By. /* Setup */ delete from blog delete from comment delete from audit declare @id bigint /* ------- */ /* Blog insert */ insert into Audit(Updated_By) values ('Ben') select @id = @@identity insert into Blog (Id, UserId, Title, Body) values ( @id, 1, 'My first entry', 'My blog body') select * from Blog join Audit on Blog.Id = Audit.Id where IsActive=1
As you can see, the Audit table kicked right in and did its job. We have a PermanentRecordId for this blog entry, and all other information is intact. Notice also that the Id columns are synchronized, and the record is marked as active. While it's true that you must perform an INNER JOIN with a WHERE to select anything using this strategy, the operation is not costly because it is performed on indexed fields. Update In versioned recording, an Update is really a soft delete followed by an insert. Soft deletes are performed directly against the audit table. This is one nice feature: to perform a soft delete, you don't even need to know the record type. This means you could make a single stored procedure, spSoftDelete(id) that accepts the ID of the record to soft delete. /* Blog update */ declare @BlogRecordId uniqueidentifier select @BlogRecordId=PermanentRecordId from Audit where Id = @id update Audit set IsActive=0 where Id=@id insert into Audit(Updated_By, PermanentRecordId) values ('Ben', @BlogRecordId) insert into Blog (Id, UserId, Title, Body) values ( @@identity, 1, 'My updated first entry', 'My blog body is different now') select * from Blog join Audit on Blog.Id = Audit.Id Now we have two entries with the same PermanentRecordId. That means they are different versions of the same logical record. But notice that the old record is no longer active. We could select MAX(Created), but an active flag is faster, and as you will see for Undo operations, necessary. Delete Deletes retire all versions of a record. This can be performed on the Audit table alone, making it easy to encapsulate. /* Blog delete */ update Audit set IsActive=0 where PermanentRecordId=@BlogRecordId select * from Blog join Audit on Blog.Id = Audit.Id Alternately, if you have the exact Blog.Id or Audit.Id, you can set IsActive that way. As long as all your Update operations are done correctly, there should be only one record where IsActive=1. You could even set a constraint to check this. CRUD ExtensionsVersioning opens the possibility of extended operations, such as undo/redo. To use these CRUD extensions, add two more versions for a total of 4 blog entries. /* Setup for CRUD extensions */ insert into Audit(Updated_By, PermanentRecordId) values ('Ben', @BlogRecordId) insert into Blog (Id, UserId, Title, Body) values ( @@identity, 1, 'My 3rd updated first entry', 'My blog body is different now') update Audit set IsActive=0 where PermanentRecordId=@BlogRecordId insert into Audit(Updated_By, PermanentRecordId) values ('Ben', @BlogRecordId) insert into Blog (Id, UserId, Title, Body) values ( @@identity, 1, 'My 4th updated first entry', 'My blog body is different now') /* ------------------------- */ Undo/Rollback/Revert An undo operation involves a few steps:
In the sample below, we find the version preceding the active version by relying on primary keys for chronological order. Version #3 will always have a PK ID smaller than version #4 of the same record. Some programmers do not like using indexed primary keys to determine chronological order. If you are among these, you may want to consider using an alternate indexed column to maintain chronological order. (Note: the SQL Server 'timestamp' data type will not work because records are updated when their Active status changes, and this changes the timestamp value). /* Rollback/undo to previous (to 3rd) */ select top 1 @id=Id from Audit where PermanentRecordId=@BlogRecordId and Id < (select Id from Audit where PermanentRecordId=@BlogRecordId and IsActive=1) order by Id desc update Audit set IsActive=0 where PermanentRecordId=@BlogRecordId update Audit set IsActive=1 where Id = @id select * from Blog join Audit on Blog.Id = Audit.Id /* Rollback/undo to previous (to 2nd) */ select top 1 @id=Id from Audit where PermanentRecordId=@BlogRecordId and Id < (select Id from Audit where PermanentRecordId=@BlogRecordId and IsActive=1) order by Id desc update Audit set IsActive=0 where PermanentRecordId=@BlogRecordId update Audit set IsActive=1 where Id = @id select * from Blog join Audit on Blog.Id = Audit.Id /* Rollback/undo to previous (to 1st) */ select top 1 @id=Id from Audit where PermanentRecordId=@BlogRecordId and Id < (select Id from Audit where PermanentRecordId=@BlogRecordId and IsActive=1) order by Id desc update Audit set IsActive=0 where PermanentRecordId=@BlogRecordId update Audit set IsActive=1 where Id = @id select * from Blog join Audit on Blog.Id = Audit.Id /* Rollback/undo to previous (still on 1st) */ select top 1 @id=Id from Audit where PermanentRecordId=@BlogRecordId and Id < (select Id from Audit where PermanentRecordId=@BlogRecordId and IsActive=1) order by Id desc update Audit set IsActive=0 where PermanentRecordId=@BlogRecordId update Audit set IsActive=1 where Id = @id select * from Blog join Audit on Blog.Id = Audit.Id
Redo Redo is a simple reversal of the undo operation. Find the first version following the active version, and activate it. /* Redo (to 2nd) */ select top 1 @id=Id from Audit where PermanentRecordId=@BlogRecordId and Id > (select Id from Audit where PermanentRecordId=@BlogRecordId and IsActive=1) order by Id asc update Audit set IsActive=0 where PermanentRecordId=@BlogRecordId update Audit set IsActive=1 where Id = @id select * from Blog join Audit on Blog.Id = Audit.Id
Adding new auditable tablesAdding new tables under our version control system is easy, and the CRUD code you already saw above is easily adapted, especially if you choose to encapsulate in stored procedures. Let's add a table to support versionable, nested blog comments to demonstrate how similar the CRUD is for a new table. Also, notice that the CRUD demonstrated above requires zero changes when a new table is added. Outstanding. At first, supporting multiple records from multiple tables sounds impossibly difficult, but it works with almost no added effort.
So this is getting interesting! Now, comments can have versions just like blog entries, but nothing is ever lost. Let's check our CRUD again to be sure everything works. Create (Insert) The only difference here is that we need to reference the PermanentBlogId. That is a requirement only because Comments are owned by Blogs. Fortunately, we have that from earlier in the demo script. If you had the Blog.Id, you could use that to get the PermanentId of the Blog entry. /* Comment insert */ insert into Audit(Updated_By) values ('Commenter') select @id = @@identity insert into Comment (Id, PermanentBlogId, CommentingUserId, Comment) values ( @id, @BlogRecordId, 2, 'My first comment') select * from Comment join Audit on Comment.Id = Audit.Id where IsActive=1
If you study this, you will see that this comment version has a different PermanentRecordId. That is correct because this is a different set of versioned data. Each comment will get its own PermanentId. Update Update is identical excepting different values for the insertion. /* Comment update */ declare @CommentRecordId uniqueidentifier select @CommentRecordId=PermanentRecordId from Audit where Id = @id update Audit set IsActive=0 where Id=@id insert into Audit(Updated_By, PermanentRecordId) values ('Commenter', @CommentRecordId) insert into Comment (Id, PermanentBlogId, CommentingUserId, Comment) values ( @@identity, @BlogRecordId, 1, 'My updated first comment') select * from Comment join Audit on Comment.Id = Audit.Id Select all comments for a blog entry Okay, so you're convinced now that the versioning works. You can see that the rollback and restore operations will be performed on the PermanentId values associated with each comment, and you can see each comment gets its own PermanentId. It's confusing to imagine that both Blog entries and Comments have versions! So, how do you select all the comments for the current version of the blog? Easy. /* Insert 2nd comment */ insert into Audit(Updated_By) values ('Commenter #2') select @id = @@identity insert into Comment (Id, PermanentBlogId, CommentingUserId, Comment) values ( @id, @BlogRecordId, 2, 'I have a comment too') select * from Comment join Audit on Comment.Id = Audit.Id where IsActive=1 /* Insert 3rd nested comment */ declare @parentId bigint select @parentId = @id insert into Audit(Updated_By) values ('Commenter #3') select @id = @@identity insert into Comment (Id, PermanentBlogId, CommentingUserId, Comment, ParentId) values ( @id, @BlogRecordId, 3, 'I have a comment about the comment', @parentId) select * from Comment join Audit on Comment.Id = Audit.Id where IsActive=1 /* Select all comments for this blog entry */ select * from Comment join Audit on Comment.Id = Audit.Id where IsActive=1 and PermanentBlogId = @BlogRecordId order by Created asc | ||||||||||||||||||||