Click here to Skip to main content
12,510,238 members (54,031 online)
Click here to Skip to main content
Add your own
alternative version

Stats

67.7K views
32 bookmarked
Posted

Audit Trail – Tracing Data Changes in Database

, 30 Aug 2010 Public Domain
Rate this:
Please Sign up or sign in to vote.
Logging of data changes in relational databases

Introduction

The common requirement in many enterprise applications is logging of data changes in a database - what data has changed, who changed them and when (audit logging). Bloggers published many articles about this old problem but there are only few general approaches on how to do that in relational databases.

Promiscuous Women Cause Problems for Database Administrators

This is a fiction story that never happened:

Juliet Capulet meets Paul Smith; they fell in love and marry on 25th August 2009. When Juliet is back at work, she asks the db admin to change her name in a company database. The admin is not happy to have additional work with this stuff but he does it for her.

Juliet's and Paul's marriage is not going well and Juliet divorces Paul after a short period. Soon Juliet feels alone, she meets Romeo Montague and, of course, falls in desperate love with him. Logically, another wedding is arranged and Juliet asks db admin to change her name again on 10th January 2010.

The tragedy of all Juliet's love relationships finishes when she finds her husband Romeo as a roulette player spending all their money. She divorces again and changes her name to Juliet Singleton on 21st July 2010 with the vision not to marry anybody again. Needless to say, the db admin is close to committing suicide because his database does not allow changing the name more than twice.

What would you suggest the database admin to do better in his next life?

Problem Description

Let us summarize what the db admin was asked to do with Juliet's personal data:

  • Juliet Capulet gets the job on 2005-03-01
  • Name Juliet Capulet changes to Juliet Smith on 2009-08-25
  • Name Juliet Smith changes to Juliet Montague on 2010-01-10
  • Name Juliet Montague changes to Julie Singleton on 2010-07-21

Solution No. 1: Roll off Out-dated data

The simplest solution for keeping historical data in the database is to create a duplicated record before update. After that, we can update the requested data. In case of tracing Juliet's name, we get these data in the database table:

idnamecreated_datecreated _by
124Juliet Capulet2005-03-01admin
124Juliet Smith2009-08-25admin
124Juliet Montague2010-01-10admin
124Juliet Singleton2010-07-21admin

In real database schema, id is primary key and has to be unique. So we need an extra column containing information about row id from which data were rolled off. Additionally, it would be helpful to have a column representing status of the record. Because the situation gets more complex, here I show how data evolves in the database:

Juliet Capulet gets the job on 2005-03-01.

idoriginal_idstatusnamecreated_datecreated _by
124124activeJuliet Capulet2005-03-01admin

Name Juliet Capulet changes to Juliet Smith on 2009-08-25.

idoriginal_idstatusnamecreated_datecreated _by
124124activeJuliet Smith2009-08-25admin
254124historyJuliet Capulet2005-03-01admin

Name Juliet Smith changes to Juliet Montague on 2010-01-10.

idoriginal_idstatusnamecreated_datecreated _by
124124activeJuliet Montague2010-01-10admin
254124historyJuliet Capulet2005-03-01admin
347124historyJuliet Smith2009-08-25admin

Name Juliet Montague changes to Julie Singleton on 2010-07-21.

idoriginal_idstatusnamecreated_datecreated _by
124124activeJuliet Singleton2010-07-21admin
254124historyJuliet Capulet2005-03-01admin
347124historyJuliet Smith2009-08-25admin
489124historyJuliet Montague2010-01-10admin

You can see that each change to the record with id=124 invokes:

  1. creating duplicated row with status=history and
  2. updating the column name to a requested value with created_date set to current date.

When the active record is going to be "deleted", then only the status flag changes the value from active to history.

Advantages

It is simple to implement audit logging based on rolling out old data. You do not need additional tables. Maintenance is simple, too. If you decide to remove old data, then it is the matter of one SQL.

Disadvantages

But what if you need to implement audit logging in more tables? You need to add columns original_id, status, created_date and created_by. And you have to implement previously introduced roll-off logic, either to database directly (usually with table triggers) or in the application.

The problem can be that when data changes, then the whole record is copied, i.e. also data which does not change. It causes data duplication and, potentially, database disk space can increase markedly if changes occur frequently. For example, if table persons has a column photo with binary data containing the photography then each and every time Juliet's name changes the whole record (including the photo) is copied to rolled-off record.

Another disadvantage is that the complexity of each table supporting audit logging increases. You must have in mind all the time that retrieving the records is not simple. You always have to use the SELECT clause with condition:

SELECT * FROM persons WHERE status='active'

When you need to roll back historical data for a particular record, you have to copy all columns from the historical record to the active record. It would be easier just to switch status value from "history" to "active", but it would break reference integrity if the record was referenced in other tables.

Solution No. 2: Dedicated Data-Tracing Table

Another approach is based on separate audit log table that is dedicated to logging data changes coming from all tables having audit log feature:

idtablecolumnrowchanged_datechanged_byold_valuenew_value
1241personsname1242005-03-01adminJuliet CapuletJuliet Capulet
1654personsname1242009-08-25adminJuliet CapuletJuliet Smith
2547personsname1242010-01-10adminJuliet SmithJuliet Montague
3645personsname1242010-07-21adminJuliet MontagueJuliet Singleton

Advantages

This solution separates the concerns much better. There is only one common data store for all historical data. Further, you have to implement audit log feature only once for all tables which should support it.

Compared to the previous solution, when the record changes, only the changed data are logged. Thus it is expected to be a little bit faster but I did not do any tests so I cannot prove it exactly. The real advantage is that you need lower disk space to store data changes. You can improve the table structure when you normalize the log table and you put just integer references to "table" and "column".

You can easily retrieve user activity just by the query

SELECT * FROM audit_log WHERE changed_by='admin'

Disadvantages

All data changes are logged in a common table so old and new values stored there must be of some generic type. For example, in Microsoft SQL, it can be "sql_variant". Or you can convert the values to "varchar" before storing them in the log table.

Rollback is a rather difficult task to be implemented. If you want to roll data back to 3rd March 2010, you have to use query:

SELECT [table], [column], old_value, new_value FROM audit_log WHERE _
  [table]='persons' AND row=124 and changed_date > '2010-03-01' ORDER BY changed_date

and then you iterate through the result set and build the dynamic query because table and column information is in the result. It is more complicated compared to the previous solution. Fortunately, the feature has to be implemented only once and then can be reused across the whole application.

History

  • 30th August, 2010: Initial post

License

This article, along with any associated source code and files, is licensed under A Public Domain dedication

Share

About the Author

Petr Kozelek
Software Developer (Senior)
Czech Republic Czech Republic
No Biography provided

You may also be interested in...

Pro
Pro

Comments and Discussions

 
GeneralTaking the next step... Pin
nigelren31-Aug-10 22:00
membernigelren31-Aug-10 22:00 
GeneralRe: Taking the next step... Pin
Petr Kozelek1-Sep-10 4:06
memberPetr Kozelek1-Sep-10 4:06 
GeneralRe: Taking the next step... Pin
nigelren1-Sep-10 7:34
membernigelren1-Sep-10 7:34 
GeneralRe: Taking the next step... Pin
Jerry Jeremiah29-Sep-10 12:35
memberJerry Jeremiah29-Sep-10 12:35 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Terms of Use | Mobile
Web02 | 2.8.160929.1 | Last Updated 30 Aug 2010
Article Copyright 2010 by Petr Kozelek
Everything else Copyright © CodeProject, 1999-2016
Layout: fixed | fluid