Click here to Skip to main content
Click here to Skip to main content

DeltaScope

, 1 Sep 2009 CPOL
Rate this:
Please Sign up or sign in to vote.
A reusable delta engine with GDI+ front-end controls.

Introduction

This is a candidate entry for The Code Project's Lean and Mean competition to write an efficient text delta tool.

I was checking my emails while on vacation and I came across a Code Project daily mailing with a reminder for the LCS competition. I have enjoyed reading The Code Project articles for many years now, but until this moment I never had the opportunity to contribute.

When I first started with C# I decided to write a diff program in order to really learn the language. I call this program, DeltaScope. What started out as a simple console-based diff tool has gone through many iterations. When I finally stopped work on it, I had a reusable delta engine with GDI+ front-end controls.

Unfortunately, I have not touched this code since 2005. Thankfully my fiancé is very understanding of my “coding time” and I was able to take some time to write this article and bring the code up to C# 3.0 specifications.

Longest Common Subsequence

Many of the entries have excellent explanations of the LCS algorithm. When I wrote my DeltaScope I referred to this paper extensively. http://www.ics.uci.edu/~eppstein/161/960229.html[^]

I find the best way to understand the LCS algorithm, is to think of it like this:
The output of a delta operation is a sequence of instructions to transform file 1 into file 2.

Points of Interest

The core of the system is the DeltaEngine class. The DeltaScope application wraps up the engine in the DeltaScopeEngine class.

public class DeltaScopeEngine 
{
        public bool IgnoreCase { get; set; }
        public bool IgnoreTabs { get; set; }

        public TimeSpan ElapsedTime { get; private set; }
        public long PeakMemoryUsage { get; private set; }

        public List<DeltaBlock> Compare(string tsLeftFile, string tsRightFile)
        {
            var process = Process.GetCurrentProcess();
            process.Refresh();
            var peakWorkingSet64 = process.PeakWorkingSet64;

            var timer = DateTime.Now;
            var deltaEngine = new DeltaEngine();
            var result = deltaEngine.GetDifferences(RipFile(tsLeftFile),
                RipFile(tsRightFile), hashSideDelegate);
            ElapsedTime = DateTime.Now - timer;
            PeakMemoryUsage = process.PeakWorkingSet64 - peakWorkingSet64;
            
            return result;
        }
        private static string RipFile(string tsFileName)
        {
            ...
        }
        private List<DeltaString> hashSideDelegate(string sideValues)
        {
            const string blockLineTerminator = @"\r\n";
            var values = new List<string>(Regex.Split(sideValues, blockLineTerminator));
            var results = new List<DeltaString>();

            for (var lnCnt = 0; lnCnt < values.Count; lnCnt++)
            {
                var lnHashCode = _GenerateHashFromString(values[lnCnt]);
                if (lnHashCode >= 0)
                    results.Add(new DeltaString(values[lnCnt], lnHashCode));
            }
            return results;
        }
        private int _GenerateHashFromString(string tsLine)
        {
            var lnHash = 0;
            var lnRealPtr = 0;
            
            // Handle the upper casing issue...
            var lcaLine = IgnoreCase ? 
                tsLine.ToUpper().ToCharArray() : tsLine.ToCharArray();

            // hash_string(ABCW) = 65*1 + 66*2 + 67*3 + 87*4
            var lnMaxLength = lcaLine.Length;
            for (var lnCnt = 0; lnCnt < lnMaxLength; lnCnt++)
            {
                // Handle the white space characters...
                if (lcaLine[lnCnt] == '\t' && IgnoreTabs)
                    continue;

                lnRealPtr++;
                lnHash += ((byte)lcaLine[lnCnt] * lnRealPtr);
            }

            // return the hash value...
            return (lnHash);
        }
}

The hashAlgorithm delegate lets the developer control how each string is parsed into a list of DeltaStrings. In this simple example we can choose to make the difference case sensative or case insensative, and we can choose to ignore tab characters.

Timings and Memory Consumption

The average run timing is 13ms. I do not include any screen rendering in my timings. As for memory consumption, I always come up with 0 memory used. Which means I am either not using process.PeakWorkingSet64 correctly, or my code is magical. Wink | ;)

Future direction

Many diff tools allow the user to merge the 2 files into 1 new file. With the list of delta blocks that the DeltaEngine generates, it should be possible to allow the user to build a new file. This would require some sort of block choosing ability along with perhaps text editing. Unfortunately the current GDI+ controls are display only.

When I initially wrote this, GDI+ was a pretty new technology, and WPF did not yet exist. I think a good next step would be to use WPF as the display technology for the display. With WPF the swirl block control can be rewritten to include editing abilities.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

NYCChris
Software Developer (Senior)
United States United States
No Biography provided

Comments and Discussions

 
GeneralPrinting PinmemberStevePasha14-Jul-10 20:10 
GeneralRe: Printing PinmemberChristopher Erker15-Jul-10 9:29 
GeneralRe: Printing PinmemberStevePasha16-Jul-10 9:37 
GeneralSource does not build for me due to missing App.Config in the DeltaEngine project PinmvpSacha Barber3-Sep-09 4:51 
GeneralRe: Source does not build for me due to missing App.Config in the DeltaEngine project PinmemberChristopher Erker3-Sep-09 6:57 
GeneralRe: Source does not build for me due to missing App.Config in the DeltaEngine project PinmvpSacha Barber3-Sep-09 7:11 
GeneralRe: Source does not build for me due to missing App.Config in the DeltaEngine project PinmemberChristopher Erker3-Sep-09 7:40 
Generalmetrics PinmvpLuc Pattyn2-Sep-09 17:28 
GeneralRe: metrics PinmemberChristopher Erker2-Sep-09 17:33 
GeneralRe: metrics PinmvpLuc Pattyn2-Sep-09 17:38 
GeneralRe: metrics PinmemberChristopher Erker2-Sep-09 20:08 
GeneralRe: metrics PinmemberChristopher Erker4-Sep-09 10:19 
Generalerror Pinmembermandimandi2-Sep-09 1:01 
GeneralRe: error PinmemberChristopher Erker2-Sep-09 3:45 
GeneralRe: error Pinmembermandimandi2-Sep-09 7:56 
you're welcome!

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Terms of Use | Mobile
Web02 | 2.8.141223.1 | Last Updated 1 Sep 2009
Article Copyright 2009 by NYCChris
Everything else Copyright © CodeProject, 1999-2014
Layout: fixed | fluid