Click here to Skip to main content
Click here to Skip to main content
Technical Blog

Tagged as

String comparison in Delphi

, 14 Dec 2012 CPOL
Rate this:
Please Sign up or sign in to vote.
String comparison in Delphi

Have you ever wondered how utilities like Beyond Compare or DIFF are comparing files? They do it (I guess) by solving the longest common subsequence (LCS) problem.

After reading the Wikipedia article linked above, I obtained an overall view of the problem and I looked at the possible resolutions. So, I decided to implement a Delphi class to do the string comparison trick, which is the base for the text file comparison.

Let me put it as follows: given two strings to be compared, I want to highlight in blue the characters added to the first string and in red the characters removed from it. The common (unchanged) characters will keep the default color.

For example:

String 1 = Delphi allows both structural and object oriented programming.
String 2 = Does Delphi allow object oriented programming?

Highlighted differences:

Does Delphi allows both structural and object oriented programming?

The Delphi class looks like this:

type
  TDiff = record
    Character: Char;
    CharStatus: Char;  //Possible values: [+, -, =]
  end;

  TStringComparer = class
  ……………
  public
    class function Compare(aString1, aString2: string): TList<tdiff>;
  end;</tdiff>

When you call TStringComparer.Compare, a generic list of TDiff records is created. A TDiff record contains a character and whether this character was added (CharStatus = ‘+’), removed (CharStatus = ‘-’) or unchanged (CharStatus = ‘=’) in both strings under comparison.

Let’s drop two edits (Edit1, Edit2), a rich edit (RichEdit1) and a button (Button1) on a Delphi form. To highlight the differences put the following code in the OnClick event of the button:

procedure TForm1.Button1Click(Sender: TObject);
var
  Differences: TList<tdiff>;
  Diff: TDiff;
begin
  //Yes, I know...this method could be refactored ;-)
  Differences:= TStringComparer.Compare(Edit1.Text, Edit2.Text);
  try
    RichEdit1.Clear;
    RichEdit1.SelStart:= RichEdit1.GetTextLen;
    for Diff in Differences do
      if Diff.CharStatus = '+' then
      begin
        RichEdit1.SelAttributes.Color:= clBlue;
        RichEdit1.SelText := Diff.Character;
      end
      else if Diff.CharStatus = '-' then
      begin
        RichEdit1.SelAttributes.Color:= clRed;
        RichEdit1.SelText:= Diff.Character;
      end
      else
      begin
        RichEdit1.SelAttributes.Color:= clDefault;
        RichEdit1.SelText:= Diff.Character;
      end;
  finally
    Differences.Free;
  end;
end;</tdiff>

It looks like in the image below:

For the full implementation read further down. Note that various optimizations could be added to the code below, but I didn’t implement them. Anyway, I hope this helps. Feedback is welcome! Feel free to find and correct bugs Wink | ;-)

unit StringComparison;

interface

uses
  Math, Generics.Collections;

type
  TDiff = record
    Character: Char;
    CharStatus: Char;  //Possible values: [+, -, =]
  end;

  TStringComparer = class
  strict private
    type TIntArray = array of array of Integer;
    class function LCSMatrix(X, Y: string): TIntArray;
    class procedure ComputeDifferences(aLCSMatrix: TIntArray;
                                       X, Y: string;
                                       i, j: Integer;
                                       aDifferences: TList<TDiff>);
  public
    class function Compare(aString1, aString2: string): TList<TDiff>;
  end;

implementation

{ TStringComparer }

class function TStringComparer.LCSMatrix(X, Y: string): TIntArray;
var
  m, n,
  i, j: Integer;
begin
  m:= Length(X);
  n:= Length(Y);

  //We need one extra column and one extra row to be filled with zeroes
  SetLength(Result, m + 1, n + 1);

  //First column filled with zeros
  for i := 0 to m do
    Result[i, 0] := 0;

  //First row filled with zeros
  for j:= 0 to n do
    Result[0, j]:= 0;

  //Storing the lengths of the longest common subsequences
  //between prefixes of X and Y
  for i:= 1 to m do
    for j:= 1 to n do
      if X[i] = Y[j] then
        Result[i, j] := Result[i-1, j-1] + 1
      else
        Result[i, j]:= Max(Result[i, j-1], Result[i-1, j]);
end;

class procedure TStringComparer.ComputeDifferences(aLCSMatrix: TIntArray;
                                                   X, Y: string;
                                                   i, j: Integer;
                                                   aDifferences: TList<TDiff>);
var
  CharDiff: TDiff;
begin
  if (i > 0) and (j > 0) and (X[i] = Y[j]) then
  begin
    ComputeDifferences(aLCSMatrix, X, Y, i-1, j-1, aDifferences);
    CharDiff.Character:= X[i];
    CharDiff.CharStatus:= '=';   //The character did not change
    aDifferences.Add(CharDiff);
  end
  else
    if (j > 0) and ((i = 0) or (aLCSMatrix[i,j-1] >= aLCSMatrix[i-1,j])) then
    begin
      ComputeDifferences(aLCSMatrix, X, Y, i, j-1, aDifferences);
      CharDiff.Character:= Y[j];
      CharDiff.CharStatus:= '+'; //The character was added
      aDifferences.Add(CharDiff);
    end
    else if (i > 0) and ((j = 0) or (aLCSMatrix[i,j-1] < aLCSMatrix[i-1,j])) then
    begin
      ComputeDifferences(aLCSMatrix, X, Y, i-1, j, aDifferences);
      CharDiff.Character:= X[i];
      CharDiff.CharStatus:= '-'; //The character was removed
      aDifferences.Add(CharDiff);
    end;
end;

//This is a factory method
class function TStringComparer.Compare(aString1, aString2: string): TList<TDiff>;
var
  Matrix: TIntArray;
begin
  Result:= TList<TDiff>.Create;
  Matrix:= LCSMatrix(aString1, aString2);
  ComputeDifferences(Matrix,
                     aString1, aString2,
                     Length(aString1), Length(aString2),
                     Result);
end;

end.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

yanniel
Software Developer Digital Rapids
Canada Canada
My name is Yanniel Alvarez Alfonso. I was born in San Antonio de los Baños, Havana Province, Cuba on October 24th, 1982.
 
I majored in Information Technology Engineering at José Antonio Echeverría Polytechnic Institute (CUJAE) in Havana City, Cuba (July 2006). After that, I got a Masters Degree in Applied Computer Science at the same University (May 2009).
 
I used to work as a professor of Information Technology at CUJAE. Right now, I work as a Software Developer in Toronto, Canada. I moved to Canada under the Skilled Worker Program on February 26th, 2010.
 
This is my personal blog: Yanniel's notes; in which I write about miscellaneous topics.
 
The link at the end of this sentence compiles an index of all the articles I have written so far about Delphi Programming.

Comments and Discussions

 
QuestionComparing multiline text ? PinmemberMember 95958083-Jan-13 3:54 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Terms of Use | Mobile
Web01 | 2.8.150331.1 | Last Updated 14 Dec 2012
Article Copyright 2012 by yanniel
Everything else Copyright © CodeProject, 1999-2015
Layout: fixed | fluid