I have a problem. I am designing an app for a competition, and one of its main functions requires comparing two byte arrays. I cannot simply compare the arrays element by element, because both have more than 13,000 elements and the elements are not necessarily in the same order. I need to compare them as a whole and return a percentage of how closely they match each other.
Does anyone have any ideas?

Posted 23-Feb-12 23:24pm
Updated 23-Feb-12 23:54pm
johannesnestler 24-Feb-12 5:23am

A (naive) question: if the corresponding bytes are not in the same order, how do you know which byte in one array corresponds to which byte in the other? I'm thinking in the direction of a diff algorithm, as used to compare files (something like this: http://www.codeproject.com/Articles/6943/A-Generic-Reusable-Diff-Algorithm-in-C-II). What do you think?
Aleksa Krstic 24-Feb-12 14:09pm

That actually helped me a lot, gave me a couple of ideas. Thanks!
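
One way to turn the diff idea into a percentage is the longest common subsequence (LCS), which is the core of many diff tools. The sketch below is only an illustration under that assumption, not the algorithm from the linked article; the helper name `LcsSimilarityPercent` and the choice to divide by the longer array's length are made up for this example. For two arrays of about 13,000 bytes the double loop does roughly 169 million cheap steps, and the two rolling rows keep memory small.

```csharp
using System;

// Hypothetical helper (not from the linked article): percentage of the longer
// array that is covered by the longest common subsequence of the two arrays.
double LcsSimilarityPercent(byte[] a, byte[] b)
{
    if (a.Length == 0 && b.Length == 0) return 100.0;

    // Two rolling rows keep memory at O(b.Length) instead of O(a.Length * b.Length).
    int[] prev = new int[b.Length + 1];
    int[] curr = new int[b.Length + 1];

    for (int i = 1; i <= a.Length; i++)
    {
        for (int j = 1; j <= b.Length; j++)
        {
            curr[j] = a[i - 1] == b[j - 1]
                ? prev[j - 1] + 1
                : Math.Max(prev[j], curr[j - 1]);
        }

        // Swap rows; index 0 is never written, so it stays 0 in both rows.
        int[] tmp = prev;
        prev = curr;
        curr = tmp;
    }

    int lcsLength = prev[b.Length];
    return 100.0 * lcsLength / Math.Max(a.Length, b.Length);
}

byte[] first = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
byte[] second = { 0, 1, 2, 3, 3, 3, 6, 7 };
Console.WriteLine("LCS match = {0}%", LcsSimilarityPercent(first, second));
```

Unlike a position-by-position comparison, this tolerates inserted or deleted bytes, but it still treats the arrays as ordered sequences, so it does not help if the bytes are completely shuffled.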


## Solution 1

You can use `SequenceEqual` for that.
http://msdn.microsoft.com/en-us/library/bb348567.aspx

Good luck!
Aleksa Krstic 24-Feb-12 14:08pm

Thank you, but I need something that returns a numerical value (e.g. 86% of one array matches the other), not a boolean.
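
For reference, a minimal sketch of what `SequenceEqual` does and why it falls short here; the array names are made up for the example.

```csharp
using System;
using System.Linq;

byte[] x = { 1, 2, 3 };
byte[] y = { 1, 2, 3 };
byte[] z = { 3, 2, 1 };

// SequenceEqual compares element by element, in order, and returns a bool.
bool sameOrder = x.SequenceEqual(y);   // true
bool reordered = x.SequenceEqual(z);   // false: same bytes, different order

Console.WriteLine("{0} {1}", sameOrder, reordered);
```

So it answers "are the arrays identical?" but gives no measure of partial similarity.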

## Solution 2

Just an idea:
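```csharp
// bucket[v] = (occurrences of byte v in array1) - (occurrences of v in array2)
int[] bucket = new int[256];
foreach ( var b in array1 )
{
    bucket[b]++;
}

foreach ( var b in array2 )
{
    bucket[b]--;
}

// Every surplus byte on either side lowers the score. Dividing by the combined
// length keeps the result between 0 (no bytes shared) and 100 (same bytes);
// dividing by array1.Length alone could drop the result as low as -100.
// Math.Max avoids a division by zero when both arrays are empty.
double total = Math.Max( 1, array1.Length + array2.Length );
double percent = 100;
for ( int i = 0; i < 256; ++i )
{
    percent -= 100 * ( Math.Abs( bucket[i] ) / total );
}

Console.Out.WriteLine( string.Format( "array1 equals array2 to {0}%", percent ) );
```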

The array `bucket` tracks, for each of the 256 possible byte values, the difference between how often that value occurs in `array1` and how often it occurs in `array2`.
A positive entry means the byte occurs that many more times in `array1` than in `array2`; a negative entry means the opposite.
At the end you can calculate the percentage easily. (At least that's how it should work in my crazy brain ;))
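
A closely related order-independent measure counts the bytes the two arrays have in common (the multiset intersection) and divides by the length of the longer array. The following is a sketch of that variant, not the poster's exact method; the helper name `SharedBytesPercent` and the normalisation by the longer array are assumptions made for this example.

```csharp
using System;

// Hypothetical helper: percentage of bytes the two arrays share, ignoring order.
double SharedBytesPercent(byte[] array1, byte[] array2)
{
    int longer = Math.Max(array1.Length, array2.Length);
    if (longer == 0) return 100.0; // two empty arrays are treated as identical

    // Histogram of byte values for each array.
    int[] count1 = new int[256];
    int[] count2 = new int[256];
    foreach (byte v in array1) count1[v]++;
    foreach (byte v in array2) count2[v]++;

    // A byte value is shared as often as it occurs in both arrays.
    int shared = 0;
    for (int i = 0; i < 256; i++)
    {
        shared += Math.Min(count1[i], count2[i]);
    }

    return 100.0 * shared / longer;
}

byte[] p = { 1, 2, 3, 4 };
byte[] q = { 4, 3, 2, 9 };
Console.WriteLine("Shared bytes = {0}%", SharedBytesPercent(p, q));
```

Because only the histograms are compared, two arrays with the same bytes in a completely different order score 100%. Whether that is desirable depends on what "match" means for the competition.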

## Solution 3

Assuming you have two arrays and you compare by position, then you might use the following:

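```csharp
using System;
using System.Linq;

byte[] a = new byte[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
byte[] b = new byte[] { 0, 1, 2, 3, 3, 3, 6, 7 };

// Elements beyond the end of the shorter array have no partner,
// so dividing by the longer length counts them as mismatches.
int m = Math.Max(a.Length, b.Length);

// Zip pairs elements by position and stops at the end of the shorter array.
int c = a.Zip(b, (x, y) => x == y).Count(equal => equal);

Console.WriteLine("Match = {0} = {1}%", c, 100.0 * c / m);
```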

## Solution 4

Your question does not make clear what kind of comparison you need.
Maybe you are interested in correlation coefficients, as used in audio/image compression?
If so, the following document might help: Correlation in Statistics and in Data Compression.
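
As a rough illustration of what a correlation coefficient would look like for byte arrays, here is a sketch of the Pearson coefficient computed position by position over the common length. The helper name `PearsonCorrelation` is made up, and the code is not taken from the referenced document. The result lies between -1 and 1 rather than 0 and 100, and it depends on element order, so it only fits if position matters for your data.

```csharp
using System;

// Hypothetical helper: Pearson correlation of two byte arrays, compared
// position by position over their common length. Result is in [-1, 1].
double PearsonCorrelation(byte[] a, byte[] b)
{
    int n = Math.Min(a.Length, b.Length);
    if (n == 0) return double.NaN;

    double meanA = 0, meanB = 0;
    for (int i = 0; i < n; i++)
    {
        meanA += a[i];
        meanB += b[i];
    }
    meanA /= n;
    meanB /= n;

    double cov = 0, varA = 0, varB = 0;
    for (int i = 0; i < n; i++)
    {
        double da = a[i] - meanA;
        double db = b[i] - meanB;
        cov += da * db;
        varA += da * da;
        varB += db * db;
    }

    // A constant array has zero variance, so the coefficient is undefined.
    if (varA == 0 || varB == 0) return double.NaN;
    return cov / Math.Sqrt(varA * varB);
}

byte[] s = { 10, 20, 30, 40 };
byte[] t = { 20, 40, 60, 80 };
Console.WriteLine("r = {0}", PearsonCorrelation(s, t));
```

A value near 1 means the arrays rise and fall together, a value near -1 means one rises where the other falls, and a value near 0 means no linear relationship.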