I have a problem: I am designing an app for a competition, and one of its main functions requires comparing two byte arrays. I cannot compare the arrays element by element, because both have more than 13,000 elements and they are not necessarily in the same order. I need to compare them as a whole and get back a percentage of how closely they match each other.
Does anyone have any ideas?

Posted 23-Feb-12 23:24pm
Edited 23-Feb-12 23:54pm
johannesnestler 24-Feb-12 5:23am

A (naive) question: if the corresponding bytes are not in the same order, how do you know which byte represents the "same" one in the other array? I'm thinking in the direction of a diff algorithm, like those used to compare files (something like this: http://www.codeproject.com/Articles/6943/A-Generic-Reusable-Diff-Algorithm-in-C-II). What do you think?
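
To make the diff idea concrete, here is a minimal sketch that scores two arrays by the length of their longest common subsequence (LCS), the quantity most file-diff tools are built around. It is not the code from the linked article; the names `DiffSimilarity`, `LcsLength` and `MatchPercent` are made up for illustration, and dividing by the longer array's length is just one possible way to turn the LCS into a percentage.

```csharp
using System;

public static class DiffSimilarity
{
    // Length of the longest common subsequence of a and b.
    // Only two rows of the DP table are kept, so memory is O(b.Length).
    public static int LcsLength(byte[] a, byte[] b)
    {
        int[] prev = new int[b.Length + 1];
        int[] curr = new int[b.Length + 1];
        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                curr[j] = a[i - 1] == b[j - 1]
                    ? prev[j - 1] + 1
                    : Math.Max(prev[j], curr[j - 1]);
            }
            // Swap rows; the old "prev" row is overwritten on the next pass.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.Length];
    }

    // Assumption: similarity = LCS length relative to the longer array.
    public static double MatchPercent(byte[] a, byte[] b)
    {
        int longer = Math.Max(a.Length, b.Length);
        if (longer == 0) return 100.0; // two empty arrays are identical
        return 100.0 * LcsLength(a, b) / longer;
    }
}
```

Note that an LCS still respects relative order, so it tolerates inserted or deleted bytes but not arbitrary shuffling. For two arrays of about 13,000 bytes the double loop runs roughly 169 million steps, which should be acceptable for a one-off comparison.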
Aleksa Krstic 24-Feb-12 14:09pm

That actually helped me a lot, gave me a couple of ideas. Thanks!


## Solution 1

You can use `SequenceEqual` for that.
http://msdn.microsoft.com/en-us/library/bb348567.aspx
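
A small sketch of how that could look, assuming `System.Linq` is available. The class and method names are only for illustration. `SequenceEqual` compares position by position, so the second helper sorts both arrays first to ignore order; either way the result is a plain true/false.

```csharp
using System;
using System.Linq;

public static class EqualityCheck
{
    // True only if both arrays have the same bytes in the same order.
    public static bool SameOrder(byte[] a, byte[] b)
    {
        return a.SequenceEqual(b);
    }

    // True if both arrays contain the same bytes, regardless of order.
    // Sorting copies are made by OrderBy; the inputs are not modified.
    public static bool AnyOrder(byte[] a, byte[] b)
    {
        return a.OrderBy(x => x).SequenceEqual(b.OrderBy(x => x));
    }
}
```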

Good luck!
Aleksa Krstic 24-Feb-12 14:08pm

Thank you, but I need something that returns a numerical value (e.g. 86% of one array matches the other one), not a boolean.

## Solution 2

Just an idea:
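```
// bucket[v] = (count of byte v in array1) - (count of byte v in array2)
int[] bucket = new int[256];
foreach (var b in array1)
{
    bucket[b]++;
}

foreach (var b in array2)
{
    bucket[b]--;
}

// Subtract each value's surplus from 100%. Dividing by the combined
// length keeps the result between 0 and 100: two arrays with no bytes
// in common differ by array1.Length + array2.Length counts in total.
double total = array1.Length + array2.Length; // assumed to be non-zero
double percent = 100;
for (int i = 0; i < 256; ++i)
{
    percent -= 100 * Math.Abs(bucket[i]) / total;
}

Console.WriteLine("array1 equals array2 to {0}%", percent);
```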

The array `bucket` tracks, for each possible byte value, the difference between how often that value occurs in `array1` and how often it occurs in `array2`.
A positive entry means the value occurs that many more times in `array1` than in `array2`; a negative entry means the other way around.
At the end you can calculate the percentage easily. (At least that's how it should work in my crazy brain.)
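
The same bucket idea can also be written as a self-contained method that counts the bytes the two arrays share. This is a variant sketch, not the snippet above: it keeps one histogram per array and sums the smaller count for every byte value. The names `ByteHistogram` and `OverlapPercent` are invented for this example, and dividing by the longer array's length is an assumption about what "percent match" should mean.

```csharp
using System;

public static class ByteHistogram
{
    // Order-independent similarity: how many bytes can be paired up
    // between a and b, relative to the longer of the two arrays.
    public static double OverlapPercent(byte[] a, byte[] b)
    {
        int longer = Math.Max(a.Length, b.Length);
        if (longer == 0) return 100.0; // two empty arrays are identical

        int[] countA = new int[256];
        int[] countB = new int[256];
        foreach (byte v in a) countA[v]++;
        foreach (byte v in b) countB[v]++;

        // A byte value contributes as often as it occurs in BOTH arrays.
        int shared = 0;
        for (int v = 0; v < 256; v++)
        {
            shared += Math.Min(countA[v], countB[v]);
        }

        return 100.0 * shared / longer;
    }
}
```

For example, `{ 1, 2, 3, 4 }` and `{ 4, 3, 2, 9 }` share three bytes out of four, so `OverlapPercent` returns 75. Both passes are linear, so 13,000 elements are no problem.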

## Solution 3

Assuming you have two arrays and you compare by position, then you might use the following:

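```
using System;
using System.Linq;

byte[] a = new byte[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
byte[] b = new byte[] { 0, 1, 2, 3, 3, 3, 6, 7 };

// Positions beyond the shorter array count as mismatches,
// so the percentage is taken relative to the longer length.
int m = Math.Max(a.Length, b.Length);

// Zip pairs the elements by position and stops at the end of the shorter array.
int c = a.Zip(b, (x, y) => x == y).Count(same => same);

Console.WriteLine("Match = {0} = {1}%", c, 100.0 * c / m);
```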

## Solution 4

It is not clear from your question what kind of comparison you are asking for.
Maybe you are interested in correlation coefficients as used in audio/image compression?
If so, the following document might help: Correlation in Statistics and in Data Compression.
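
In case it helps, here is a rough sketch of a Pearson correlation coefficient for two byte arrays. It is not taken from the document mentioned above; the names `ByteCorrelation` and `Pearson` are invented, and the method assumes both arrays have the same length and are compared position by position (so it does not handle reordered data).

```csharp
using System;

public static class ByteCorrelation
{
    // Pearson correlation coefficient in the range -1..1.
    // Returns NaN if either array is constant (zero variance).
    public static double Pearson(byte[] x, byte[] y)
    {
        if (x.Length != y.Length || x.Length == 0)
        {
            throw new ArgumentException("Arrays must be non-empty and of equal length.");
        }

        int n = x.Length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++)
        {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0, varX = 0, varY = 0;
        for (int i = 0; i < n; i++)
        {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }

        if (varX == 0 || varY == 0) return double.NaN;
        return cov / Math.Sqrt(varX * varY);
    }
}
```

A value near 1 means the two signals rise and fall together, a value near 0 means no linear relationship, and a value near -1 means one is roughly the inverse of the other. Whether that maps sensibly onto a "percent match" depends on what the bytes represent.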
