When I compare two files with a hash algorithm, I can only tell whether they are equal or not. I want a percentage showing how much of one file's content matches the other file.

My basic code looks like this:


C#
// Requires: using System.IO; using System.Security.Cryptography;
private void btnCompare_Click(object sender, EventArgs e)
{
    if (txtFile1.Text != "" && txtFile2.Text != "")
    {
        byte[] hash1;
        byte[] hash2;
        /* Calculate Hash. The using blocks dispose the hasher and close both
           files even if an exception is thrown. SHA256.Create() names the
           algorithm explicitly; the parameterless HashAlgorithm.Create() is
           not supported on newer .NET versions. */
        using (HashAlgorithm ha = SHA256.Create())
        using (FileStream f1 = new FileStream(txtFile1.Text, FileMode.Open, FileAccess.Read))
        using (FileStream f2 = new FileStream(txtFile2.Text, FileMode.Open, FileAccess.Read))
        {
            hash1 = ha.ComputeHash(f1);
            hash2 = ha.ComputeHash(f2);
        }
        /* Show Hash in TextBoxes */
        txtHash1.Text = BitConverter.ToString(hash1);
        txtHash2.Text = BitConverter.ToString(hash2);
        /* Compare the hash and Show Message box */
        if (txtHash1.Text == txtHash2.Text)
        {
            MessageBox.Show("Files are Equal !");
        }
        else
        {
            MessageBox.Show("Files are Different !");
        }
    }
}
private void btnSelectFile1_Click(object sender, EventArgs e)
{
   OpenFileDialog ofd = new OpenFileDialog();
   if (ofd.ShowDialog() == DialogResult.OK)
   {
        txtFile1.Text = ofd.FileName;
   }
}
private void btnSelectFile2_Click(object sender, EventArgs e)
{
   OpenFileDialog ofd = new OpenFileDialog();
   if (ofd.ShowDialog() == DialogResult.OK)
   {
        txtFile2.Text = ofd.FileName;
   }
}


This shows whether the files are equal or not, but I want the percentage of matching content.

Please help me.

Thank you.
Updated 26-Mar-15 1:05am

You cannot compare hash values and get a percentage: hashes don't work like that.
Hashes work by using maths to reduce the input data to a number. For example, a simple hash could add all the bytes together and discard any overflow:
01 + 02 + 03 gives a hash of 06
You can't use hashes to give you a "percentage difference", because this input gives a very similar hash value:
00 + 00 + 05 gives a hash of 05
The hash values are only one apart, but the two inputs share no common values at all!
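
To make the toy hash above concrete, here is a minimal sketch of it in C#. The class and method names (AdditiveHashDemo, AdditiveHash) are made up for illustration, and this is only the simplified additive hash from the example, not a real hashing algorithm.

```csharp
using System;

public static class AdditiveHashDemo
{
    // Toy hash from the example: add every byte together and discard any overflow.
    public static byte AdditiveHash(byte[] data)
    {
        int sum = 0;
        foreach (byte b in data)
        {
            sum = (sum + b) & 0xFF; // keep only the low 8 bits
        }
        return (byte)sum;
    }
}
```

AdditiveHash(new byte[] { 1, 2, 3 }) returns 6 and AdditiveHash(new byte[] { 0, 0, 5 }) returns 5, so the hashes are neighbours even though the inputs differ at every position.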

In order to give a percentage, you will have to manually compare each byte in the two files.
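
A manual byte comparison could be sketched roughly like this. The names (ByteCompare, PercentEqual) are hypothetical, and two choices here are assumptions rather than anything the question specifies: bytes are compared position by position, and the percentage is measured against the longer file, so any extra bytes count as mismatches.

```csharp
using System;
using System.IO;

public static class ByteCompare
{
    // Percentage of positions that hold the same byte in both streams,
    // measured against the longer stream. Two empty streams count as 100%.
    public static double PercentEqual(Stream a, Stream b)
    {
        long total = 0;
        long same = 0;
        while (true)
        {
            int x = a.ReadByte(); // -1 once the stream is exhausted
            int y = b.ReadByte();
            if (x == -1 && y == -1) break;
            total++;
            if (x == y) same++;
        }
        return total == 0 ? 100.0 : 100.0 * same / total;
    }

    // Convenience overload that opens both files read-only.
    public static double PercentEqual(string path1, string path2)
    {
        using (FileStream f1 = File.OpenRead(path1))
        using (FileStream f2 = File.OpenRead(path2))
        {
            return PercentEqual(f1, f2);
        }
    }
}
```

In the click handler from the question this could be called as ByteCompare.PercentEqual(txtFile1.Text, txtFile2.Text). Note that a positional comparison is sensitive to shifts: one byte inserted near the start of a file makes most of the following positions mismatch.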
 
A hashing algorithm won't help you at all. The only information you can get from a hash is whether two byte streams are different or (probably) identical.

Please explain exactly what your definition of "percentage matching" is. Yesterday I saw two completely different approaches from you, Levenshtein distance and byte-by-byte comparison, and I have no idea which one you mean. For example, what "percentage match" would you expect for the following?

File 1:
The quick brown fox jumps over the lazy dog.


File 2:
quick brown fox jumps over the lazy dog.
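
For reference, one possible definition is based on the Levenshtein distance mentioned above. This is only a sketch under assumptions: the names (TextSimilarity, PercentSimilar) are invented, the score is normalised by the length of the longer string, and both inputs are plain text already held in memory. For PDF files the text would first have to be extracted with a PDF library, which is not shown here.

```csharp
using System;

public static class TextSimilarity
{
    // Classic dynamic-programming Levenshtein distance, keeping only two rows.
    public static int Levenshtein(string s, string t)
    {
        int[] prev = new int[t.Length + 1];
        int[] curr = new int[t.Length + 1];
        for (int j = 0; j <= t.Length; j++) prev[j] = j;

        for (int i = 1; i <= s.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= t.Length; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                curr[j] = Math.Min(
                    Math.Min(curr[j - 1] + 1, prev[j] + 1), // insertion, deletion
                    prev[j - 1] + cost);                    // substitution or match
            }
            int[] tmp = prev; prev = curr; curr = tmp;      // reuse the row buffers
        }
        return prev[t.Length];
    }

    // Similarity as a percentage of the longer string's length (an assumed definition).
    public static double PercentSimilar(string s, string t)
    {
        int longest = Math.Max(s.Length, t.Length);
        if (longest == 0) return 100.0;
        return 100.0 * (longest - Levenshtein(s, t)) / longest;
    }
}
```

For the two example files, the distance is 4 (the four characters of "The " are deleted) and the longer text has 44 characters, so this definition gives roughly 90.9%.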
 
Comments
Krishna Veni 26-Mar-15 8:04am    
I now have two PDF files. I want to compare them, find how much of the content of one PDF matches the other, and express that as a percentage, i.e. percentage matching.
Sascha Lefèvre 26-Mar-15 8:09am    
I'm sorry, I don't understand :(
Please just tell me: For my example above, what should be the result of the content match? Something like 90% ?
Krishna Veni 26-Mar-15 8:33am    
1. For example, one PDF contains the sentence "Hi,how r u?" and another PDF contains the sentence "Hi,how r u?". When the two are compared, their content is the same, so the content match is 100%.

Another example:

2. One PDF contains the sentence "Hi,how r u?" and another PDF contains the sentence "Hi,how r u?fine". When the two are compared, their content is not the same because some characters differ, so the match is not 100%. In that case, compute the percentage of content that does match. I think the content match for these two PDFs may be about 75%.

I want this type of code. Do you understand?
Sascha Lefèvre 26-Mar-15 8:40am    
Yes, but to be certain, let's take another example:
"Hi, how r u?" and "Hi, how are u?"
Should that be about 80% ?
Krishna Veni 26-Mar-15 9:01am    
If you understand, please help me. I don't need an exact 100% match. I want to know how much of the content matches as a percentage, such as 80%, 90%, 75%, etc. If there are any approaches or links, please share them. How can this be achieved?

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


