I have tried some online conversion tools for converting different files to SHA256 hashes.

I have three questions:

1- When I convert an exe file to a SHA256 hash, what exactly does the converter do? Is the whole content of the file hashed, or just simple attributes such as name and size?

2- How can I do this in C#, for example reading exe files and then converting them into hashes?

3- If I convert infected files into hashes, can I use them to detect viruses?

I need to know this because I really need to learn how antiviruses really work.
Comments
Sergey Alexandrovich Kryukov 7-Feb-13 15:37pm    
1) What do you do, exactly?
—SA
Jackson Mackson 7-Feb-13 15:40pm    
I have tried some online conversion tools that convert executable files into hash codes. I'd like to know which part of the file is converted: the whole content, or just its name or size.
Sergey Alexandrovich Kryukov 7-Feb-13 16:17pm    
You don't need any with .NET. I answered all your questions in detail, please see.
The first two links answer your first question completely.
—SA
Jackson Mackson 7-Feb-13 16:19pm    
Thank you Sergey
Sergey Alexandrovich Kryukov 7-Feb-13 16:49pm    
My pleasure.
Good luck, call again.
—SA

Please see my comment. The first question needs clarification, but you can answer it yourself if you read the rest.

First of all, please see:
http://en.wikipedia.org/wiki/Cryptographic_hash_function[^],
http://en.wikipedia.org/wiki/Sha2[^].

They should explain what happens when you calculate the SHA-256 hash of anything.

As for programming hashes in C#, everything is already implemented for you in the .NET FCL. Please see: http://msdn.microsoft.com/en-us/library/system.security.cryptography.hashalgorithm.aspx[^].
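As a minimal sketch of what that looks like in practice (the class and method names here are made up for illustration, they are not part of the FCL), the following reads a file and computes SHA-256 over its full byte content. The file name, size and timestamps play no part in the result.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper class, not part of .NET.
static class FileHasher
{
    // Returns the SHA-256 digest of the file's entire content as lowercase hex.
    public static string Sha256OfFile(string path)
    {
        using (SHA256 sha = SHA256.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            // ComputeHash reads the stream to the end, so every byte contributes.
            byte[] digest = sha.ComputeHash(stream);
            return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

Calling FileHasher.Sha256OfFile on an exe returns a 64-character hex string. Renaming the file does not change that string, while changing any byte of its content does.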

Now, infected files. You don't "convert" anything; you just calculate the hash of the data.

No, you cannot use the cryptographic hash of the whole file to reliably detect viruses. If you read about the properties of cryptographic hash functions, you will clearly see why. The hash serves a different purpose: it helps to detect that an executable file has been changed. If you store the hash values of a set of executable files and regularly check that each newly calculated hash value matches the stored one, you have a pretty good guarantee that the files were not changed. The only legitimate reason for an executable file to change should be its upgrade or rebuild.
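The store-and-recheck idea can be sketched roughly as follows. This is only an illustration under simple assumptions: the baseline lives in an in-memory dictionary, and the names IntegrityCheck, TakeBaseline and FindChanged are hypothetical. A real tool would keep the baseline somewhere an attacker cannot overwrite.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper class, for illustration only.
static class IntegrityCheck
{
    static string Sha256OfFile(string path)
    {
        using (SHA256 sha = SHA256.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            return BitConverter.ToString(sha.ComputeHash(stream));
        }
    }

    // Records the current hash of every file in the list.
    public static Dictionary<string, string> TakeBaseline(IEnumerable<string> paths)
    {
        var baseline = new Dictionary<string, string>();
        foreach (string path in paths)
        {
            baseline[path] = Sha256OfFile(path);
        }
        return baseline;
    }

    // Returns the files that are missing or whose hash no longer matches the baseline.
    public static List<string> FindChanged(Dictionary<string, string> baseline)
    {
        var changed = new List<string>();
        foreach (KeyValuePair<string, string> entry in baseline)
        {
            if (!File.Exists(entry.Key) || Sha256OfFile(entry.Key) != entry.Value)
            {
                changed.Add(entry.Key);
            }
        }
        return changed;
    }
}
```

A non-empty result only says that a file changed. It says nothing about why it changed.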

It's infeasible to modify a file in such a way that it still produces the stored hash value and so cheats such a system; this follows from the same properties of the cryptographic hash function. A similar but more powerful approach is to use digital signatures on executable files, based on a very different technology, public-key cryptography. Please see:
http://en.wikipedia.org/wiki/Digital_signature[^],
http://en.wikipedia.org/wiki/Public-key_cryptography[^].

Among other things, a digital signature makes the files self-protected.
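To give a rough feel for the sign-and-verify principle, here is a generic sketch using the RSA class from System.Security.Cryptography. It is not how Windows signs executables (that is Authenticode, which embeds a certificate-based signature in the file); the class and method names are hypothetical, the key pair is generated on the spot, and the SignData/VerifyData overloads used here assume .NET Framework 4.6 or later, or .NET Core.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper class, for illustration only.
static class SignatureSketch
{
    // Signs 'original' with a freshly generated private key, then checks
    // whether 'received' still matches that signature.
    public static bool SignThenVerify(byte[] original, byte[] received)
    {
        using (RSA rsa = RSA.Create())
        {
            byte[] signature = rsa.SignData(
                original, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);

            // In real use only the public key would be available at this point.
            return rsa.VerifyData(
                received, signature, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
        }
    }
}
```

Verification succeeds only when the received bytes are identical to the signed ones, and someone without the private key cannot produce a new valid signature for altered content.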

Hash functions are actually used for detecting viruses by comparison, but that has nothing to do with what you tried to suggest. However, the approach based on a database of known virus signatures is not serious enough. You can consider it only as a first line of defense: it can detect the simplest viruses quickly, and that's it. "Serious" viruses are polymorphic/metamorphic:
http://en.wikipedia.org/wiki/Computer_virus[^].
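A hash-based first line of defense boils down to a set lookup, roughly like the sketch below. The names KnownHashLookup, Sha256Hex and IsKnownBad are hypothetical, and the set of known-bad hashes is assumed to come from some external database that is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Hypothetical helper class, for illustration only.
static class KnownHashLookup
{
    // Uppercase hex SHA-256 of a byte array.
    public static string Sha256Hex(byte[] data)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return BitConverter.ToString(sha.ComputeHash(data)).Replace("-", "");
        }
    }

    // True only if the exact bytes hash to an entry in the known-bad set.
    public static bool IsKnownBad(byte[] fileBytes, HashSet<string> knownBadHashes)
    {
        return knownBadHashes.Contains(Sha256Hex(fileBytes));
    }
}
```

The weakness is visible right in the lookup: a sample that differs from the recorded one by a single byte has a completely different hash and is not found. That is exactly what polymorphic and metamorphic viruses exploit.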

Digging into the complex field of virus detection would be a huge overkill, well beyond the normal format of this forum, and, I guess, beyond your goals.

—SA
 
 
Have a look at this for example

http://msdn.microsoft.com/en-us/library/system.security.cryptography.sha256.aspx[^]

It applies the SHA256 hash algorithm to the entire file (OK, the example there does it for all files in a directory). This answers points 1 and 2; point 3 is a lot tougher.

For point 3, 'maybe'. If you stored the hash of the file at time t0, recalculated it at some later time t1, compared the t0 hash to the t1 hash and found them different, then 'something has happened' to the file. Is this enough to say it was a virus? No, it's not. You then have to run your suspect file through a process looking for the 'signatures' of all known viruses ...

btw, you could have answered point 1 if you'd done some googling on the sha256 algorithm

'g'
 
 
1. A hash value should depend on all data in the file and should have an avalanche effect if any data is modified, i.e. a small change in the data should make a large change in the hash value.

2. Try e.g. Google. After one search I found a link with everything you are probably going to need for a long time when it comes to cryptography.

3. No, not for sure, but you can be pretty sure the file has been modified in some way or is corrupt if the hash value differs from the original one.
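The avalanche effect from point 1 is easy to observe. The sketch below (class and method names are hypothetical) hashes two inputs and counts how many of the 256 output bits differ. For inputs that differ in a single bit, the count typically comes out at roughly half of them.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper class, for illustration only.
static class AvalancheDemo
{
    // Counts the bit positions in which the SHA-256 digests of a and b differ.
    public static int DifferingBits(byte[] a, byte[] b)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hashA = sha.ComputeHash(a);
            byte[] hashB = sha.ComputeHash(b);

            int count = 0;
            for (int i = 0; i < hashA.Length; i++)
            {
                int diff = hashA[i] ^ hashB[i];   // set bits mark differing positions
                while (diff != 0)
                {
                    count += diff & 1;
                    diff >>= 1;
                }
            }
            return count;
        }
    }
}
```

For example, comparing new byte[] { 0x00 } with new byte[] { 0x01 } flips one input bit, yet the two digests look unrelated.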
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


