How to Write SHA256Sum in C# (or MD5Sum, SHA1Sum)

15 Jul 2009 | CPOL | 5 min read

Occasionally, you may need to create a file ‘fingerprint’ using one of the well-known, widely supported hash algorithms. The common ones are:

  • MD5 - Avoid it where security matters; it has known collision vulnerabilities.
  • RIPEMD160 - Supported by .NET, but not heavily used. SHA256 or SHA512 is recommended instead.
  • SHA1 - Avoid it if you can; use SHA256 or SHA512 instead.
  • SHA2 Family
    • SHA256
    • SHA384
    • SHA512

If you haven’t come across MD5Sum.exe, SHA1Sum.exe or SHA256Sum.exe, you can find native Windows ports here (or, if you are looking for the more official GNU versions, they can be found here). If you are just looking for the command-line tools, that is probably enough. However, sometimes you may need to do all this work yourself in C#; if so, this is the article that should guide you!

First of all, this is going to be fairly simple, as the .NET library supports all of the above hash algorithms, so all we are really doing is showing the best way to use the supplied .NET classes to perform your hashing. So on to the magic (note: to avoid width formatting issues, this isn't exactly how I normally format the code!):

C#
// Requires: using System.IO; using System.Security.Cryptography;
//           using System.Text;

/// <summary>
/// Performs the SHA1 Hash function on file
/// </summary>
/// <param name="filename">
/// The filename to be hashed.
/// </param>
/// <returns>
/// SHA1 Hash value associated with the file
/// </returns>
public static string SHA1HashFile(string filename)
{
   // Create our SHA1 provider; 'using' releases it when we are done
   using (SHA1CryptoServiceProvider hashAlgorithm = new SHA1CryptoServiceProvider())
   {
      // Hash the data from the file (the whole file is read into memory first)
      byte[] hashedData = hashAlgorithm.ComputeHash(File.ReadAllBytes(filename));

      // Convert each byte of the hash into two lowercase hex digits
      StringBuilder hashedValue = new StringBuilder(hashedData.Length * 2);
      foreach (byte b in hashedData)
      {
         hashedValue.Append(b.ToString("x2"));
      }

      // Return the hashed value to the caller
      return hashedValue.ToString();
   }
}

This does the SHA1 hashing of the supplied file, and the hash value matches the one printed by the GNU version of SHA1Sum.exe. I told you it was simple :-) . Dissecting this code should be pretty trivial:

  1. Create the SHA1 Service provider (System.Security.Cryptography.SHA1CryptoServiceProvider)
  2. Call ComputeHash passing in a byte array.
  3. Take the result and output it as text. In case it wasn't obvious, the result of the hash is a binary blob, hence the need to format it into a string-friendly representation (two hex digits per byte).
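
The three steps above can be checked against a known value. The following is a minimal sketch, not part of the article's download: the class and method names are hypothetical, and it uses SHA1.Create() rather than constructing SHA1CryptoServiceProvider directly. It hashes an in-memory string so the result can be compared with the published SHA-1 test vector for "abc".

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashStepsDemo
{
    // Hypothetical helper: hashes a short ASCII string so the three steps
    // can be verified against a well-known SHA-1 test vector.
    public static string Sha1Hex(string text)
    {
        // Step 1: create the hash provider.
        using (SHA1 sha1 = SHA1.Create())
        {
            // Step 2: ComputeHash takes a byte array and returns the raw digest.
            byte[] digest = sha1.ComputeHash(Encoding.ASCII.GetBytes(text));

            // Step 3: format each byte as two lowercase hex digits.
            StringBuilder hex = new StringBuilder(digest.Length * 2);
            foreach (byte b in digest)
            {
                hex.Append(b.ToString("x2"));
            }
            return hex.ToString();
        }
    }
}
```

Calling HashStepsDemo.Sha1Hex("abc") should return a9993e364706816aba3e25717850c26c9cd0d89d, the standard SHA-1 test vector for that input.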

However, there are really two problems with this code:

  1. File.ReadAllBytes - This returns the file's contents as a byte array, pretty much as you would expect! The problem is that for a very big file, for example a DVD image, the entire file has to be loaded into memory before it gets hashed. Obviously, not the most optimal approach!
  2. This is completely locked into SHA1; you need a new function for every other hashing algorithm. Not a biggie, but it would be nice to get some reuse in now and then. It is definitely useful in the full example, where we need some fallback processing if an algorithm is not available.

Thankfully, fixing this is still pretty trivial. To fix issue 1, rather than using the ComputeHash overload that takes a byte array, use the one that takes a stream. This avoids the need to have the entire file in memory before the hash process can start. Out of curiosity, I looked up the publicly available source code for the function to check that it was in fact doing what I thought. It is simple and obvious:

C#
...
// Default the buffer size to 4K.
byte[] buffer = new byte[4096];
int bytesRead;
do {
   bytesRead = inputStream.Read(buffer, 0, 4096);
   if (bytesRead > 0) {
      HashCore(buffer, 0, bytesRead);
   }
} while (bytesRead > 0);
...

So we can see that the stream version, HashAlgorithm.ComputeHash(Stream), only uses a small 4K buffer while calculating the hash. So big files are no longer at risk of killing the application.
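
The same chunk-by-chunk idea can be reproduced with the public API. The following is a rough sketch, assuming you want to drive the loop yourself (for example, to report progress); the class name, method name and bufferSize parameter are hypothetical. It feeds each chunk to TransformBlock and finishes with TransformFinalBlock, which should produce the same digest as ComputeHash(Stream).

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class ChunkedHashDemo
{
    // Hypothetical helper: hashes a stream one buffer at a time, so memory
    // use stays at bufferSize bytes no matter how large the input is.
    public static byte[] HashInChunks(Stream input, HashAlgorithm algorithm, int bufferSize)
    {
        byte[] buffer = new byte[bufferSize];
        int bytesRead;
        while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Feed this chunk into the running hash state.
            // No output copy is needed, so the output buffer is null.
            algorithm.TransformBlock(buffer, 0, bytesRead, null, 0);
        }

        // Finish with an empty final block, then read the completed digest.
        algorithm.TransformFinalBlock(buffer, 0, 0);
        return algorithm.Hash;
    }
}
```

For plain file hashing, ComputeHash(Stream) is simpler and does the same job; a manual loop like this only earns its keep when you need to do something between chunks.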

Issue 2 - The .NET team did a nice job of creating base classes, one of which is HashAlgorithm. This is the abstract class that defines the hashing ‘interface’, and all .NET hash algorithms derive from it. So we can use this to our advantage:

C#
// Requires: using System; using System.IO;
//           using System.Security.Cryptography; using System.Text;

/// <summary>
/// Performs a Hash operation on the supplied file.
/// </summary>
/// <param name="filename">
/// The filename to be hashed.
/// </param>
/// <param name="hashAlgorithm">
/// The hash algorithm to apply; the caller owns and disposes it.
/// </param>
/// <returns>
/// Selected Hash value associated with the file
/// </returns>
public static string HashFile(
          string filename
          , HashAlgorithm hashAlgorithm)
{
   if (hashAlgorithm == null)
   {
      throw new ArgumentNullException("hashAlgorithm");
   }

   if (!File.Exists(filename))
   {
      throw new ArgumentException(filename + " must exist", "filename");
   }

   byte[] hashedData;

   // Open the file as a stream so it is hashed in small chunks
   using (FileStream fs = File.Open(filename, FileMode.Open, FileAccess.Read))
   {
      hashedData = hashAlgorithm.ComputeHash(fs);
   }

   // Convert each byte of the hash into two lowercase hex digits
   StringBuilder hashedValue = new StringBuilder(hashedData.Length * 2);
   foreach (byte b in hashedData)
   {
      hashedValue.Append(b.ToString("x2"));
   }

   // Return the hashed value
   return hashedValue.ToString();
}

Now we have a generic function that returns a hash value from any supported .NET hash algorithm. To call it, you could just do “HashFile(filename, new SHA1CryptoServiceProvider())”. Voila! You get good performance and can trivially support any hashing class the .NET Framework implements.
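
To illustrate the reuse, here is a self-contained sketch that runs one file through three algorithms. It assumes a condensed restatement of HashFile (repeated only so the sample compiles on its own); the demo class and the HashAbcThreeWays method are hypothetical, and the Create() factory methods stand in for constructing specific provider classes.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class HashFileUsageDemo
{
    // Condensed restatement of the article's HashFile, without argument checks.
    public static string HashFile(string filename, HashAlgorithm hashAlgorithm)
    {
        using (FileStream fs = File.OpenRead(filename))
        {
            byte[] hashedData = hashAlgorithm.ComputeHash(fs);
            // BitConverter yields "AB-CD-..."; strip the dashes and lowercase it.
            return BitConverter.ToString(hashedData).Replace("-", "").ToLowerInvariant();
        }
    }

    // Hypothetical demo: writes the three bytes "abc" to a temp file and
    // returns its MD5, SHA1 and SHA256 hashes, in that order.
    public static string[] HashAbcThreeWays()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, new byte[] { 0x61, 0x62, 0x63 });

            using (MD5 md5 = MD5.Create())
            using (SHA1 sha1 = SHA1.Create())
            using (SHA256 sha256 = SHA256.Create())
            {
                return new string[]
                {
                    HashFile(path, md5),
                    HashFile(path, sha1),
                    HashFile(path, sha256)
                };
            }
        }
        finally
        {
            // Always clean up the temp file.
            File.Delete(path);
        }
    }
}
```

The same HashFile call serves all three algorithms; only the HashAlgorithm instance passed in changes.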

OK, so let's get a little more adventurous now. Attached to this entry is a very simple (aka not fully featured) HashSum source code that allows the same executable to provide all of the above hashing! However, as a word of caution, you need to be a little careful with the more advanced hashing. For example, Microsoft supplies both “SHA256CryptoServiceProvider” and “SHA256Managed”. On the surface, they look pretty much the same, except that SHA256CryptoServiceProvider is only available in .NET 3.5 (or higher). However, since it uses the operating system's cryptographic service providers (CSPs), the algorithm may not be available on the platform your program is running on. If that is the case, a PlatformNotSupportedException is thrown:

System.PlatformNotSupportedException was unhandled
  Message="The specified cryptographic algorithm is not supported on this platform."
  Source="System.Core"
  StackTrace:
       at System.Security.Cryptography.CapiNative.AcquireCsp
       (String keyContainer, String providerName, ProviderType providerType, 
       CryptAcquireContextFlags flags, Boolean throwPlatformException)
       at System.Security.Cryptography.CapiHashAlgorithm..ctor
       (String provider, ProviderType providerType, AlgorithmId algorithm)
       at System.Security.Cryptography.SHA256CryptoServiceProvider..ctor()
       at CSharpHacker.Hash.HashSum.Main(String[] args) in X:\GIT\FileHash\FileSum.cs:line 76
       at System.AppDomain._nExecuteAssembly(Assembly assembly, String[] args)
       at System.AppDomain.ExecuteAssembly
       (String assemblyFile, Evidence assemblySecurity, String[] args)
       at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
       at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
       at System.Threading.ExecutionContext.Run
       (ExecutionContext executionContext, ContextCallback callback, Object state)
       at System.Threading.ThreadHelper.ThreadStart()
  InnerException:

Hmmm - you don’t read about that little chestnut in the MSDN help! So if you know you are only going to run this on Windows Server 2008, Vista, Windows 7 or later, you can just use the “SHA256CryptoServiceProvider” version. However, if you may have to support systems such as XP (and potentially Windows 2003 as well), you will have to use the managed versions. The safest route is to provide a graceful fallback mechanism (possibly with a warning): if the CSP version cannot be used, use the managed code version instead. This gives the best of both worlds: if the platform supports the CSP version, you use that (which should give you a speed increase); otherwise, you use the managed solution.

C#
case "SHA256SUM":
default:
   try
   {
      hashAlgorithm = new SHA256CryptoServiceProvider();
   }
   catch (PlatformNotSupportedException)
   {
      // Fall back to the managed version if the CSP
      // is not supported on this platform.
      hashAlgorithm = new SHA256Managed();
   }
   break;
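
The fallback in the fragment above could also be factored into a reusable helper, including the optional warning. This is only a sketch under assumptions: the class name, method name and warning text are hypothetical and not taken from the attached project. Func&lt;T&gt; requires .NET 3.5, which the project already targets.

```csharp
using System;
using System.Security.Cryptography;

public static class HashFallbackDemo
{
    // Hypothetical helper: tries the preferred provider first and falls
    // back to the second one if the platform does not support it.
    public static HashAlgorithm CreateWithFallback(
        Func<HashAlgorithm> preferred,
        Func<HashAlgorithm> fallback)
    {
        try
        {
            return preferred();
        }
        catch (PlatformNotSupportedException)
        {
            // Warn on stderr so the hash output on stdout stays clean.
            Console.Error.WriteLine(
                "Warning: preferred hash provider is not supported here; using the fallback.");
            return fallback();
        }
    }
}
```

With this in place, the case body would reduce to something like hashAlgorithm = HashFallbackDemo.CreateWithFallback(() => new SHA256CryptoServiceProvider(), () => new SHA256Managed());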

This is a second benefit of the “HashAlgorithm” approach: the code responsible for the generic hashing function doesn't need to know which implementation (or even which algorithm) it is using.

You also have to bear in mind that if you are going to use “SHA256CryptoServiceProvider” (or equivalent), you have to be using .NET 3.5 or greater.

The attached code is a .NET 3.5 project that picks the algorithm based on the executable's file name. So if you rename FileSum.exe to “SHA512Sum.exe”, it will perform the SHA512 hash on the input file; as MD5Sum.exe, the MD5 hash; and so on. If the name is not recognized, it defaults to the SHA256 algorithm. This is not designed to be a wholesale replacement for SHA256Sum, etc., but more of a guide to how to write a fully featured version. So things missing from this include (and are not limited to :-) ):

  • No support for wildcards (simple enough to add but not there)
  • No support for checking files match the input file (‘-c’ or ‘--check’)
  • Only binary mode is supported; there is no support for ASCII/text mode (‘-t’ or ‘--text’)
  • No support for standard input processing
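
The name-based selection described above might look roughly like the following. This is a hedged sketch rather than the attached project's actual code: the class and method names are hypothetical, only the "SHA256SUM" default case is taken from the fragment shown earlier, and the Create() factories are used in place of specific provider classes.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class AlgorithmByNameDemo
{
    // Hypothetical helper: maps an executable name such as "SHA512Sum.exe"
    // to a hash algorithm, defaulting to SHA256 for unrecognized names.
    public static HashAlgorithm FromExecutableName(string executablePath)
    {
        // Drop the directory and extension, then compare case-insensitively.
        string name = Path.GetFileNameWithoutExtension(executablePath).ToUpperInvariant();

        switch (name)
        {
            case "MD5SUM":
                return MD5.Create();
            case "SHA1SUM":
                return SHA1.Create();
            case "SHA384SUM":
                return SHA384.Create();
            case "SHA512SUM":
                return SHA512.Create();
            case "SHA256SUM":
            default:
                return SHA256.Create();
        }
    }
}
```

One way to obtain the name at run time is Environment.GetCommandLineArgs()[0]; the result can then be passed straight to HashFile.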

Hope you found this useful.

Gareth

 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
United States
I'm Gareth, a guy who loves software! My day job is working for a retail company, where I am involved in a large-scale C# project that processes large amounts of data into upstream data repositories.

My work rule of thumb is that everyone spends much more time working than not, so you better enjoy what you do!

Needless to say - I'm having a blast.

Have fun,

Gareth
