CHash 1.5 - An MFC hashing class

8 Aug 2005
An MFC implementation of hashing files and strings with CRC32, GOST-Hash, MD2, MD4, MD5, SHA-1 and SHA-2 (256/384/512).

Introduction

A few times I've seen standalone implementations of individual hashing algorithms, and thought it might be a good idea to group them together in an easy-to-use class. Well, here it is.

What are hashes anyway?

A hash is a fixed-length value, usually displayed as a string of hexadecimal digits, that represents an arbitrary amount of data. Hashes are one-way: you cannot go from a hash back to the original data, and because every hash produced by a given algorithm has the same length, you cannot determine the size of the data it represents either. This makes hashes useful for security purposes as well as for integrity checks.

Why use hashes?

Hashes have many uses, the main one being data integrity. For example, a P2P client can hash a file once the download completes to check that it is not corrupt or "fake". By generating a hash of a file and comparing it against a known hash, you can check whether the two files are the same.

Using the code

Putting CHash into use is relatively simple.

The main functions that will be used are:

  • DoHash
  • SetHashAlgorithm
  • SetHashFile
  • SetHashOperation
  • SetHashString

An example of hashing a string with MD5:

// Define a CHash object
CHash hashObj;

// Set the algorithm
hashObj.SetHashAlgorithm(MD5);

// Set the operation
hashObj.SetHashOperation(STRING_HASH);

// Set the string
hashObj.SetHashString("String to hash");

// Hash the string
CString outHash = hashObj.DoHash();

An example of hashing a file with SHA-1:

// Define a CHash object
CHash hashObj;

// Set the algorithm
hashObj.SetHashAlgorithm(SHA1);

// Set the operation
hashObj.SetHashOperation(FILE_HASH);

// Set the file
hashObj.SetHashFile("C:\\Windows\\Explorer.exe");

// Hash the file
CString outHash = hashObj.DoHash();

The code is the same for every algorithm except SHA-2, which has one extra function, SetSHA2Strength. It takes a single parameter, the strength of the hash, which can be 256, 384 or 512.

An example usage of this is:

// Define a CHash object
CHash hashObj;

// Set the operation
hashObj.SetHashOperation(FILE_HASH);

// Set the algorithm
hashObj.SetHashAlgorithm(SHA2);

// Set the SHA-2 strength
hashObj.SetSHA2Strength(256);

// Set the file
hashObj.SetHashFile("C:\\Windows\\Explorer.exe");

// Hash the file
CString outHash = hashObj.DoHash();
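
The strength is the digest size in bits, so it also determines how long the resulting hash string is. A small sketch of that arithmetic follows; sha2HexLength is a hypothetical helper, not part of CHash, and it assumes the hash is printed as hex digits without spaces.

```cpp
#include <stdexcept>

// Hypothetical helper: number of hex digits in a SHA-2 digest of the given
// strength, assuming the "no spaces" output style.
int sha2HexLength(int strengthBits) {
    if (strengthBits != 256 && strengthBits != 384 && strengthBits != 512) {
        throw std::invalid_argument("SHA-2 strength must be 256, 384 or 512");
    }
    return strengthBits / 4;  // each hex digit encodes 4 bits
}
```

With the strength set to 256, as in the example above, the hash would therefore be 64 hex digits long.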

Choosing which algorithms to include

In version 1.5, I made the class modular, so you can include or exclude specific algorithms (cutting down on unnecessary code if you don't want certain ones). To choose which ones you want, open CHash.h and find:

// Choose which algorithms you want
// Put 1s to support algorithms, else 0 to not support

Under here you will find defines such as:

#define        SUPPORT_CRC32          1

Change the 1s to 0s if you wish to exclude an algorithm.
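
A minimal sketch of how such a switch can gate code at compile time. Only SUPPORT_CRC32 appears in the text above; SUPPORT_MD5 and supportedAlgorithms are hypothetical names used for illustration, and the real CHash.h may be organised differently.

```cpp
#include <string>
#include <vector>

#define SUPPORT_CRC32 1
#define SUPPORT_MD5   0  // hypothetical name, set to 0 to show an excluded algorithm

// Only the sections whose switch is 1 are compiled; the rest never reach the binary.
std::vector<std::string> supportedAlgorithms() {
    std::vector<std::string> names;
#if SUPPORT_CRC32
    names.push_back("CRC32");
#endif
#if SUPPORT_MD5
    names.push_back("MD5");
#endif
    return names;
}
```

Because the preprocessor removes the excluded sections before compilation, code for an unwanted algorithm adds nothing to the final executable.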

Hashing styles

In version 1.2, I added hashing styles. This allows the programmer to customize the output hashes. There are four styles:

  • Lowercase, no spaces:
    b4df98798c02b7c7a500d18632bf5b7d
  • Lowercase, spaces:
    b4 df 98 79 8c 02 b7 c7 a5 00 d1 86 32 bf 5b 7d
  • Uppercase, no spaces:
    B4DF98798C02B7C7A500D18632BF5B7D
  • Uppercase, spaces:
    B4 DF 98 79 8C 02 B7 C7 A5 00 D1 86 32 BF 5B 7D

These can be set with SetHashFormat().
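
To illustrate what the four styles amount to, here is a sketch that formats raw digest bytes in standard C++. formatDigest is a hypothetical function, not the implementation behind SetHashFormat(), whose parameters are not shown in this article.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Render digest bytes as hex, in upper or lower case, optionally with a
// single space between bytes.
std::string formatDigest(const std::vector<unsigned char>& digest,
                         bool uppercase, bool spaced) {
    const char* digits = uppercase ? "0123456789ABCDEF" : "0123456789abcdef";
    std::string out;
    out.reserve(digest.size() * 3);
    for (std::size_t i = 0; i < digest.size(); ++i) {
        if (spaced && i != 0) {
            out += ' ';
        }
        out += digits[digest[i] >> 4];    // high nibble
        out += digits[digest[i] & 0x0F];  // low nibble
    }
    return out;
}
```

The two boolean flags cover all four combinations listed above.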

Acknowledgements

  • Thanks to Dominik Reichl for his excellent ReHash program.
  • Thanks to Markku-Juhani O. Saarinen for his GOST-Hash implementation.
  • Thanks to Dr. Brian Gladman for the SHA-1/SHA-2 implementations.

History

  • 1.0
    • 1st May, 2005: First public release.
  • 1.1
    • 2nd May, 2005: Added SetHashAlgorithm() and DoHash() as recommended.
  • 1.2
    • 3rd May, 2005:
      • Added hashing styles.
      • Added the GOST-Hash algorithm.
      • Added GetHashAlgorithm().
      • Rewrote the hashing functions to be more efficient.
  • 1.3
    • 4th May, 2005: Updated the demo project.
  • 1.5
    • 8th May, 2005:
      • Added CRC32.
      • Made the class "modular" so that you can exclude algorithms from compilation.
      • Added GetHashFormat().
      • Updated the code as suggested.
  • 1.6
    • 3rd August, 2005:
      • Fixed a memory leak.
      • Made each hash more efficient, tidied up code in general.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.



    Comments and Discussions

     
    Re: Newbie on hashes
    StrayJay, 12-Aug-05 9:54
    Anonymous wrote:
    NO!
    it was already proven by some mathematician that the MDx algorithms can produce dups.


    You don't have to be a mathematician to prove that. ;-) A quick elaboration of the previous poster's remark 'Could it be that any file of any size be reduced to just 32 hex digits (*)?', for absolute newbies:

    Consider the following set of sentences:

    I think you're crazy.
    I think you think you're crazy.
    I think you think I think you're crazy.

    ...etc.

    It's clear that you can expand this set literally indefinitely by inserting 'you think' and 'I think' into the latest sentence in your set.

    However, a 128-bit hash 'only' has a finite number (2-to-the-power-of-128, or roughly 340,000,000,000,000,000,000,000,000,000,000,000,000) of different values. Since the above set of sentences has infinitely many members, if you were to calculate a hash for each sentence, at some point two of the sentences will be assigned the same hash. This is called a collision. If you were able to create sentences indefinitely (make sure you do this in a portable environment, as the Sun will stop shining in about 6 billion years), and calculate hashes for each, you should even find that each possible hash is assigned to an infinite number of sentences.

    Compare this to owning ten pigeon houses, and a smaller number of pigeons. You can buy new pigeons, and the pigeons can procreate. As long as you own ten pigeons or fewer, each of the pigeons can have its own house. But as soon as you get to own more than ten pigeons, you'll have to make pigeons share their houses. (The pigeon houses represent the hash codes, and the pigeons represent the sentences.)
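
The pigeonhole argument can be tried out on a small scale. The sketch below uses a deliberately tiny 8-bit hash (FNV-1a folded down to one byte, which is not one of the algorithms in CHash) so that there are only 256 "pigeon houses". All function names are hypothetical. Among any 257 distinct sentences from the series above, at least two must share a hash value.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Deliberately tiny hash: 32-bit FNV-1a folded to 8 bits, so only 256 values exist.
std::uint8_t tinyHash(const std::string& text) {
    std::uint32_t h = 2166136261u;
    for (unsigned char c : text) {
        h ^= c;
        h *= 16777619u;
    }
    return static_cast<std::uint8_t>(h ^ (h >> 8) ^ (h >> 16) ^ (h >> 24));
}

// n = 0: "I think you're crazy."
// n = 1: "I think you think you're crazy."
// n = 2: "I think you think I think you're crazy."  ...and so on.
std::string sentence(std::size_t n) {
    std::string s = "I think ";
    for (std::size_t i = 0; i < n; ++i) {
        s += (i % 2 == 0) ? "you think " : "I think ";
    }
    return s + "you're crazy.";
}

// Indices of the first two different sentences that share a hash value.
// With 256 possible values the loop must stop by n = 256 at the latest.
std::pair<std::size_t, std::size_t> firstCollision() {
    std::unordered_map<std::uint8_t, std::size_t> seen;
    for (std::size_t n = 0;; ++n) {
        const std::uint8_t h = tinyHash(sentence(n));
        const auto it = seen.find(h);
        if (it != seen.end()) {
            return {it->second, n};
        }
        seen[h] = n;
    }
}
```

A real 128-bit hash behaves the same way in principle; there are just vastly more houses, so a collision between honestly generated inputs is extremely unlikely to turn up by chance.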

    How comfortable you can be using a particular well-known hashing algorithm depends entirely on what you want to use it for:

    If you have a collection of, say, hundreds of thousands of pictures of beautiful ladies (I had to stretch my imagination to come up with that example) and you want to be able to check whether a picture you find online is already in your collection, then you can create a database of hashes based on any well-known algorithm. If two pictures are assigned the same hash you can be practically certain that they are in fact the same picture, since an accidental collision is extremely unlikely. Do note that two pictures may look the same, but as their file contents differ, they will be assigned different hashes.

    If, on the other hand, you want to use the hash to determine access rights to some valuable resource (e.g., access to Mr Bill Gates' bank account) that possibly brilliant minds would love to put some effort into getting their hands on, you should spend some more time investigating what algorithm to use. MD5 would be a bad choice, and SHA might be a viable option. You'd probably still have to hope that the NSA (the U.S. National Security Agency) doesn't employ greedy mathematicians, though!



    (*) Note that 32 hex digits equals 128 bits.