
CRC_32

By PJ Arends, 9 Oct 2001
 

Introduction

I recently needed to calculate a CRC-32 value for some very large files, and I wanted a progress bar to show how far the calculation had got. I found an algorithm at CreateWindow.com and modified it to suit my needs. The result is the CRC_32 class, which is defined in the CRC_32.h and CRC_32.cpp files included in the demo project.

Testing

This class has not been tested with UNICODE builds, nor has it been tested on a network with UNC paths. If you find and fix any bugs, please drop me a note at pja@telus.net.

Known Problem

If the CRC-32 calculations are run in a separate thread, and the thread is terminated prematurely, a memory leak occurs.

The CRC-32 Algorithm

The first step in calculating the CRC-32 value for a data object (a file or any in-memory data buffer) is to set up the lookup table. The table consists of 256 unique 32-bit values, one for each possible byte value (0x00 to 0xFF). The table can be declared as a static table in the source code, or it can be built dynamically at run time. I chose to build the table in the CRC_32 class constructor:

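CRC_32::CRC_32()
{
    // This is the official polynomial used by CRC-32 
    // in PKZip, WinZip and Ethernet. 
    ULONG ulPolynomial = 0x04C11DB7;

    // One table entry for each possible byte value (0x00 to 0xFF).
    for (int i = 0; i <= 0xFF; i++)
    {
        // Start with the bit-reversed byte in the top 8 bits.
        Table[i] = Reflect(i, 8) << 24;

        // Process 8 bits, most significant bit first.
        // 1UL avoids shifting a signed int into its sign bit.
        for (int j = 0; j < 8; j++)
            Table[i] = (Table[i] << 1) ^ ((Table[i] & (1UL << 31)) ? ulPolynomial : 0);

        // Reverse the bit order of the finished 32-bit entry.
        Table[i] = Reflect(Table[i], 32);
    }
}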

ULONG CRC_32::Reflect(ULONG ref, char ch)
{
    ULONG value = 0;
    // Reverse the order of the lowest ch bits:
    // swap bit 0 for bit 7, bit 1 for bit 6, etc.
    for (int i = 1; i < (ch + 1); i++)
    {
        if (ref & 1)
            value |= 1UL << (ch - i);   // 1UL keeps the shift unsigned when ch is 32
        ref >>= 1;
    }
    return value;
}
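
For comparison, many CRC-32 implementations avoid the Reflect() step by shifting right and using the bit-reversed polynomial 0xEDB88320. The sketch below is a portable illustration, not the code from CRC_32.cpp: the names ReflectBits, BuildTableWithReflect and BuildReflectedTable are hypothetical, and std::uint32_t is assumed in place of ULONG. It builds the table both ways so the results can be compared.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: reverse the order of the lowest `bits` bits of `value`.
std::uint32_t ReflectBits(std::uint32_t value, int bits)
{
    assert(bits >= 1 && bits <= 32);
    std::uint32_t result = 0;
    for (int i = 0; i < bits; ++i)
    {
        if (value & 1u)
            result |= std::uint32_t{1} << (bits - 1 - i);
        value >>= 1;
    }
    return result;
}

// Table built the way the constructor above does it:
// most significant bit first with 0x04C11DB7, reflected on the way in and out.
std::array<std::uint32_t, 256> BuildTableWithReflect()
{
    const std::uint32_t polynomial = 0x04C11DB7u;
    std::array<std::uint32_t, 256> table{};
    for (std::uint32_t i = 0; i < 256; ++i)
    {
        std::uint32_t entry = ReflectBits(i, 8) << 24;
        for (int j = 0; j < 8; ++j)
            entry = (entry << 1) ^ ((entry & 0x80000000u) ? polynomial : 0u);
        table[i] = ReflectBits(entry, 32);
    }
    return table;
}

// Table built least significant bit first with the bit-reversed polynomial.
std::array<std::uint32_t, 256> BuildReflectedTable()
{
    const std::uint32_t reversedPolynomial = 0xEDB88320u;
    std::array<std::uint32_t, 256> table{};
    for (std::uint32_t i = 0; i < 256; ++i)
    {
        std::uint32_t entry = i;
        for (int j = 0; j < 8; ++j)
            entry = (entry >> 1) ^ ((entry & 1u) ? reversedPolynomial : 0u);
        table[i] = entry;
    }
    return table;
}
```

Both functions should produce the same 256 entries, because 0xEDB88320 is 0x04C11DB7 with its 32 bits in reverse order. Swapping one constant for the other without also changing the shift direction yields a different, non-standard CRC.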

Now that the lookup table has been initialized, it can be used to calculate the CRC-32 value of some data by passing the data through the Calculate() function.

void CRC_32::Calculate(const LPBYTE buffer, UINT size, ULONG &CRC)
{   // calculate the CRC
    LPBYTE pbyte = buffer;

    while(size--)
        CRC = (CRC >> 8) ^ Table[(CRC & 0xFF) ^ *pbyte++];
}

The initial value of the CRC is set to 0xFFFFFFFF and passed through the Calculate() function. The result is then XORed with 0xFFFFFFFF to produce the CRC-32 value for the data:

DWORD CRC = 0xFFFFFFFF;
Calculate ((LPBYTE)buffer, size, CRC);
return CRC ^ 0xFFFFFFFF;
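
The steps above can be checked against the standard CRC-32 test string "123456789", whose published check value is 0xCBF43926. The following is a minimal portable sketch rather than the CRC_32 class itself: the names Crc32Table, Crc32Update and Crc32 are hypothetical, std::uint32_t is assumed in place of ULONG, and the table is built with the bit-reversed polynomial 0xEDB88320 instead of calling Reflect().

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: build the 256-entry lookup table once, on first use.
const std::array<std::uint32_t, 256>& Crc32Table()
{
    static const std::array<std::uint32_t, 256> table = [] {
        std::array<std::uint32_t, 256> t{};
        for (std::uint32_t i = 0; i < 256; ++i)
        {
            std::uint32_t entry = i;
            for (int j = 0; j < 8; ++j)
                entry = (entry >> 1) ^ ((entry & 1u) ? 0xEDB88320u : 0u);
            t[i] = entry;
        }
        return t;
    }();
    return table;
}

// Same update step as Calculate(): fold each byte into the running value.
std::uint32_t Crc32Update(std::uint32_t crc, const unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    const auto& table = Crc32Table();
    while (size--)
        crc = (crc >> 8) ^ table[(crc & 0xFFu) ^ *data++];
    return crc;
}

// Initialize to 0xFFFFFFFF, process the data, then XOR with 0xFFFFFFFF.
std::uint32_t Crc32(const unsigned char* data, std::size_t size)
{
    return Crc32Update(0xFFFFFFFFu, data, size) ^ 0xFFFFFFFFu;
}
```

Because Crc32Update takes the running value and returns it, the data can be fed in several pieces and still give the same result as a single pass. That property is what makes a progress display possible for large files.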

User Functions

CRC_32::CRC_32()

Constructs the CRC_32 class object.

Parameters :

None.

Returns :

Nothing.

DWORD CRC_32::CalcCRC(LPVOID buffer, UINT size, HWND ProgressWnd/*= NULL*/)

DWORD CRC_32::CalcCRC(LPCTSTR FileName, HWND ProgressWnd/*= NULL*/)

Calculates the CRC-32 value for the given buffer or file.

Parameters :

buffer [in] : a pointer to the data bytes.

size [in] : the size of the buffer.

FileName [in] : the complete path to the file.

ProgressWnd [in] : the HWND of the progress bar.

Returns :

  • The CRC-32 value of the buffer or file if the ProgressWnd is not a window.
  • The HANDLE of the created thread if the ProgressWnd is a window.
  • NULL if an error occurs.

Note :

ProgressWnd is passed through the IsWindow() API function. If IsWindow() returns zero, CalcCRC() will calculate the CRC-32 value directly. If IsWindow() returns nonzero, CalcCRC() will start a separate thread. The thread will send PBM_* progress bar messages to the ProgressWnd. When the thread is finished, the thread will send a WM_CRC_THREAD_DONE message to the parent window of the ProgressWnd (usually a dialog window).
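
To report progress, the worker has to process the data in pieces and update the display between pieces. The sketch below shows only that idea in portable C++. It is not the thread code from CRC_32.cpp: it reads from a std::istream in chunks and calls a hypothetical callback where the real class would send PBM_* messages. The names Crc32OfStream, ProgressFn and chunkSize are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this on in-memory data
#include <vector>

// Hypothetical callback: receives the total number of bytes processed so far.
using ProgressFn = std::function<void(std::uint64_t)>;

std::uint32_t Crc32OfStream(std::istream& in, std::size_t chunkSize, const ProgressFn& progress)
{
    assert(chunkSize > 0);

    // Build the lookup table (bit-reversed polynomial 0xEDB88320).
    std::uint32_t table[256];
    for (std::uint32_t i = 0; i < 256; ++i)
    {
        std::uint32_t entry = i;
        for (int j = 0; j < 8; ++j)
            entry = (entry >> 1) ^ ((entry & 1u) ? 0xEDB88320u : 0u);
        table[i] = entry;
    }

    std::vector<char> buffer(chunkSize);
    std::uint32_t crc = 0xFFFFFFFFu;
    std::uint64_t done = 0;

    for (;;)
    {
        // Read up to one chunk; gcount() says how many bytes actually arrived.
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::size_t got = static_cast<std::size_t>(in.gcount());
        if (got == 0)
            break;

        for (std::size_t k = 0; k < got; ++k)
            crc = (crc >> 8) ^ table[(crc & 0xFFu) ^ static_cast<unsigned char>(buffer[k])];

        done += got;
        if (progress)
            progress(done);   // stand-in for updating the progress bar
    }
    return crc ^ 0xFFFFFFFFu;
}
```

A caller that knows the file size can turn the byte count passed to the callback into a percentage. Opening a std::ifstream in binary mode and passing it as the stream would be the usual way to apply this to a file.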

WM_CRC_THREAD_DONE Message

The WM_CRC_THREAD_DONE message is sent to the parent window of the progress bar window passed to the CalcCRC() function when the thread has finished executing.

  • Thread = (HANDLE)wParam
  • CRC32 = (ULONG)lParam

Thread is the HANDLE of the thread that sent the WM_CRC_THREAD_DONE message. CRC32 is the CRC-32 value of the data passed into the thread. If CRC32 is zero, an error occurred.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

PJ Arends
President
Canada
Member
No Biography provided


Comments and Discussions

 
Question: How I known which polynomial is used to calculate CRC | member Redhouane_KM | 3 Mar '07 - 8:10
Hi.
 
- My file is 7812 bytes long and every byte is set to 0x00.
Its CRC = 0xDC1D64FB, but with your program CRC = 0x20D947C4.
- How can I find out which polynomial is used to calculate this CRC,
so that I obtain CRC = 0xDC1D64FB and not 0xE6B38EB5?
 
N.B.: I switched between these polynomials:
Polynomial = 0x04C11DB7 then CRC = 0x20D947C4
Polynomial = 0xEDB88320 then CRC = 0xFBA7848B
Polynomial = 0x???????? then CRC = 0xDC1D64FB
 
Thanks.
General: Need calcul polynomial | member Redhouane_KM | 2 Feb '07 - 9:36
I have a file and its CRC32. How can I work out which polynomial was used?
 
thank...
General: Re: Need calcul polynomial | member PJ Arends | 3 Feb '07 - 11:53
Umm...what?
 
Sorry, but your question is not very clear.
 
To calculate the CRC32 of a file using this class:
CRC_32 MyCRC;
DWORD CRC32_of_file = MyCRC.CalcCRC(Path_to_File);

 

You may be right
I may be crazy
-- Billy Joel --


Within you lies the power for good, use it!!!


General: warnings | member Warren D Stevens | 18 Jan '06 - 6:12
In Visual Studio 2003, I get two warnings:
 
warning C4311: 'type cast' : pointer truncation from 'HANDLE' to 'DWORD'
on the lines:
return (DWORD)Handle
 
It seems like the class should use DWORD_PTR instead of DWORD
in a number of places, to avoid porting issues, when pointer size changes
(i.e. 64-bit compiles)
 
Warren
General: Re: warnings | member PJ Arends | 18 Jan '06 - 9:22
You are probably correct, and I am sure there are more compatibility issues in the code with the newer compilers. When I get my copy of VS2005 (I am waiting for CP to sell it) I will have to go through all my articles and update the code.
 
Thanks for the heads up :)
 


"You're obviously a superstar." - Christian Graus about me - 12 Feb '03
 
"Obviously ???  You're definitely a superstar!!!" - mYkel - 21 Jun '04
 
"There's not enough blatant self-congratulatory backslapping in the world today..." - HumblePie - 21 Jun '05
 
Within you lies the power for good - Use it!
General: Thread Termination | member Blake Miller | 10 Apr '03 - 6:12
It is possible that the thread leaks memory if terminated prematurely for at least two reasons.
 
1. You allocate memory that is not freed unless the thread exits gracefully (the obvious answer).
 
2. You use ::CreateThread instead of _beginthreadex. If you read about ::CreateThread versus _beginthreadex, you will see that there are warnings against using C run-time library functions unless your thread is started with _beginthreadex. This might contribute to your problem. I did not investigate further (the not-so-obvious answer).

 
C++/MFC/InstallShield since 1993
General: Hashing URLs | suss Anonymous | 8 Sep '02 - 23:33
Hi, I've been wondering whether it would be reliable to use CRC32 for a huge hash table of URLs.
What is the probability that two entries would have the same CRC?
 
Thanks.
:)
General: Re: Hashing URLs | member PJ Arends | 8 Sep '02 - 23:51
From what I have read, CRC32 is fine for integrity checking of files/data, but it should not be used for much else as the chance of duplicates is too high. You would be better off using the MD5 algorithm or one of the SHA algorithms.
 


CPUA 0x5041
 
Sonork 100.11743 Chicken Little
 
"So it can now be written in stone as a testament to humanities achievments "PJ did Pi at CP"." Colin Davies
 
Within you lies the power for good - Use it!
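
As a rough illustration of the duplicate risk raised in the reply above, the birthday approximation p ≈ 1 − exp(−n(n−1)/2^33) estimates the chance that at least two of n URLs share the same 32-bit value. The sketch assumes that CRC-32 values of distinct URLs behave like uniformly random 32-bit numbers, which is an idealization, and the function name CollisionProbability32 is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Approximate probability that at least two of `count` distinct inputs map to
// the same 32-bit value, assuming the values are uniformly distributed.
double CollisionProbability32(std::uint64_t count)
{
    const double n = static_cast<double>(count);
    const double buckets = 4294967296.0;              // 2^32 possible values
    const double x = n * (n - 1.0) / (2.0 * buckets); // expected number of colliding pairs
    const double p = -std::expm1(-x);                 // 1 - exp(-x), accurate for small x
    assert(p >= 0.0 && p <= 1.0);
    return p;
}
```

Under that assumption the probability of at least one duplicate reaches about one half at roughly 77,000 entries, and is very close to 1 for a million entries.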
General: Bad checksum algorithm | member C-J Berg | 10 Oct '01 - 7:27
You say that you needed to calculate CRC-32 for _very large_ files, so I thought it would be in order to advise against this. CRC32 is, as you say, a 32-bit checksum of a message. If you calculate the probability of an error slipping through undetected for a large file (i.e., the probability that an erroneous file will have the same checksum as the original), you will notice that it's rather high. (I'm not presenting any calculations here, but it's rather basic mathematics, so I figure you can do it yourself.)
 
Instead, aim for a message digest algorithm. For instance, the well-known MD5 algorithm generates a 128-bit fingerprint, and that is a sufficiently large checksum to prevent virtually any error from going by undetected. There are several MD algorithms that generate even larger fingerprints, e.g. SHA, but those aren't needed when you only need to detect random errors in files.

Last Updated 10 Oct 2001
Article Copyright 2001 by PJ Arends