
CrcStream stream checksum calculator

By Rei Miyasaka, 8 Oct 2005
 

Introduction

CRC (Cyclic Redundancy Check) is commonly used to confirm that a file was not corrupted during download. While convenient, running the check after the download takes some time, because the data has to be read back off the disk. It would be better if applications computed the CRC on the fly during the download, putting otherwise idle CPU time to use and avoiding the extra disk read.

Downloading is done at a relatively leisurely pace (typically anywhere between 5-300kb/s) and over a long period of time, so it makes for a great opportunity to process data without impeding performance. Although ugly and impractical for most applications (it'd be safe to assume that most users think they've "broken the intarweb" when they see a hex number), displaying the CRC to the user immediately as a download finishes can often be a well-appreciated bonus.

This class passively calculates CRCs as data passes through it, ready to be used at any time.
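
To make "on the fly" concrete, here is a minimal sketch of the idea. It is not the CrcStream source: the names Crc32Sketch, Update and ComputeInChunks are made up for this illustration, and it assumes the common reflected CRC-32 polynomial 0xEDB88320 with an initial value and final XOR of 0xFFFFFFFF.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of the article's download): a table-driven
// CRC-32 whose running value can be updated one chunk at a time.
static class Crc32Sketch
{
    static readonly uint[] Table = BuildTable();

    static uint[] BuildTable()
    {
        const uint poly = 0xEDB88320; // assumed: reflected CRC-32 polynomial
        var table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint crc = i;
            for (int bit = 0; bit < 8; bit++)
                crc = (crc & 1) != 0 ? (crc >> 1) ^ poly : crc >> 1;
            table[i] = crc;
        }
        return table;
    }

    // Folds `count` bytes into a running (not yet finalized) CRC value.
    public static uint Update(uint crc, byte[] buffer, int offset, int count)
    {
        for (int i = offset; i < offset + count; i++)
            crc = (crc >> 8) ^ Table[(crc ^ buffer[i]) & 0xFF];
        return crc;
    }

    // Feeds the data in pieces of `chunkSize` bytes, as a download would.
    public static uint ComputeInChunks(byte[] data, int chunkSize)
    {
        uint crc = 0xFFFFFFFF;                      // initial value
        for (int pos = 0; pos < data.Length; pos += chunkSize)
            crc = Update(crc, data, pos, Math.Min(chunkSize, data.Length - pos));
        return crc ^ 0xFFFFFFFF;                    // final XOR
    }
}

static class Program
{
    static void Main()
    {
        byte[] data = Encoding.ASCII.GetBytes("123456789");
        // Standard CRC-32 check value for "123456789"
        Console.WriteLine(Crc32Sketch.ComputeInChunks(data, 4).ToString("X8")); // CBF43926
        Console.WriteLine(Crc32Sketch.ComputeInChunks(data, 9).ToString("X8")); // CBF43926
    }
}
```

Because the running value carries over between calls, feeding the data in pieces of any size gives the same result as one pass over the whole buffer.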

Using the code

To calculate the CRC of a file as it is read to the end, create a new CrcStream passing the FileStream as an argument, and use the ReadCrc property to retrieve the CRC. Be sure to use the new CrcStream instead of the file stream to read from the file; otherwise the checksum will not be calculated.

//Open a file stream, encapsulate it in CrcStream
FileStream file = new FileStream(filename, FileMode.Open);
CrcStream stream = new CrcStream(file);

//Use the file somehow -- in this case, read it as a string
StreamReader reader = new StreamReader(stream);
string text = reader.ReadToEnd();

//Print the checksum
Console.WriteLine("CRC: " + stream.ReadCrc.ToString("X8"));

There are four public members in addition to the abstract Stream overrides:

  • ReadCrc - gets the checksum of the data that was read through the stream.
  • WriteCrc - gets the checksum of the data that was written to the stream.
  • ResetChecksum - resets the CRC values.
  • Stream - gets the encapsulated stream.
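
As a rough illustration of the write side and of ResetChecksum, the sketch below wraps a MemoryStream in a stripped-down stand-in called CrcStreamSketch. That name is hypothetical, and the class only assumes the members listed above plus the usual CRC-32 constants; it is not the downloadable CrcStream source.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-in for CrcStream: forwards every call to the wrapped
// stream and folds the bytes that pass through into two running CRC-32 values.
class CrcStreamSketch : Stream
{
    static readonly uint[] Table = BuildTable();
    readonly Stream inner;
    uint readCrc = 0xFFFFFFFF;
    uint writeCrc = 0xFFFFFFFF;

    public CrcStreamSketch(Stream inner) { this.inner = inner; }

    public uint ReadCrc { get { return readCrc ^ 0xFFFFFFFF; } }
    public uint WriteCrc { get { return writeCrc ^ 0xFFFFFFFF; } }
    public void ResetChecksum() { readCrc = 0xFFFFFFFF; writeCrc = 0xFFFFFFFF; }

    public override bool CanRead { get { return inner.CanRead; } }
    public override bool CanSeek { get { return inner.CanSeek; } }
    public override bool CanWrite { get { return inner.CanWrite; } }
    public override long Length { get { return inner.Length; } }
    public override long Position
    {
        get { return inner.Position; }
        set { inner.Position = value; }
    }
    public override void Flush() { inner.Flush(); }
    public override long Seek(long offset, SeekOrigin origin) { return inner.Seek(offset, origin); }
    public override void SetLength(long value) { inner.SetLength(value); }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int n = inner.Read(buffer, offset, count);
        readCrc = Update(readCrc, buffer, offset, n);   // only the bytes actually read
        return n;
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        inner.Write(buffer, offset, count);
        writeCrc = Update(writeCrc, buffer, offset, count);
    }

    static uint Update(uint crc, byte[] buffer, int offset, int count)
    {
        for (int i = offset; i < offset + count; i++)
            crc = (crc >> 8) ^ Table[(crc ^ buffer[i]) & 0xFF];
        return crc;
    }

    static uint[] BuildTable()
    {
        var table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint crc = i;
            for (int bit = 0; bit < 8; bit++)
                crc = (crc & 1) != 0 ? (crc >> 1) ^ 0xEDB88320 : crc >> 1;
            table[i] = crc;
        }
        return table;
    }
}

static class Program
{
    static void Main()
    {
        var stream = new CrcStreamSketch(new MemoryStream());
        byte[] data = Encoding.ASCII.GetBytes("123456789");

        // Writing through the wrapper updates WriteCrc only.
        stream.Write(data, 0, data.Length);
        Console.WriteLine("WriteCrc: " + stream.WriteCrc.ToString("X8"));    // WriteCrc: CBF43926

        // Resetting starts both checksums over.
        stream.ResetChecksum();
        Console.WriteLine("After reset: " + stream.WriteCrc.ToString("X8")); // After reset: 00000000

        // Reading the same bytes back in small pieces updates ReadCrc.
        stream.Position = 0;
        var buffer = new byte[4];
        while (stream.Read(buffer, 0, buffer.Length) != 0) { }
        Console.WriteLine("ReadCrc: " + stream.ReadCrc.ToString("X8"));      // ReadCrc: CBF43926
    }
}
```

Keeping separate read and write values means one wrapper can sit on a stream that is both written and read back, and each direction still reports its own CRC.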

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Rei Miyasaka
Canada
Member
The cows are here to take me home now...


Comments and Discussions

 
General: Excellent! (member JonKristian, 31 Mar '11 - 22:56)
Thanks for this gem! This helped me a LOT and saved my weekend! :)

Keep on truckin'!
General: My vote of 5 (member drago69, 7 Mar '11 - 20:59)
This is exactly what I was looking for. Thank you!!
General: My vote of 5 (member ChewsHumans, 6 Sep '10 - 12:00)
I haven't tested this yet, but if it works, it's just what I'm looking for. Cheers!
General: Re: My vote of 5 (member ChewsHumans, 6 Sep '10 - 13:11)
Yep, it works fine! Thanks very much!
News: CrcStream in VB.net (member Mike1953, 7 Nov '08 - 11:16)
Great job! I don't know if this is the right place for this, but I needed it in VB.net, so I rewrote it.
Code is below.

Mike

Imports System
Imports System.Collections.Generic
Imports System.Text
Imports System.IO

Namespace crcCheckSum

    ''' <summary>
    ''' Encapsulates a <see cref="System.IO.Stream" /> to calculate the CRC32 checksum on-the-fly as data passes through.
    ''' </summary>
    Public Class CrcStream
        Inherits Stream

        Private Shared table As UInteger() = GenerateTable()
        Private m_stream As Stream
        Private m_readCrc As UInteger = 4294967295
        Private m_writeCrc As UInteger = 4294967295

        ''' <summary>
        ''' Encapsulate a <see cref="System.IO.Stream" />.
        ''' </summary>
        ''' <param name="stream">The stream to calculate the checksum for.</param>
        Public Sub New(ByVal stream As Stream)
            Me.m_stream = stream
        End Sub

        ''' <summary>
        ''' Gets the underlying stream.
        ''' </summary>
        Public ReadOnly Property Stream() As Stream
            Get
                Return m_stream
            End Get
        End Property

        Public Overloads Overrides ReadOnly Property CanRead() As Boolean
            Get
                Return m_stream.CanRead
            End Get
        End Property

        Public Overloads Overrides ReadOnly Property CanSeek() As Boolean
            Get
                Return m_stream.CanSeek
            End Get
        End Property

        Public Overloads Overrides ReadOnly Property CanWrite() As Boolean
            Get
                Return m_stream.CanWrite
            End Get
        End Property

        Public Overloads Overrides Sub Flush()
            m_stream.Flush()
        End Sub
 
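        Public Overloads Overrides ReadOnly Property Length() As Long
            Get
                ' Forward to the wrapped stream instead of returning the default 0
                Return m_stream.Length
            End Get
        End Property

        Public Overloads Overrides Property Position() As Long
            Get
                ' Forward to the wrapped stream instead of returning the default 0
                Return m_stream.Position
            End Get
            Set(ByVal value As Long)
                m_stream.Position = value
            End Set
        End Property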
 
        Public Overloads Overrides Function Seek(ByVal offset As Long, ByVal origin As SeekOrigin) As Long
            Return m_stream.Seek(offset, origin)
        End Function

        Public Overloads Overrides Sub SetLength(ByVal value As Long)
            m_stream.SetLength(value)
        End Sub

        Public Overloads Overrides Function Read(ByVal buffer As Byte(), ByVal offset As Integer, ByVal count As Integer) As Integer
            count = m_stream.Read(buffer, offset, count)
            m_readCrc = CalculateCrc(m_readCrc, buffer, offset, count)
            Return count
        End Function

        Public Overloads Overrides Sub Write(ByVal buffer As Byte(), ByVal offset As Integer, ByVal count As Integer)
            m_stream.Write(buffer, offset, count)
            m_writeCrc = CalculateCrc(m_writeCrc, buffer, offset, count)
        End Sub

        Private Function CalculateCrc(ByVal crc As UInteger, ByVal buffer As Byte(), ByVal offset As Integer, ByVal count As Integer) As UInteger
            Dim i As Integer = offset, [end] As Integer = offset + count
            While i < [end]
                crc = (crc >> 8) Xor table((crc Xor buffer(i)) And &HFF)
                i += 1
            End While
            Return crc
        End Function

        Private Shared Function GenerateTable() As UInteger()
            Dim table As UInteger() = New UInteger(255) {}
            Dim crc As UInteger
            Const poly As UInteger = 3988292384
            For i As UInteger = 0 To table.Length - 1
                crc = i
                For j As Integer = 8 To 1 Step -1
                    If (crc And 1) = 1 Then
                        crc = (crc >> 1)
                        crc = crc Xor poly
                    Else
                        crc >>= 1
                    End If
                Next
                table(i) = crc
            Next
            Return table
        End Function
 
        ''' <summary>
        ''' Gets the CRC checksum of the data that was read by the stream thus far.
        ''' </summary>
        Public ReadOnly Property ReadCrc() As UInteger
            Get
                ' Final XOR restored with a UInteger literal (UI suffix), so the result is the standard CRC-32 value
                Return m_readCrc Xor &HFFFFFFFFUI
            End Get
        End Property

        ''' <summary>
        ''' Gets the CRC checksum of the data that was written to the stream thus far.
        ''' </summary>
        Public ReadOnly Property WriteCrc() As UInteger
            Get
                ' Final XOR restored with a UInteger literal (UI suffix), so the result is the standard CRC-32 value
                Return m_writeCrc Xor &HFFFFFFFFUI
            End Get
        End Property

        ''' <summary>
        ''' Resets the read and write checksums.
        ''' </summary>
        Public Sub ResetChecksum()
            m_readCrc = 4294967295
            m_writeCrc = 4294967295
        End Sub
    End Class
End Namespace
Question: How does it handle big files? (member DrDtieltirjt, 5 Mar '08 - 2:27)
How does it handle big files?
Answer: Re: How does it handle big files? (member reinux, 5 Mar '08 - 2:35)
It works really well with big files, especially if you're already reading or writing them for other purposes. The main idea of this class is that everything is done on-the-fly, thus getting rid of any significant overhead and wait times.
General: Re: How does it handle big files? (member DrDtieltirjt, 13 Mar '08 - 1:31)
Hello again,
since ReadToEnd produces an out of memory exception, one must use the code in another way, but how?

Regards
General: Re: How does it handle big files? (member reinux, 13 Mar '08 - 10:05)
You're getting an OutOfMemoryException because you're reading in a huge stream all at once.
 
You have to read it in a little bit at a time, like this:
 
byte[] buffer = new byte[4096];
int length;
 
while((length = buffer.Read(buffer, 0, buffer.Length)) != 0)
{
   // Do whatever you need to do in here
}
 
Console.WriteLine("CRC: " + stream.ReadCrc.ToString("X8"));
 
The CRC is calculated each time you call Read, but it won't be the CRC of the complete file until you've read the entire file.
General: Re: How does it handle big files? (member Christian Loft, 16 Mar '08 - 21:55)
you are right. thx
General: Re: How does it handle big files? (member Buzz Weetman, 22 Apr '08 - 5:01)
I think you buffer'ed when you should have stream'ed:
 
while((length = stream.Read(buffer, 0, buffer.Length)) != 0)
General: Re: How does it handle big files? (member reinux, 22 Apr '08 - 9:31)
Oops, right you are.
 
Thanks.
General: Yet another thank you (member zimmerware, 24 May '07 - 10:49)
:-D
General: Small Enhancement (member nerd_biker, 20 Mar '07 - 2:37)
Hi,
 
Great bit of code and very useful.
 
I wanted to be able to read sections of a file and calculate the CRC for only that part, so I added a ReadLine method. I thought I'd post it here in case anyone else finds it useful. It's a bit of a hack but it works!
public string ReadLine()
{
    StringBuilder sb = new StringBuilder();
    int b;
    b = ReadByte();
    while (b >= 0 && b != '\n' && b != '\r')
    {
        sb.Append((char)b);
        b = ReadByte();
    }
    if (b == -1) // End of file reached
    {
        return null;
    }
    else 
    {
        // Discard any EOL characters.
        int nextChar = ReadByte();
        if (nextChar != '\n' && nextChar != '\r')
        {
            Seek(-1, SeekOrigin.Current);
        }
        return sb.ToString();
    }
}
 
The file I am checking contains blocks of text, separated with blank lines. I want a CRC of each block so I use the ReadLine method to read up to the next blank line, get the checksum, reset the CRC by calling ResetChecksum() and continue reading the file to the next blank line.
 
Anthony
 
----
I have always wished that my computer would be as easy to use as my telephone. My wish has come true. I no longer know how to use my telephone.
-Bjarne Stroustrup

General: Re: Small Enhancement (member reinux, 20 Mar '07 - 11:12)
Cool, thanks!
Question: Adapted Version (member FernandoNunes, 30 Mar '06 - 2:45)
Hi mate,
 
Very nice work On-The-Fly CRC and with the polynomial.
 
With your class, we can check any type of stream, but if the targets are always files, is there a reason why we can't derive it from FileStream and implement the CRC on it directly?

That avoids using a FileStream to acquire the Stream and then encapsulating that Stream in your own class.
I've tested these changes, but I can't get any performance improvement :doh:

But can you check whether this is valid?
Here's the adaptation:
 

class FileStreamWithCRC : FileStream
{
    private uint _readCRC = unchecked(0xFFFFFFFF);
    private uint _writeCRC = unchecked(0xFFFFFFFF);

    private static uint[] GenerateTable()
    {
        unchecked
        {
            uint[] table = new uint[256];

            uint crc;
            const uint poly = 0xEDB88320;
            for (uint i = 0; i < table.Length; i++)
            {
                crc = i;
                for (int j = 8; j > 0; j--)
                {
                    if ((crc & 1) == 1)
                        crc = (crc >> 1) ^ poly;
                    else
                        crc >>= 1;
                }
                table[i] = crc;
            }

            return table;
        }
    }

    private static uint[] table = GenerateTable();

    public uint ReadCRC
    {
        get { return unchecked(this._readCRC ^ 0xFFFFFFFF); }
    }

    public uint WriteCRC
    {
        get { return unchecked(this._writeCRC ^ 0xFFFFFFFF); }
    }

    public FileStreamWithCRC(String filePath, FileMode fileMode, FileAccess fileAccess, FileShare fileShare)
        : base(filePath, fileMode, fileAccess, fileShare)
    {
    }

    // Insert more constructors if needed
    //public FileStreamWithCRC(String filePath, FileMode fileMode) : base(filePath, fileMode)
    //{
    //}

    uint CalculateCRC(uint crc, byte[] buffer, int offset, int count)
    {
        unchecked
        {
            for (int i = offset, end = offset + count; i < end; i++)
                crc = (crc >> 8) ^ table[(crc ^ buffer[i]) & 0xFF];
        }
        return crc;
    }

    public void ResetChecksum()
    {
        this._readCRC = unchecked(0xFFFFFFFF);
        this._writeCRC = unchecked(0xFFFFFFFF);
    }

    public override int Read(byte[] array, int offset, int count)
    {
        count = base.Read(array, offset, count);
        this._readCRC = CalculateCRC(this._readCRC, array, offset, count);
        return count;
    }

    public override void Write(byte[] array, int offset, int count)
    {
        base.Write(array, offset, count);

        this._writeCRC = CalculateCRC(this._writeCRC, array, offset, count);
    }
}
 


Answer: Re: Adapted Version (member reinux, 31 Mar '06 - 6:49)
Cool :-D
 
The code works fine as far as I can tell. I just gave it a quick run through reading a file. Definitely a lot more convenient if you know you're going to be dealing with a file rather than some other type of stream.
 
A couple reasons it won't affect performance:
1. When you override a member of a class, internally it does the same thing as when you call two methods one after another and pass parameters along -- just that it does that bookkeeping and method calling for you automatically. Even if it does make a slight difference, it'd only be a couple dozen CPU cycles per call at most. Each call takes probably at least a few million cycles, so the difference is immeasurable.
2. On an even broader scale, the time it takes to load a file off the disk generally dwarfs the time it takes to calculate the CRC (modern hard drives are that slow). CRC performance matters a bit more only after the first time you read a file, because Windows then caches the file in memory.
 
Thanks!
General: Fantastic! (member Emma Burrows, 15 Mar '06 - 23:57)
I was looking for some simple code to plug into my application and calculate checksums and this is exactly what I wanted! Thanks for sharing.
General: Re: Fantastic! (member reinux, 16 Mar '06 - 0:00)
:)
Question: checksum? (member Christoph Ruegg, 8 Oct '05 - 13:39)
thanks for your contribution! Just a detail: strictly speaking, CRC is not a checksum algorithm. While checksums (used in internet protocols like IPv4) are one class of error detection codes, CRC is another one (used in LAN protocols like Ethernet). Parity checks form yet another such class...
Answer: Yep, checksum (member reinux, 8 Oct '05 - 14:11)
Hmm... what is it called then?
 
From what I know browsing around, it's called a checksum algorithm.
 
Here's the definition at Wikipedia:
 
"A cyclic redundancy check (CRC) is a type of hash function used to produce a checksum, which is a small number of bits, from a large block of data, such as a packet of network traffic or a block of a computer file, in order to detect errors in transmission or storage. "
General: Re: Yep, checksum (member Christoph Ruegg, 8 Oct '05 - 15:01)
Thinking about it, I don't remember a better name for the generated bits. I checked some technical papers: those bits are often called "CRC bits" or just the "CRC". Like you mentioned, "checksum" is very popular, too, maybe due to the lack of a better name. Again Wikipedia, Checksum: "This article is about checksums calculated using addition. The term 'checksum' is sometimes used in a more general sense to refer to any kind of redundancy check."

As everyone understands what you mean by the term "checksum": no matter, ignore my post above ;) (don't want to be captious)

btw: I'd call CRC an error detection code.
General: Re: Yep, checksum (member reinux, 9 Oct '05 - 21:02)
:)

Last Updated 8 Oct 2005
Article Copyright 2005 by Rei Miyasaka
Everything else Copyright © CodeProject, 1999-2013