C# RIFF Parser

6 Jun 2005
Decode Resource Interchange Files (AVI, WAV, RMID...) using this pure C# parser.

[Screenshot: RiffParserDemo2 Windows App demo]

[Screenshot: RiffParserDemo Console App demo]

Introduction

Ever wondered how to access the information in an AVI file, or wanted to extract information directly from a WAV file? Most people rely on external COM objects (like AVIFIL32.DLL) to access the information (see A Simple C# Wrapper for the AviFile Library). The RIFF Parser lets you access the resource information stored in these files directly from C#.

Background

What is a RIFF file?

RIFF is the Resource Interchange File Format, a general-purpose format for exchanging multimedia data that was defined by Microsoft and IBM during their long-forgotten alliance.

RIFF File Format

A RIFF file consists of a RIFF header followed by zero or more lists (of chunks and other lists) and chunks (of data). For a specific example, see the description of an AVI RIFF form, below:

The RIFF header has the following form:

'RIFF' fileSize fileType (data)

where 'RIFF' is the literal FourCC code 'RIFF', fileSize is a four byte value giving the size of the data in the file, and fileType is a FourCC that identifies the specific file type. The value of fileSize includes the size of the fileType FourCC plus the size of the data that follows, but does not include the size of the 'RIFF' FourCC or the size of fileSize. The file data consists of chunks and lists, in any order.
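
To make the header layout concrete, here is a minimal sketch that reads the 12 header bytes from any stream. It is not the RiffParser implementation; the class name RiffHeaderSketch and the method TryReadHeader are hypothetical names chosen for illustration, and the sample bytes are hand-built.

```csharp
using System;
using System.IO;
using System.Text;

static class RiffHeaderSketch
{
    // Reads the 12-byte RIFF header: 'RIFF', fileSize, fileType.
    // Returns false if the stream is too short or does not start with 'RIFF'.
    public static bool TryReadHeader(Stream stream, out int fileSize, out string fileType)
    {
        fileSize = 0;
        fileType = null;

        byte[] header = new byte[12];
        int read = 0;
        while (read < header.Length)
        {
            int n = stream.Read(header, read, header.Length - read);
            if (n == 0) return false;   // stream ended early
            read += n;
        }

        if (Encoding.ASCII.GetString(header, 0, 4) != "RIFF") return false;

        // fileSize is stored little-endian; assemble it byte by byte so the
        // result does not depend on the byte order of the machine.
        fileSize = header[4] | (header[5] << 8) | (header[6] << 16) | (header[7] << 24);
        fileType = Encoding.ASCII.GetString(header, 8, 4);
        return true;
    }

    static void Main()
    {
        // Hand-built header: 'RIFF', size 4 (just the fileType), 'WAVE'.
        byte[] bytes = { 0x52, 0x49, 0x46, 0x46, 4, 0, 0, 0, 0x57, 0x41, 0x56, 0x45 };
        int size;
        string type;
        if (TryReadHeader(new MemoryStream(bytes), out size, out type))
            Console.WriteLine("type=" + type + " size=" + size);
    }
}
```

Because fileSize counts the fileType FourCC, the number of bytes of chunks and lists that follow the header is fileSize - 4.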

A chunk has the following form:

ckID ckSize ckData

where ckID is a FourCC that identifies the data contained in the chunk, ckSize is a four byte value giving the size of the data in ckData, and ckData is zero or more bytes of data. The data is always padded to the nearest WORD boundary. ckSize gives the size of the valid data in the chunk; it does not include the padding, the size of ckID, or the size of ckSize.
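
The padding rule is easy to get wrong, so here is a small sketch of the arithmetic. The class and method names (ChunkPaddingSketch, PaddedSize, BytesOnDisk) are hypothetical and are not part of the RiffParser source.

```csharp
using System;

static class ChunkPaddingSketch
{
    // Round a chunk's data size up to the next even number (WORD boundary).
    public static int PaddedSize(int ckSize)
    {
        return ckSize + (ckSize & 1);
    }

    // Total bytes a chunk occupies in the file:
    // ckID (4 bytes) + ckSize (4 bytes) + padded data.
    public static int BytesOnDisk(int ckSize)
    {
        return 8 + PaddedSize(ckSize);
    }

    static void Main()
    {
        Console.WriteLine(PaddedSize(5));   // 6
        Console.WriteLine(BytesOnDisk(5));  // 14
        Console.WriteLine(BytesOnDisk(4));  // 12
    }
}
```

A reader that wants to move to the next element after a chunk must therefore advance by the padded size, not by ckSize.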

A list has the following form:

'LIST' listSize listType listData

where 'LIST' is the literal FourCC code 'LIST', listSize is a four byte value giving the size of the list, listType is a FourCC code, and listData consists of chunks or lists, in any order. The value of listSize includes the size of listType plus the size of listData; it does not include the 'LIST' FourCC or the size of listSize.

FourCCs

A FourCC (four-character code) is a 32-bit unsigned integer created by concatenating four ASCII characters. For example, the FourCC 'abcd' is represented on a Little-Endian system as 0x64636261. FourCCs can contain space characters, so ' abc' is a valid FourCC. The RIFF file format uses FourCC codes to identify stream types, data chunks, index entries, and other information.
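
As an illustration of that byte layout, the following sketch packs four ASCII characters into an int and unpacks them again. It only mirrors the behaviour described above; the class FourCCSketch is a hypothetical stand-in and may differ from how RiffParser implements its own conversion methods.

```csharp
using System;

static class FourCCSketch
{
    // Pack four ASCII characters into an int, first character in the lowest byte.
    public static int ToFourCC(string code)
    {
        if (code == null || code.Length != 4)
            throw new ArgumentException("A FourCC needs exactly four characters.");
        return (code[0] & 0xFF)
             | ((code[1] & 0xFF) << 8)
             | ((code[2] & 0xFF) << 16)
             | ((code[3] & 0xFF) << 24);
    }

    // Unpack the four bytes of the int back into a four-character string.
    public static string FromFourCC(int fourCC)
    {
        char[] chars = new char[4];
        for (int i = 0; i < 4; ++i)
            chars[i] = (char)((fourCC >> (8 * i)) & 0xFF);
        return new string(chars);
    }

    static void Main()
    {
        Console.WriteLine(ToFourCC("abcd").ToString("X8"));           // 64636261
        Console.WriteLine(FromFourCC(0x64636261));                    // abcd
        Console.WriteLine("[" + FromFourCC(ToFourCC("AVI ")) + "]");  // [AVI ]
    }
}
```

Comparing FourCCs as ints is cheaper than comparing strings, which is why constants are usually defined once and reused.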

What is the ‘AVI’ file format?

AVI RIFF Form

AVI files are identified by the FourCC 'AVI ' in the RIFF header. All AVI files include two mandatory LIST chunks, which define the format of the streams and the stream data, respectively. An AVI file might also include an index chunk, which gives the location of the data chunks within the file. An AVI file with these components has the following form:

RIFF ('AVI '
      LIST ('hdrl' ... )
      LIST ('movi' ... )
      ['idx1' (<AVI Index>) ]
     )

The 'hdrl' list defines the format of the data and is the first required LIST chunk. The 'movi' list contains the data for the AVI sequence and is the second required LIST chunk. The optional 'idx1' chunk contains the index. AVI files must keep these three components in the proper sequence.

Note: The OpenDML extensions define another type of index, identified by the FourCC 'indx'.

The 'hdrl' and 'movi' lists use subchunks for their data. The following example shows the AVI RIFF form expanded with the chunks needed to complete these lists:

RIFF ('AVI '
      LIST ('hdrl'
            'avih'(<MAIN AVI Header>)
            LIST ('strl'
                  'strh'(<STREAM header>)
                  'strf'(<STREAM format>)
                  [ 'strd'(<ADDITIONAL header data>) ]
                  [ 'strn'(<STREAM name>) ]
                  ...
                 )
             ...
           )
      LIST ('movi'
            {SubChunk | LIST ('rec '
                              SubChunk1
                              SubChunk2
                              ...
                             )
               ...
            }
            ...
           )
      ['idx1' (<AVI Index>) ]
     )

For more information about the AVI format, see John McGowan’s AVI Overview and the OpenDML AVI extensions.

What does the RIFF parser do?

Given a RIFF file, the parser iterates through the various elements in the file, calling your specific delegates when elements are encountered.

Two example programs are provided (both as Visual Studio .NET 2003 solutions):

  • RIFFParserDemo – a console application that outputs all the elements in a given RIFF file.
  • RIFFParserDemo2 – a Windows App that examines RIFF files. If the file examined is an AVI or a WAV, the app displays additional information extracted from the RIFF elements.

Using the RIFF parser

First, create a new RiffParser object.

rp = new RiffParser();

Then, attempt to open the RIFF file.

rp.OpenFile(filename);

If no exception was thrown, the file is a valid RIFF file and you can read its format and type through the FileRIFF and FileType properties. Note that both values are FourCC codes. To read the codes in string format, use the FromFourCC static method:

public static string FromFourCC(int FourCC)

For example:

txtFileFormat.Text = RiffParser.FromFourCC(rp.FileRIFF);
txtFileType.Text = RiffParser.FromFourCC(rp.FileType);

Once the file type is established, read the elements in the file using the ReadElement() method.

public bool ReadElement(ref int bytesleft, 
         ProcessChunkElement chunk, ProcessListElement list)

The ReadElement() method takes the following arguments:

  • A ref int specifying the number of bytes left in the current data chunk (initially, the length of data in the file).
  • A delegate to be called when a chunk element is encountered.
  • A delegate to be called when a list element is encountered.

The method returns false when the end of data is reached.

Why is the bytesleft parameter passed by reference? The byte count is reduced to correctly represent the amount of data left in the current list/chunk. Passing the byte count by reference allows the method caller to possibly skip the rest of the data at this 'child' level and go on to read the next 'parent' level element.

An example using ReadElement():

int length = Parser.DataSize;

RiffParser.ProcessChunkElement pdc = 
     new RiffParser.ProcessChunkElement(ProcessAVIChunk);
RiffParser.ProcessListElement pal = 
    new RiffParser.ProcessListElement(ProcessAVIList);

while (length > 0) 
{
    if (false == Parser.ReadElement(ref length, pdc, pal)) break;
}

When done processing the file, call CloseFile().

Handling RIFF elements

Handling chunk data

public delegate void ProcessChunkElement(RiffParser rp, int FourCCType, 
       int unpaddedLength, int paddedLength);

The ProcessChunkElement delegate is invoked with four arguments:

  • A reference to the RiffParser making the call.
  • An int specifying the FourCC code for the chunk.
  • Two ints specifying the unpadded and padded length for the chunk data. RIFF data is always WORD aligned, so even if the chunk contains an odd number of bytes, an even number of bytes must be skipped to access the next element.

The chunk data can either be read or skipped, depending on the circumstance.

Read a chunk:

if (AviRiffData.ckidAVIISFT == FourCC)
{
    Byte[] ba = new byte[paddedLength];
    rp.ReadData(ba, 0, paddedLength);
    StringBuilder sb = new StringBuilder(unpaddedLength);
    for (int i = 0; i < unpaddedLength; ++i) 
    {
        if (0 != ba[i]) sb.Append((char)ba[i]);
    }

    m_isft = sb.ToString();
}

Skip a chunk:

// Unknown chunk - skip
rp.SkipData(paddedLength);

Handling LIST data

public delegate void ProcessListElement(RiffParser rp, int FourCCType, int length);

The ProcessListElement delegate is invoked with three arguments:

  • A reference to the calling RiffParser.
  • An int specifying the FourCC code for the list.
  • An int containing the length of the list data.

The list can then be skipped,

rp.SkipData(length);

or each element can be processed by calling ReadElement(), possibly with new delegates to handle the elements in the list.

RiffParser.ProcessChunkElement pnc = 
    new RiffParser.ProcessChunkElement(ProcessNestedChunk);
RiffParser.ProcessListElement pnl = 
    new RiffParser.ProcessListElement(ProcessNestedList);

while (length > 0) 
{
    if (false == rp.ReadElement(ref length, pnc, pnl)) break;
}

FourCC conversions

Four static methods are available to ease conversion from and to FourCC ints:

public static string FromFourCC(int FourCC)
public static int ToFourCC(string FourCC)
public static int ToFourCC(char[] FourCC)
public static int ToFourCC(char c0, char c1, char c2, char c3)

The method I use most is FromFourCC(). ToFourCC() is handy for defining FourCC constants, for example:

// AVI section FourCC codes
public static readonly int ckidAVIHeaderList = RiffParser.ToFourCC("hdrl");
public static readonly int ckidMainAVIHeader = RiffParser.ToFourCC("avih");
public static readonly int ckidODML = RiffParser.ToFourCC("odml");
public static readonly int ckidAVIExtHeader = RiffParser.ToFourCC("dmlh");
public static readonly int ckidAVIStreamList = RiffParser.ToFourCC("strl");
public static readonly int ckidAVIStreamHeader = RiffParser.ToFourCC("strh");
public static readonly int ckidStreamFormat = RiffParser.ToFourCC("strf");
public static readonly int ckidAVIOldIndex = RiffParser.ToFourCC("idx1");
public static readonly int ckidINFOList = RiffParser.ToFourCC("INFO");
public static readonly int ckidAVIISFT = RiffParser.ToFourCC("ISFT");

Unsafe and Fixed – are they needed?

RIFF files are binary files. Reading them one character at a time carries a significant performance penalty. The data structures stored in the files are designed to be loaded into memory and then referenced using fixed-size C structs. For example, an AVIMAINHEADER struct is defined as:

typedef struct _avimainheader {
    FourCC fcc;
    DWORD  cb;
    DWORD  dwMicroSecPerFrame;
    DWORD  dwMaxBytesPerSec;
    DWORD  dwPaddingGranularity;
    DWORD  dwFlags;
    DWORD  dwTotalFrames;
    DWORD  dwInitialFrames;
    DWORD  dwStreams;
    DWORD  dwSuggestedBufferSize;
    DWORD  dwWidth;
    DWORD  dwHeight;
    DWORD  dwReserved[4];
} AVIMAINHEADER;

In C++ (or C) you would:

void DecodeAVIHeader(std::istream& stream)
{
    // Read the raw bytes into a buffer the size of the struct
    char data[sizeof(AVIMAINHEADER)];

    stream.read(data, sizeof(AVIMAINHEADER));

    // Reinterpret the buffer as the struct
    AVIMAINHEADER* avi = (AVIMAINHEADER*)data;
    // Reference the struct members directly
    int totalFrames = avi->dwTotalFrames;
    …
}

But in C#, in managed code – we cannot do such tricks. Are we limited to reading a single byte at a time and doing a lot of work to decode the data?

This is where fixed and /unsafe come in. The fixed keyword allows us to ‘fix’ a piece of managed data in memory, guaranteeing that the data will not be moved or collected by the garbage collector. Once the data is fixed in memory, pointers to the data can be (relatively safely) manipulated and the data directly accessed. fixed is like the Unix pin and unpin wrapped in a using statement. Using fixed requires compiling with the /unsafe switch (or setting ‘Allow Unsafe Code Blocks’ to true in the Visual Studio project Configuration Properties page).

private unsafe void DecodeAVIHeader(RiffParser rp, int unpaddedLength, int length)
{
    byte[] ba = new byte[length];

    rp.ReadData(ba, 0, length);

    fixed (byte* bp = &ba[0])
    {
        AVIMAINHEADER* avi = (AVIMAINHEADER*)bp;
        m_frameRate = avi->dwMicroSecPerFrame;
        …
    }
}

The managed data structure remains at the same memory location and is safe from collection as long as we are in the fixed block. Nothing is guaranteed once we leave the fixed block, so please do not keep any references to pointers or data that might no longer be there! Copy out the needed data and use the copy once outside the fixed block.

Reading RIFF data (file access)

Reading the RIFF header

// Read the RIFF header
m_stream = new FileStream(m_filename, FileMode.Open, 
     FileAccess.Read, FileShare.Read);
int FourCC;
int datasize;
int fileType;

ReadTwoInts(out FourCC, out datasize);
ReadOneInt(out fileType);

Reading a RIFF element.

int FourCC;
int size;

ReadTwoInts(out FourCC, out size);

...

// Examine the element, is it a list or a chunk
string type = FromFourCC(FourCC);
if (0 == String.Compare(type, LIST4CC))
{
    // We have a list
    ReadOneInt(out FourCC);

    if (null == list)
    {
        SkipData(size - 4);
    }
    else
    {
         // Invoke the list method
         list(this, FourCC, size - 4);
    }

    // Adjust size
    bytesleft -= size;
}
else
{
    // Calculate padded size - padded to WORD boundary
    int paddedSize = size;
    if (0 != (size & 1)) ++paddedSize;

    if (null == chunk)
    {
        SkipData(paddedSize);
    }
    else
    {
        chunk(this, FourCC, size, paddedSize);
    }

    // Adjust size
    bytesleft -= paddedSize;
}
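
To try the list-or-chunk logic end to end without the RiffParser class, here is a self-contained sketch that walks a tiny RIFF file built in memory. Everything in it is an assumption made for illustration: the names RiffWalkSketch, Walk, BuildSample and Describe are hypothetical, the sample file with its 'TEST' type is made up, and BinaryReader is used in place of the parser's own read methods. It performs no validation of corrupt sizes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class RiffWalkSketch
{
    // Walks 'bytesLeft' bytes of elements, recording one line per element.
    public static void Walk(BinaryReader reader, int bytesLeft, int depth, List<string> output)
    {
        while (bytesLeft >= 8)
        {
            string id = Encoding.ASCII.GetString(reader.ReadBytes(4));
            int size = reader.ReadInt32();       // BinaryReader reads little-endian
            int padded = size + (size & 1);      // pad to WORD boundary
            long end = reader.BaseStream.Position + padded;
            string indent = new string(' ', depth * 2);

            if (id == "LIST")
            {
                // A list: read listType, then descend into listData
                string listType = Encoding.ASCII.GetString(reader.ReadBytes(4));
                output.Add(indent + "LIST " + listType + " (" + (size - 4) + ")");
                Walk(reader, size - 4, depth + 1, output);
            }
            else
            {
                // A chunk: report it and leave the data unread
                output.Add(indent + id + " (" + size + ")");
            }

            reader.BaseStream.Position = end;    // move to the next element
            bytesLeft -= 8 + padded;             // header plus padded data
        }
    }

    // Builds a tiny, made-up RIFF file in memory for the demonstration.
    public static byte[] BuildSample()
    {
        MemoryStream ms = new MemoryStream();
        BinaryWriter w = new BinaryWriter(ms);
        w.Write(Encoding.ASCII.GetBytes("RIFF")); w.Write(38);
        w.Write(Encoding.ASCII.GetBytes("TEST"));
        w.Write(Encoding.ASCII.GetBytes("LIST")); w.Write(16);
        w.Write(Encoding.ASCII.GetBytes("hdrl"));
        w.Write(Encoding.ASCII.GetBytes("abcd")); w.Write(3);
        w.Write(new byte[] { 1, 2, 3, 0 });      // 3 data bytes + 1 pad byte
        w.Write(Encoding.ASCII.GetBytes("data")); w.Write(2);
        w.Write(new byte[] { 9, 9 });
        w.Flush();
        return ms.ToArray();
    }

    // Reads the RIFF header, then walks everything after the fileType.
    public static List<string> Describe(byte[] file)
    {
        List<string> output = new List<string>();
        BinaryReader reader = new BinaryReader(new MemoryStream(file));
        reader.ReadBytes(4);                     // 'RIFF'
        int fileSize = reader.ReadInt32();
        reader.ReadBytes(4);                     // fileType
        Walk(reader, fileSize - 4, 0, output);
        return output;
    }

    static void Main()
    {
        foreach (string line in Describe(BuildSample())) Console.WriteLine(line);
    }
}
```

Seeking to a precomputed end position after each element plays the same role as adjusting bytesleft in the listing above: the caller can ignore whatever a nested level left unread and still land on the next sibling.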

Reading two ints (note use of the unsafe and fixed keywords).

public unsafe void ReadTwoInts(out int FourCC, out int size)
{
  try {
    int readsize = m_stream.Read(m_eightBytes, 0, TWODWORDSSIZE);

    if (TWODWORDSSIZE != readsize) {
      throw new RiffParserException("Unable to read. Corrupt RIFF file " + 
         FileName);
    }

    fixed (byte* bp = &m_eightBytes[0]) {
      FourCC = *((int*)bp);
      size = *((int*)(bp + DWORDSIZE));
    }
  }
  catch (Exception ex)
  {
    throw new RiffParserException("Problem accessing RIFF file " + FileName, ex);
  }
}
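
For comparison, the same eight bytes can be decoded without unsafe code by using BitConverter.ToInt32. This is an alternative sketch, not the approach the parser takes; SafeReadSketch and DecodeTwoInts are hypothetical names. Like the pointer cast, BitConverter uses the byte order of the machine, so the result matches the RIFF layout only on little-endian systems.

```csharp
using System;

static class SafeReadSketch
{
    // Decode the eight header bytes (FourCC + size) without unsafe code.
    public static void DecodeTwoInts(byte[] eightBytes, out int fourCC, out int size)
    {
        if (eightBytes == null || eightBytes.Length < 8)
            throw new ArgumentException("Need at least eight bytes.");

        fourCC = BitConverter.ToInt32(eightBytes, 0);   // bytes 0..3
        size = BitConverter.ToInt32(eightBytes, 4);     // bytes 4..7
    }

    static void Main()
    {
        // 'LIST' followed by a size of 16, stored little-endian
        byte[] header = { 0x4C, 0x49, 0x53, 0x54, 0x10, 0, 0, 0 };
        int fourCC, size;
        DecodeTwoInts(header, out fourCC, out size);
        Console.WriteLine(fourCC.ToString("X8") + " " + size);
    }
}
```

The trade-off is an extra bounds-checked call per field, which matters little for an eight-byte header but adds up when decoding large structs field by field.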

A basic RIFF parser

Following is the complete source code for a simple parser which displays all the elements in a RIFF file:

using System;
using System.Text;

namespace RiffParserDemo
{
    class RiffParserDemo
    {
        // Parse a RIFF file
        static void Main(string[] args)
        {
            // Create a parser instance
            RiffParser rp = new RiffParser();
            try 
            {
                string filename = @"C:\Program Files\Microsoft" +
                   " Visual Studio .NET 2003\Common7\Graphics\videos\BLUR24.avi";
                //string filename = @"C:\WINNT\Media\Chimes.wav"
                if (0 != args.Length)  
                {
                    filename = args[0];
                }
                    
                // Specify a file to open
                rp.OpenFile(filename);

                // If we got here - the file is valid. 
                //Output information about the file
                Console.WriteLine("File " + rp.ShortName + 
                    " is a \"" + RiffParser.FromFourCC(rp.FileRIFF)+ 
                    "\" with a specific type of \"" + 
                    RiffParser.FromFourCC(rp.FileType) + "\"");

                // Store the size to loop on the elements
                int size = rp.DataSize;

                // Define the processing delegates
                RiffParser.ProcessChunkElement pc = 
                     new RiffParser.ProcessChunkElement(ProcessChunk);
                RiffParser.ProcessListElement pl = 
                     new RiffParser.ProcessListElement(ProcessList);

                // Read all top level elements and chunks
                while (size > 0)
                {
                    // Prefix the line with the current top level type
                    Console.Write(RiffParser.FromFourCC(rp.FileType) + 
                          " (" + size.ToString() + "): ");
                    // Get the next element (if there is one)
                    if (false == rp.ReadElement(ref size, pc, pl)) break;
                }
                // Close the stream
                rp.CloseFile();
                Console.WriteLine();
            }
            catch (Exception ex)
            {
                Console.WriteLine("-----------------");
                Console.WriteLine("Problem: " + ex.ToString());
            }
            Console.WriteLine("\r\nDone. Press 'Enter' to exit.");
            Console.ReadLine();
        }

        // Process a RIFF list element (list sub elements)
        public static void ProcessList(RiffParser rp, int FourCC, int length)
        {
            string type = RiffParser.FromFourCC(FourCC);
            Console.WriteLine("Found list element of type \"" + 
                  type + "\" and length " + length.ToString());

            // Define the processing delegates
            RiffParser.ProcessChunkElement pc = 
                new RiffParser.ProcessChunkElement(ProcessChunk);
            RiffParser.ProcessListElement pl = 
                new RiffParser.ProcessListElement(ProcessList);

            // Read all the elements in the current list
            try {
                while (length > 0) {
                    // Prefix each line with the type of the current list
                    Console.Write(type + " (" + length.ToString() + "): ");
                    // Get the next element (if there is one)
                    if (false == rp.ReadElement(ref length, pc, pl)) break;
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine("Problem: " + ex.ToString());
            }
        }

        // Process a RIFF chunk element (skip the data)
        public static void ProcessChunk(RiffParser rp, 
              int FourCC, int length, int paddedLength)
        {
            string type = RiffParser.FromFourCC(FourCC);
            Console.WriteLine("Found chunk element of type \"" + 
                type + "\" and length " + length.ToString());

            // Skip data and update bytesleft
            rp.SkipData(paddedLength);
        }
    }
}

Extras

The file AviRiffData.cs contains C# compatible definitions for many AVI and WAV data structures. The file also contains many FourCC constants used in AVI and WAV files.

History

  • 6-Jun-2005

    Version 1.0 - Original release.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

gtamir
Web Developer
United States United States
Giora Tamir has been Architecting, Designing and Developing software and hardware solutions for over 15 years. As an IEEE Senior member and a talented developer, Giora blends software development, knowledge of protocols, extensive understanding of hardware and profound knowledge of both Unix and Windows based systems to provide a complete solution for both defense and commercial applications. Giora, also known as G.T., now holds the position of Principal Engineer for ProfitLine, Inc. architecting the next generation of .NET applications based on a Service-Oriented-Architecture.
 
Giora's areas of interest include distributed applications, networking and cryptography in addition to Unix internals and embedded programming.
 
Founded in 1992, ProfitLine manages hundreds of millions of dollars in annual telecom spend for its prestigious Fortune 1000 client base, such as Merrill Lynch, Charming Shoppes, Macromedia, CNA Financial Corporation, and Constellation Energy Group. ProfitLine's outsourced solution streamlines telecom administrative functions by combining a best practices approach with intelligent technology. For more information about ProfitLine, call 858.452.6800 or e-mail sales@profitline.com.

Last Updated 7 Jun 2005
Article Copyright 2005 by gtamir