Introduction

Reading binary data is not something you do every day. However, there are times when you need to read the raw bytes from a file and then parse them.

This article aims to provide the reader with an insight into one way this can be done.

Background

Over the past few years, I have been asked to develop systems that parse binary data. Examples include interfacing with a Rail Signaling System to provide passenger information, and I am now developing traffic volume reporting software that uses historical information from binary flat-files.

Both Rail Signaling Systems and Traffic Management Systems use binary files as a means of passing information to track switching controllers, actuated boom gates and traffic signals. They do this because one byte of data can hold the values of eight different on/off variables, one per bit. Packing several values into each byte lets the programmer compact the information being sent to the controllers, making the messages much more efficient.

So, for example, if you want to send information to a fictitious controller whose purpose is to switch on for 3 seconds and then close, you may have a record format as follows:

STX: 1 byte | Controller ID: 2 bytes | Action: 1 byte | ETX: 1 byte

Within the Action byte, you may have something like this:

Bit 7: Controller On|Off, Bit 6: Seconds|Minutes, Bits 5-0: Time period for action

The reader will note that using bits 0-5 for the time period means that the action can last for at most 63 time periods (six bits hold the values 0 to 63). The unit of each period is determined by bit 6, which tells us whether the periods are seconds or minutes.
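As a rough illustration of the bit layout above, the Action byte could be unpacked with masks along these lines. This is only a sketch in portable C++: the names Action and decode_action are made up for this example, and the polarity (a set bit means On, a set bit means Seconds) is an assumption that matches the example byte used in this article.

```cpp
#include <cstdint>

// Decoded view of the fictitious controller's Action byte.
struct Action {
    bool on;              // bit 7: 1 = On, 0 = Off (assumed polarity)
    bool seconds;         // bit 6: 1 = Seconds, 0 = Minutes (assumed polarity)
    std::uint8_t period;  // bits 5-0: number of time periods, 0..63
};

// Splits one Action byte into its three fields using bit masks.
Action decode_action(std::uint8_t b)
{
    Action a;
    a.on      = (b & 0x80) != 0;  // 1000 0000 isolates bit 7
    a.seconds = (b & 0x40) != 0;  // 0100 0000 isolates bit 6
    a.period  = b & 0x3F;         // 0011 1111 keeps bits 5-0
    return a;
}
```

Masking like this is what a proc()-style routine would typically do once it has the Action byte in hand.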

So, seeing as we want the controller on for 3 seconds, our Action byte will look as follows:

Action = 1100 0011 (bit 7 = 1 for On, bit 6 = 1 for Seconds, bits 5-0 = 00 0011 for 3), or in hex 0x0c3

NOTE: The leading zero after the 'x' is not necessary, but I like to use it for clarity. It allows me to separate the 'x' from the letters within hex, thus making it clearer to me at the very least.
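Going the other way, the example byte and the full five-byte record could be assembled as in the sketch below. Several details here are assumptions and not part of the fictitious specification: STX and ETX are given their usual ASCII values (0x02 and 0x03), the Controller ID is written high byte first, and the names encode_action and build_record are invented for illustration.

```cpp
#include <cstdint>
#include <vector>

const std::uint8_t STX = 0x02;  // ASCII start-of-text (assumed marker value)
const std::uint8_t ETX = 0x03;  // ASCII end-of-text (assumed marker value)

// Packs the three Action fields into one byte.
// Periods above 63 do not fit in six bits and are truncated by the mask.
std::uint8_t encode_action(bool on, bool seconds, unsigned period)
{
    return static_cast<std::uint8_t>(
        (on ? 0x80 : 0) | (seconds ? 0x40 : 0) | (period & 0x3F));
}

// Builds STX | Controller ID (2 bytes) | Action | ETX.
std::vector<std::uint8_t> build_record(std::uint16_t controller_id,
                                       std::uint8_t action)
{
    std::vector<std::uint8_t> rec;
    rec.push_back(STX);
    rec.push_back(static_cast<std::uint8_t>(controller_id >> 8));    // high byte first (assumption)
    rec.push_back(static_cast<std::uint8_t>(controller_id & 0xFF));  // low byte
    rec.push_back(action);
    rec.push_back(ETX);
    return rec;
}
```

With these helpers, encode_action(true, true, 3) yields 0xC3, the same Action byte worked out by hand above.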

We are now ready to start looking at what can be done to read such a file.

Using the code

In this section, we will learn how to open a binary file, read a byte from it, do some processing, and then close the file.

So, let's start with some code, and then we can explain it from there:

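#using <mscorlib.dll>
using namespace System;
using namespace System::IO;

// Your record-specific processing, defined elsewhere.
void proc(unsigned int c);

// Reads the file one byte at a time and passes each byte to proc().
void proc_file(String *filename)
{
  BinaryReader *br = new BinaryReader(File::OpenRead(filename));
  unsigned int c;

  try
  {
    while(true)
    {
      // ReadByte() throws EndOfStreamException at the end of the file
      c = (unsigned int)br->ReadByte();
      proc(c);
    }
  }
  catch(EndOfStreamException *e)
  {
    //just close the reader...
    br->Close();
  }
}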

We begin by opening the file with the BinaryReader class. We then set up a try/catch block containing a loop that never ends by itself. Instead, once ReadByte() attempts to read past the end of the file, it throws an EndOfStreamException, and the catch block closes the reader. The proc(c) function simply processes the byte according to the specification you are working to.
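For comparison, the same read-and-process loop can be sketched in portable standard C++, outside .NET. This is not the BinaryReader approach described above: it uses std::ifstream opened in binary mode, and the end of the file is detected by the read failing and not by an exception. The callback parameter and the return convention are choices made for this example only.

```cpp
#include <fstream>
#include <functional>
#include <string>

// Reads the file one byte at a time and hands each byte to `proc`.
// Returns the number of bytes processed, or -1 if the file cannot be opened.
long proc_file(const std::string& filename,
               const std::function<void(unsigned char)>& proc)
{
    // std::ios::binary stops the runtime from translating line endings.
    std::ifstream in(filename.c_str(), std::ios::binary);
    if (!in) return -1;

    long count = 0;
    char c;
    while (in.get(c)) {                       // fails cleanly at end of file
        proc(static_cast<unsigned char>(c));  // avoid sign extension of bytes >= 0x80
        ++count;
    }
    return count;  // the stream closes itself when `in` goes out of scope
}
```

The cast to unsigned char plays the same role as the (unsigned int) cast in the .NET version: a byte such as 0xC3 should reach the processing code as 195, not as a negative number.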

BinaryReader has many more 'reader' methods for other types, including ReadBoolean() and ReadChar(). This is a class that needs to be explored based on the work you're attempting to complete.

Points of Interest

The most interesting aspect of this is that originally I was using StreamReader to open the file and then used the StreamReader::ReadToEnd() function to collect all the data into a single String* variable. However, it kept 'skipping' data. I suddenly realized that it was the BS (^H/0x08) and the LF (^J/0x0a) characters that were missing from the data, and I put that down to the fact that the String* variable was 'acting' on those characters. StreamReader is designed for text: it decodes the bytes into characters using an encoding, so it is not a safe way to carry arbitrary binary data.

I then attempted to use good ol' C, but the .NET environment was not happy about me doing this, and told me so in no uncertain terms. However, congrats to the designers of .NET: they realized that we were going to need this feature, even if only rarely, and it has been included through the BinaryReader and BinaryWriter classes.

So, my conclusion is that there is normally a .NET way of doing things, although finding it is sometimes a battle. As far as reading and writing binary data is concerned, BinaryReader and BinaryWriter are the classes you need.