Reading Binary Data from a file

18 Feb 2004
Allows a developer to use the .NET way of reading binary files.

Introduction

Reading binary data is not something you do every day. However, there are times when you need to read the raw bytes from a file and then parse them.

This article aims to provide the reader with an insight into one way this can be done.

Background

Over the past few years, I have been asked to develop systems that parse binary data. Examples include interfacing with a Rail Signaling System to provide passenger information, and I am now developing traffic volume reporting software that uses historical information from binary flat files.

Both Rail Signaling Systems and Traffic Management Systems use binary files as a means of passing information to track switching controllers, actuated boom gates and traffic signals. They do this because one byte of data can hold the values of up to 8 different on/off variables, one per bit. Packing several fields into a few bytes compacts the information being sent to the controllers, making the messages much more efficient.

So, for example, if you want to send information to a fictitious controller whose purpose is to switch on for 3 seconds and then close, you may have a record format as follows:

STX: 1 byte | Controller ID: 2 bytes | Action: 1 byte | ETX: 1 byte

Within the Action byte, you may have something like this:

Bit 7: Controller On|Off, Bit 6: Seconds|Minutes, Bits 5-0: Time period for action

The reader will note that using bits 0-5 for the time period means that the action can last for at most 63 time periods (six bits hold the values 0 to 63). Bit 6 determines the unit of those periods: either seconds or minutes.

So, seeing as we want the controller on for 3 seconds, and taking a set bit 6 to mean seconds, our Action byte will look as follows:

Action = 1100 0011 or in hex 0x0c3

NOTE: The leading zero after the 'x' is not necessary, but I like to use it for clarity. It allows me to separate the 'x' from the letters within hex, thus making it clearer to me at the very least.
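
The bit layout above can be sketched in standard C++ (not Managed C++) as a pair of pack/unpack helpers. This is only an illustration: the names `make_action`, `action_is_on`, `action_in_seconds` and `action_period` are hypothetical, and the sketch assumes that a set bit 6 means seconds, as in the example value.

```cpp
#include <cstdint>

// Bit layout of the Action byte, as described in the text.
constexpr std::uint8_t kOnBit      = 0x80; // bit 7: controller on (1) / off (0)
constexpr std::uint8_t kSecondsBit = 0x40; // bit 6: assumed 1 = seconds, 0 = minutes
constexpr std::uint8_t kPeriodMask = 0x3F; // bits 5-0: time period, 0..63

// Pack the three fields into one byte. Periods above 63 are truncated by the mask.
constexpr std::uint8_t make_action(bool on, bool seconds, unsigned period)
{
    return static_cast<std::uint8_t>((on ? kOnBit : 0) |
                                     (seconds ? kSecondsBit : 0) |
                                     (period & kPeriodMask));
}

// Unpack the individual fields again.
constexpr bool     action_is_on(std::uint8_t a)      { return (a & kOnBit) != 0; }
constexpr bool     action_in_seconds(std::uint8_t a) { return (a & kSecondsBit) != 0; }
constexpr unsigned action_period(std::uint8_t a)     { return a & kPeriodMask; }

// "On for 3 seconds" gives the value from the text: 1100 0011.
static_assert(make_action(true, true, 3) == 0xC3, "on, seconds, period 3");
```

Because the helpers are `constexpr`, the `static_assert` checks the worked example at compile time.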

We are now ready to start looking at what can be done to read such a file.

Using the code

In this section, we will learn how to open a binary file, read a byte from it, do some processing, and then close the file.

So, let's start with some code, and then we can explain it from there:

MC++
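#using <mscorlib.dll>
using namespace System;
using namespace System::IO;

void proc(unsigned int c);  // your byte-processing routine, defined elsewhere

void proc_file(String *filename)
{
  BinaryReader *br = new BinaryReader(File::OpenRead(filename));
  unsigned int c;

  try
  {
    while(true)
    {
      // ReadByte() throws EndOfStreamException once we read past the EOF
      c = (unsigned int)br->ReadByte();
      proc(c);
    }
  }
  catch(EndOfStreamException *e)
  {
    // end of file reached: just close the reader...
    br->Close();
  }
}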

We begin by opening the file using the BinaryReader class. We then set up a try/catch block, within which we create a loop that never ends on its own. Once the loop attempts to read past the EOF, ReadByte() fails and throws an EndOfStreamException, which we catch in order to close the reader. The proc(c) function simply processes the byte as per the specification you are working to.

BinaryReader has many more read methods, including ReadBoolean() and ReadChar(). This is a class worth exploring based on the work you're attempting to complete.

Points of Interest

The most interesting aspect of this is that originally I was using StreamReader to open the file and then used the StreamReader::ReadToEnd() function to collect all the data into a single String* variable. However, it kept 'skipping' data. I suddenly realized that it was the BS (^H/0x08) and the LF (^J/0x0a) characters that were missing from the data, and I put that down to the fact that the String* variable was 'acting' on those characters. More generally, StreamReader decodes the bytes it reads as text in a character encoding, so it is not suited to arbitrary binary data.

I then attempted to use good ol' C, but the .NET environment was not happy about me doing this, and told me so in no uncertain terms. However, congrats to the designers of .NET, they realized that we were going to need this feature, even if it is only rarely, and it has been included through the BinaryReader/Writer classes.

So, my conclusion is that there normally exists a .NET way of doing things, although finding it is sometimes a battle. As far as reading and writing binary data is concerned, BinaryReader and BinaryWriter are the classes you need.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


Written By
Software Developer (Senior)
Australia
Nik is developing software again...

Comments and Discussions


Nic Please help with this function - Noob
TyroneS, 7-Jun-06 20:22

Re: Nic Please help with this function - Noob
Nik Vogiatzis, 2-Oct-06 0:42
hi tyrone,

sorry for taking so long to respond... i am in the middle of writing my thesis...

i will try some pseudocode for you rather than full on VB as it is not my language of trade and i am not that good at it...

tmpBab would have to be a 32-bit number... so, the first thing that is happening at:
TempYear = (tmpBab & 0xfe000000) >> 25; //Get upper most seven bits

here we are taking the top seven bits (the binary for 0xfe000000 = 11111110000000000000000000000000), and by using the 'bitwise and' operator we are zeroing out the low-order 25 bits.

then the '>> 25' moves the remaining seven bits down to the low end of the integer, that way we get a number between 0 and 127. this number then gets added to '1940', to give us the year. this would be a system specific thing and somewhere along the line someone decided that the epoch for this system is 1940.

the line "TimeStamp->operator =(RecodeDateTime(*TimeStamp, TempYear + 1940, 1, 1, 0, 0, 0, 0));" simply creates a timestamp object that has the correct year in it... all the 'function' stuff that is going on is just some library calls necessary to create the timestamp (in C/C++ we don't have all the nice builtin functions and data types available in languages such as VB or SQL)...

the same sort of thing is happening in the line "TimeStamp->operator +=((double)(tmpBab & 0x01ffffff)/(24*60*60)); // 1 second = 1/(24*60*60) days", where they are taking the bottom 25 bits (0x01ffffff = 00000001111111111111111111111111) as these give them the number of seconds out of the field passed. this is added to the date and now they have a time stamp...

this is now returned to the calling function...
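
a minimal sketch of the same split in standard C++ may help here. it assumes the layout described above (top 7 bits = years since 1940, low 25 bits = seconds into that year); the `PackedStamp` struct and `unpack_stamp` function are made-up names for illustration and are not part of the original function or its library...

```cpp
#include <cstdint>

struct PackedStamp {
    unsigned      year;            // calendar year
    std::uint32_t seconds_in_year; // seconds elapsed since 1 January of that year
};

// Assumed layout: top 7 bits = years since 1940, low 25 bits = seconds into the year.
constexpr PackedStamp unpack_stamp(std::uint32_t raw)
{
    return PackedStamp{ ((raw & 0xFE000000u) >> 25) + 1940u, // mask, then shift down
                        raw & 0x01FFFFFFu };                 // low bits need no shift
}

// 64 in the top seven bits plus 86400 seconds (one day) in the low 25 bits.
static_assert(unpack_stamp(0x80015180u).year == 2004, "1940 + 64");
static_assert(unpack_stamp(0x80015180u).seconds_in_year == 86400, "one day of seconds");
```

dividing `seconds_in_year` by 24*60*60 gives the fraction of days that the original code adds to the date.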

the big thing to remember is that if you want to 'pull off' a section of an integer then you can use the 'bitwise and' (&) operator to take only those bits you are interested in... if you are taking something from a portion that does not include the lowest order bit, then you need to right-shift the value so that you can place the integer of interest into a variable correctly...

for example, if my number is 75 (dec) = 01001011 (bin) = 4B (hex), and i am interested in the top 4 bits, which my protocol tells me make up the number of nodes in my mini-cluster, then i will need to take the top 4 bits off as follows (pseudo-code-esque VB):

dim myNum as integer
dim newNum as integer

myNum = 0x4B (all numbers in C/C++ are expressed either in decimal, octal or hexadecimal, for this sort of stuff we tend to use hexadecimal, but there is no reason why you can't express it as decimal or octal)

newNum = (myNum & 0x0f0) >> 4 (we have taken the top 4 bits, and moved them down 4 places so it is the correct integer...)


lets pause for a moment to see what would happen if we don't right shift in this case...

myNum & 0x0f0 = 0x40 = 64

and with the right shift we get

(myNum & 0x0f0) >> 4 = 0x04 = 4

a significant difference... so if we are only interested in the top 4 bits then the answer is '4' in this case...

if we wanted the lower 4 bits then there would be no need to 'right-shift' at all...
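
the mask-and-shift arithmetic above can be checked with a few lines of standard C++. this is just a sketch: `high_nibble` and `low_nibble` are made-up helper names. it also shows why the parentheses around the mask matter, because in C/C++ the >> operator binds tighter than &...

```cpp
#include <cstdint>

// Take the top 4 bits of a byte and move them down 4 places.
constexpr unsigned high_nibble(std::uint8_t v) { return (v & 0xF0u) >> 4; }

// The bottom 4 bits already sit at the low end, so a mask is enough.
constexpr unsigned low_nibble(std::uint8_t v)  { return v & 0x0Fu; }

static_assert((0x4B & 0xF0) == 0x40, "mask only: 64");
static_assert(high_nibble(0x4B) == 4, "mask, then shift down: 4");
static_assert(low_nibble(0x4B) == 0x0B, "low nibble needs no shift: 11");

// Without parentheses this is evaluated as 0x4B & (0xF0 >> 4), i.e. 0x4B & 0x0F.
static_assert((0x4B & 0xF0 >> 4) == 0x0B, "precedence pitfall: 11, not 4");
```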

i really hope this has answered your question, and sorry again for taking me so long...

cheers
nik

Nik Vogiatzis
PhD Candidate: University of South Australia
+++++++++++++++++++++++++++
Developing new generation Urban Traffic Control systems

