Hi all. I've got a byte array, someArray(240) As Byte, which I need to unpack into Int32 and Int16 integers. I know before I unpack:
1. that all data in someArray are integer values
2. what the startposition of each integer is in someArray
3. what the length of each integer is: Int32 or Int16

Currently I have a function which unpacks in the following way:
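Function Unpack(someArray() As Byte) As Integer()
	Dim i32(3) As Byte 'VB array bounds are inclusive: indices 0..3 = 4 bytes
	Dim i16(1) As Byte 'indices 0..1 = 2 bytes
	Dim icount As Integer = 75

	Dim ivalues(icount - 1) As Integer 'indices 0..74 = 75 values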

	'field1: pos=0 len=4
	Array.Copy(someArray,0,i32,0,4)
	ivalues(0) = Int32FromByteArray(i32,bigEndian)

	'field2: pos=4 len=2
	Array.Copy(someArray,4,i16,0,2)
	ivalues(1) = Int16FromByteArray(i16,bigEndian)

	'field3: pos=6 len=4
	Array.Copy(someArray,6,i32,0,4)
	ivalues(2) = Int32FromByteArray(i32,bigEndian)

	'etc... until all 75 integers are unpacked from someArray

	Return ivalues
End Function


Functions Int32FromByteArray and Int16FromByteArray are functions which use the BitConverter class to convert from byte to integer taking into account that the number has to be re-ordered from BigEndian to LittleEndian.
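The two helper functions are not shown in the question. As a rough C# sketch of what such a helper might look like (the signature and the meaning of the bigEndian flag are assumptions, not the asker's actual code), the byte order only needs reversing when the source order differs from the machine's native order:

```csharp
using System;

public static class EndianHelpers
{
    // Hypothetical reconstruction: the original helper is not shown in the question.
    // 'bytes' is assumed to hold exactly 4 bytes; 'bigEndian' describes the source order.
    public static int Int32FromByteArray(byte[] bytes, bool bigEndian)
    {
        // BitConverter reads in the machine's native order, so reverse a copy
        // whenever the source order differs from it.
        if (bigEndian == BitConverter.IsLittleEndian)
        {
            byte[] copy = (byte[])bytes.Clone();
            Array.Reverse(copy);
            bytes = copy;
        }
        return BitConverter.ToInt32(bytes, 0);
    }
}
```

Note that this allocates and reverses a temporary array on every call, on top of the Array.Copy in Unpack.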

Now I realize this is a very 'hard coded' way to unpack but I am able to do this because the positions and lengths of the integers in someArray are standardized.

My question: is there a way to unpack the byte array someArray directly into the integer array ivalues? Something like ivalues = someArray.SplitToInteger(someMask)?

Obviously the SplitToInteger function should do something more clever than my current Unpack function does. And obviously it should speed up the unpacking process.

Answers in C# are welcome too of course. Thanks in advance for your thoughts...

Follow-up 1:
I picked solutions 1, 2 and 4, but thanks to everyone for the reactions. I tested the code of phil.o (solution 1) and it surely works in C#. I'm now working on the conversion to VB.NET, which is not trivial because the bit shift operators work a bit differently in the two languages (for those interested: .net - Binary Shift Differences between VB.NET and C# - Stack Overflow). I will combine phil.o's extensions with the Unpack function from F.Xaver in solution 4, but not use the BitConverter class, for reasons of endianness. I will post my wrap-up in VB.NET here once I get the VB bitshi(f)t going. Tnx!!

Follow-up 2:
And here it is in VB.NET. The trick in VB with the bitshift is to cast byte first with CInt and then apply the shift (this was already mentioned in the article referenced above, but I didn't pick it up (late) last night).
The big and little Int32 extensions in VB.NET are (btw the name someArray has changed to traceHeader):
<Extension()>
Public Function GetInt32BigEndian(traceHeader() As Byte, pos As Integer) As Integer
	Dim result As Integer = CInt(traceHeader(pos+0)) << 24 Or _
		CInt(traceHeader(pos+1)) << 16 Or _
		CInt(traceHeader(pos+2)) << 8 Or _
		CInt(traceHeader(pos+3))
	Return result
End Function

<Extension()>
Public Function GetInt32LittleEndian(traceHeader() As Byte, pos As Integer) As Integer
	Dim result As Integer = CInt(traceHeader(pos+3)) << 24 Or _
		CInt(traceHeader(pos+2)) << 16 Or _
		CInt(traceHeader(pos+1)) << 8 Or _
		CInt(traceHeader(pos))
	Return result
End Function


I'm going to post a solution myself which I picked up, using Linq. I wonder what you have to say about that...?

Follow-up 3 and close out:

I have tested the answers by processing a 6GB file containing 3,506,820 of the byte arrays described above (called someArray and traceHeader). The results are:
1. for 3,506,820 traceHeaders using my original code: 48.04 seconds
2. for the same using solution 1 (bit shifting): 25.28 seconds
3. for the same using my solution 5 using linq: 185.31 seconds
So, solution 1 is the clear winner. My 'solution 5' with linq is not a solution.

Rgds and thanks
Updated 22-Jan-16 3:27am

Solution 1

I can think of two extension methods which would extract Int32 and Int16 respectively, using binary shift operations to get the results:
public static int GetInt32(this byte[] array, int pos) {
   int result = 0;
   result |= array[pos++] << 24;
   result |= array[pos++] << 16;
   result |= array[pos++] << 8;
   result |= array[pos];
   return result;
}

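public static short GetInt16(this byte[] array, int pos) {
   // Shifting a byte yields an int in C#, so build the value as an int
   // and cast to short once at the end.
   int result = array[pos++] << 8;
   result |= array[pos];
   return unchecked((short)result);
}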


Usage:
int field1 = myArray.GetInt32(0);
short field2 = myArray.GetInt16(4);
int field3 = myArray.GetInt32(6);
// etc.


Of course, these extension methods must be declared in a static class.
Hope this helps. Regards.

PS: I did not include the array position validation.
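A minimal sketch of what the omitted position validation could look like for the Int32 case. The method name GetInt32Checked and the choice of exceptions are assumptions for illustration, not part of the answer above:

```csharp
using System;

public static class ByteArrayExtensions
{
    // Reads a big-endian Int32 starting at pos, after checking that
    // four bytes are actually available from that position.
    public static int GetInt32Checked(this byte[] array, int pos)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        if (pos < 0 || pos > array.Length - 4)
            throw new ArgumentOutOfRangeException(nameof(pos));
        return (array[pos] << 24) | (array[pos + 1] << 16)
             | (array[pos + 2] << 8) | array[pos + 3];
    }
}
```

The explicit check adds a little work per call; the runtime already throws IndexOutOfRangeException on a bad index, so the main gain is a clearer error before any byte is read.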
   
Comments
veen_rp 21-Jan-16 5:57am
   
Thanks, sounds good. I didn't think of an extension yet. I'll wait for some more reactions, but I'm going to try this out to see if it will speed up things.
Rgds.
phil.o 21-Jan-16 6:03am
   
You're welcome.
There's also one thing I did not take into account: for Int16 version, you may have to cast the right operand of bit-shifting operations to Int16 (I cannot test it as I do not have any IDE on my actual computer).
And sorry for the wrong language, I confess I've not been careful enough.
Sascha Lefèvre 21-Jan-16 5:58am
   
psst... he probably expects VB :)
... or not :)
My 5.
phil.o 21-Jan-16 6:02am
   
Thanks Sascha :)
veen_rp 21-Jan-16 6:03am
   
As I said in my original question: "Answers in C# are welcome too of course"... Just takes me a little longer, but I'll get there... :)
Sascha Lefèvre 21-Jan-16 6:07am
   
Oh right, didn't see that :)

The major performance opportunity comes from not copying from your source array into 4 or 2-byte arrays for each conversion but to have a "cursor" inside the source array and reading directly from there. So this is basically the best solution you can get without utilizing "unsafe" pointers.
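The "cursor" idea from the comment above can be sketched roughly as follows. The class name BigEndianCursor and its members are hypothetical, and big-endian source data is assumed, as in the question:

```csharp
using System;

// Hypothetical helper: keeps a cursor into the source array and reads
// big-endian values in place, without copying into temporary arrays.
public sealed class BigEndianCursor
{
    private readonly byte[] _data;
    private int _pos;

    public BigEndianCursor(byte[] data, int start = 0)
    {
        _data = data;
        _pos = start;
    }

    public int Position { get { return _pos; } }

    public int ReadInt32()
    {
        int value = (_data[_pos] << 24) | (_data[_pos + 1] << 16)
                  | (_data[_pos + 2] << 8) | _data[_pos + 3];
        _pos += 4; // advance past the four bytes just read
        return value;
    }

    public short ReadInt16()
    {
        short value = unchecked((short)((_data[_pos] << 8) | _data[_pos + 1]));
        _pos += 2; // advance past the two bytes just read
        return value;
    }
}
```

With the layout from the question, calling ReadInt32(), ReadInt16(), ReadInt32() in turn on a cursor over someArray would read the fields at positions 0, 4 and 6.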
veen_rp 21-Jan-16 6:43am
   
I think Sascha is right in pointing out that the performance penalty is in the Array.Copy.
Thanks so far, I will try this out shortly.
veen_rp 21-Jan-16 6:48am
   
Also, the advantage of phil.o's approach is that I can write dedicated extensions for BigEndian and LittleEndian. Hmm, interesting, I'll ponder over this...

Solution 2

Well...
int len = bytes.Length / 4;
int[] ints = new int[len];
int inp = 0;
for (int i = 0; i < len; i++)
    {
    ints[i] = (bytes[inp++] << 24) + (bytes[inp++] << 16) + (bytes[inp++] << 8) + bytes[inp++];
    }

You could get it faster by using unsafe code and pointers to avoid the array indexing, but it might not speed it up that much - depends on the optimiser and you'd need to time it to be sure.
   
Solution 3

I probably would have (in a C# sense)

a) used BitConverter (like you)

but the differences, ...

b) Defined a Class representing the 'target structure' for unpacking into (just before you bite my head off, read on)
c) Used Custom Attributes on the Class from (b) to define the positions to extract the data from into each item of the target class
d) Then used reflection to run a loop to unpack the data

Would this be any better than your method? I don't know. I'd have to see what other people say about BitConverter; as long as one takes 'endian-ness' into account, I don't see it as an issue.
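Steps (b) to (d) could be sketched roughly like this. Everything named here (HeaderFieldAttribute, TraceHeader, HeaderUnpacker) is hypothetical, only the first three fields from the question are shown, and the bytes are read with shifts rather than BitConverter because the data is big-endian:

```csharp
using System;
using System.Reflection;

// Hypothetical attribute: records where a field starts in the byte array.
[AttributeUsage(AttributeTargets.Property)]
public sealed class HeaderFieldAttribute : Attribute
{
    public int Position { get; private set; }
    public HeaderFieldAttribute(int position) { Position = position; }
}

// Hypothetical target class; a real one would declare all 75 fields.
public class TraceHeader
{
    [HeaderField(0)] public int Field1 { get; set; }
    [HeaderField(4)] public short Field2 { get; set; }
    [HeaderField(6)] public int Field3 { get; set; }
}

public static class HeaderUnpacker
{
    // Walks the annotated properties and fills each one from its position.
    public static T Unpack<T>(byte[] source) where T : class, new()
    {
        T target = new T();
        foreach (PropertyInfo prop in typeof(T).GetProperties())
        {
            var attr = (HeaderFieldAttribute)Attribute.GetCustomAttribute(
                prop, typeof(HeaderFieldAttribute));
            if (attr == null) continue; // property is not mapped to the array
            int p = attr.Position;
            if (prop.PropertyType == typeof(int))
                prop.SetValue(target, (source[p] << 24) | (source[p + 1] << 16)
                                    | (source[p + 2] << 8) | source[p + 3], null);
            else if (prop.PropertyType == typeof(short))
                prop.SetValue(target,
                    unchecked((short)((source[p] << 8) | source[p + 1])), null);
        }
        return target;
    }
}
```

Reflection on every header is likely to cost more than hard-coded reads, so for millions of arrays this would need timing before being adopted.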

Quote:
My question: is there a way to unpack the byte array someArray directly into the integer array ivalues? Something like ivalues = someArray.SplitToInteger(someMask)?
not that I've seen

Quote:
Obviously the SplitToInteger function should do something more clever that my current Unpack function does. And obviously it should speed up the unpacking process.


More clever? Why? How about adhering to KISS? I think BitConverter is 'fast enough' unless you really NEED to look at bit twiddling. In fact, have a look at this; somewhere in there is a speed comparison of the parsing approaches: What is the idiomatic c# for unpacking an integer from a byte array? - Stack Overflow

Probably not what you were looking for
   
Comments
veen_rp 21-Jan-16 6:11am
   
But thanks anyway! KISS is good, KISS + quicker is better... I indeed already have defined a class representing the target structure, I left it out for sake of simplicity wrt the question. This is, together with BitConverter already quite fast. I just want it faster...
Cheers!
Solution 4

Shouldn't something like this do the trick?

Public Function Unpack(someArray() As Byte) As Integer()
       Dim pos As Integer() = {0, 4, 6,......}        'startpositions
       Dim pos_type As Boolean() = {True, False, True,.....}  'True for int32, false for int16

       Dim target As New List(Of Integer)

       For x As Integer = 0 To pos.Length - 1
            If pos_type(x) Then
                 target.Add(BitConverter.ToInt32(someArray, pos(x)))
            Else
                 target.Add(BitConverter.ToInt16(someArray, pos(x)))
            End If
       Next

       Return target.ToArray
  End Function
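A rough C# variant of the table-driven loop above that reads big-endian values with shifts instead of BitConverter (which reads in the machine's native byte order). The class and table names are hypothetical, and only the first three fields from the question are listed:

```csharp
using System;

public static class TableUnpacker
{
    // Start positions and widths (in bytes) of the first three fields;
    // a real table would list all 75 entries.
    private static readonly int[] Positions = { 0, 4, 6 };
    private static readonly int[] Widths = { 4, 2, 4 };

    public static int[] Unpack(byte[] source)
    {
        int[] values = new int[Positions.Length];
        for (int i = 0; i < Positions.Length; i++)
        {
            int p = Positions[i];
            values[i] = Widths[i] == 4
                ? (source[p] << 24) | (source[p + 1] << 16)
                  | (source[p + 2] << 8) | source[p + 3]
                // Cast through short so negative 16-bit values keep their sign.
                : unchecked((short)((source[p] << 8) | source[p + 1]));
        }
        return values;
    }
}
```

Writing straight into a preallocated array also avoids the List(Of Integer) growth and the final ToArray copy.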
   
Comments
veen_rp 21-Jan-16 6:51am
   
This looks good as well, I'll try it out. Though I think, having to combine BitConverter with endianness, I'm better off with solution 1. Thanks anyway!
F. Xaver 21-Jan-16 10:43am
   
A mix of both may be a good idea; 75 copy & paste-like lines aren't that good looking ;)
veen_rp 21-Jan-16 12:19pm
   
True. I'm in the middle of testing stuff out. Let you know what the result is (any month now...)
veen_rp 22-Jan-16 1:44am
   
I picked answers 1 and 4, but thanks to everyone for the reactions. I tested the code of phil.o and it surely works in C#. I'm now working on the conversion to VB.NET, which is not trivial because the bit shift operators work a bit differently in the two languages (for those interested: http://stackoverflow.com/questions/8151333/binary-shift-differences-between-vb-net-and-c-sharp). I will combine phil.o's extensions with the Unpack function from F.Xaver in solution 4, but not use the BitConverter class, for reasons of endianness. I will post my wrap-up in VB.NET here once I get the VB bitshi(f)t going. Tnx!!

Solution 5

As promised, using Linq:

Dim pos As Integer = 0 'or some other cursor position in the byte array

Dim nrFromLittleEndian As Integer = BitConverter.ToInt32(trcHeader.Skip(pos).Take(4).ToArray,0)

Dim nrFromBigEndian As Integer = BitConverter.ToInt32(trcHeader.Skip(pos).Take(4).Reverse().ToArray,0)


I do not know (yet) if this will be as fast as the bitshifting but the code sure looks elegant...

Rgds,
   
Comments
veen_rp 22-Jan-16 8:18am
   
I tried this out myself. Though elegant, the execution is slow!! In fact much slower than my initial code. See the update of my original question for the timing of processing 3,506,820 byte arrays like this...
veen_rp 22-Jan-16 8:29am
   
I leave this solution in because of the use of linq that was new to me. But this is not a solution. See my follow-up 3 in the original post

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



