Click here to Skip to main content
15,888,113 members
Articles / Programming Languages / C# 4.0
Tip/Trick

High performance C# byte array to hex string to byte array

Rate me:
Please Sign up or sign in to vote.
4.61/5 (16 votes)
27 Aug 2012Ms-PL3 min read 99.7K   26   18
High performance C# byte array to hex string to byte array.

Introduction 

Conversion of bytes to hex string and vice versa is a common task with a variety of implementations. The performance key point for each to/from conversion is the (perpetual) repetition of the same if blocks and calculations that is the standard approach for all implementations I've seen. I'm using unsafe code in an effort to achieve the best possible performance. My concept is based on direct assignments. 

Byte array to hex string

By definition, the result of a byte array to hex string conversion is a combination/repetition of static values (0-255 or "00"-"FF"). In other words we can avoid calculations as well as formatting (leading zero for bytes 0-9). Due to the fact that each byte will be transformed to two characters (2 x 2 bytes), we may say that each byte is assigned to a 32-bit integer. Therefore for n bytes we'll define n integers instead of n*2 characters. 

Hex string to byte array 

Different case, different facts. Valid values within a hex string are the characters ranges '0'-'9', 'A'-'F' and 'a'-'f'. The existence of upper and lower case characters and the valid character ranges ASCII gaps, remove the concept of assignments by "instinct". Needless to mention that "instinct" is wrong in this case. The solution is two byte arrays with 103 values each where only 22 of them are valid (per array). As a result a pair of characters will become a byte using only 4 if conditions and 1 summation, while most implementations need 12 if conditions, 3 subtructions, 1 left shift or multiplication and 1 summation. 

The code

Source code includes comments line by line. These comments will also help you to easily adapt the code to a safe version that will be faster than any other prêt à porter code. 

C#
using System;
namespace DRDigit
{
    // class is sealed and not static in my personal complete version
    public unsafe sealed partial class Fast
    {
        #region from/to hex
        // assigned int values for bytes (0-255)
        static readonly int[] toHexTable = new int[] {
            3145776, 3211312, 3276848, 3342384, 3407920, 3473456, 3538992, 3604528, 3670064, 3735600,
            4259888, 4325424, 4390960, 4456496, 4522032, 4587568, 3145777, 3211313, 3276849, 3342385,
            3407921, 3473457, 3538993, 3604529, 3670065, 3735601, 4259889, 4325425, 4390961, 4456497,
            4522033, 4587569, 3145778, 3211314, 3276850, 3342386, 3407922, 3473458, 3538994, 3604530,
            3670066, 3735602, 4259890, 4325426, 4390962, 4456498, 4522034, 4587570, 3145779, 3211315,
            3276851, 3342387, 3407923, 3473459, 3538995, 3604531, 3670067, 3735603, 4259891, 4325427,
            4390963, 4456499, 4522035, 4587571, 3145780, 3211316, 3276852, 3342388, 3407924, 3473460,
            3538996, 3604532, 3670068, 3735604, 4259892, 4325428, 4390964, 4456500, 4522036, 4587572,
            3145781, 3211317, 3276853, 3342389, 3407925, 3473461, 3538997, 3604533, 3670069, 3735605,
            4259893, 4325429, 4390965, 4456501, 4522037, 4587573, 3145782, 3211318, 3276854, 3342390,
            3407926, 3473462, 3538998, 3604534, 3670070, 3735606, 4259894, 4325430, 4390966, 4456502,
            4522038, 4587574, 3145783, 3211319, 3276855, 3342391, 3407927, 3473463, 3538999, 3604535,
            3670071, 3735607, 4259895, 4325431, 4390967, 4456503, 4522039, 4587575, 3145784, 3211320,
            3276856, 3342392, 3407928, 3473464, 3539000, 3604536, 3670072, 3735608, 4259896, 4325432,
            4390968, 4456504, 4522040, 4587576, 3145785, 3211321, 3276857, 3342393, 3407929, 3473465,
            3539001, 3604537, 3670073, 3735609, 4259897, 4325433, 4390969, 4456505, 4522041, 4587577,
            3145793, 3211329, 3276865, 3342401, 3407937, 3473473, 3539009, 3604545, 3670081, 3735617,
            4259905, 4325441, 4390977, 4456513, 4522049, 4587585, 3145794, 3211330, 3276866, 3342402,
            3407938, 3473474, 3539010, 3604546, 3670082, 3735618, 4259906, 4325442, 4390978, 4456514,
            4522050, 4587586, 3145795, 3211331, 3276867, 3342403, 3407939, 3473475, 3539011, 3604547,
            3670083, 3735619, 4259907, 4325443, 4390979, 4456515, 4522051, 4587587, 3145796, 3211332,
            3276868, 3342404, 3407940, 3473476, 3539012, 3604548, 3670084, 3735620, 4259908, 4325444,
            4390980, 4456516, 4522052, 4587588, 3145797, 3211333, 3276869, 3342405, 3407941, 3473477,
            3539013, 3604549, 3670085, 3735621, 4259909, 4325445, 4390981, 4456517, 4522053, 4587589,
            3145798, 3211334, 3276870, 3342406, 3407942, 3473478, 3539014, 3604550, 3670086, 3735622,
            4259910, 4325446, 4390982, 4456518, 4522054, 4587590
        };

        public static string ToHexString(byte[] source)
        {
            return ToHexString(source, false);
        }
        
        // hexIndicator: use prefix ("0x") or not
        public static string ToHexString(byte[] source, bool hexIndicator)
        {
            // freeze toHexTable position in memory
            fixed (int* hexRef = toHexTable)
            // freeze source position in memory
            fixed (byte* sourceRef = source)
            {
                // take first parsing position of source - allow inline pointer positioning
                byte* s = sourceRef;
                // calculate result length
                int resultLen = (source.Length << 1);
                // use prefix ("Ox")
                if (hexIndicator)
                    // adapt result length
                    resultLen += 2;
                // initialize result string with any character expect '\0'
                string result = new string(' ', resultLen);
                // take the first character address of result
                fixed (char* resultRef = result)
                {
                    // pairs of characters explain the endianess of toHexTable
                    // move on by pairs of characters (2 x 2 bytes) - allow inline pointer positioning
                    int* pair = (int*)resultRef;
                    // use prefix ("Ox") ?
                    if (hexIndicator)
                        // set first pair value
                        *pair++ = 7864368;
                    // more to go
                    while (*pair != 0)
                        // set the value of the current pair and move to next pair and source byte
                        *pair++ = hexRef[*s++];
                    return result;
                }
            }
        }
        
        // values for '\0' to 'f' where 255 indicates invalid input character
        // starting from '\0' and not from '0' costs 48 bytes
        // but results 0 subtructions and less if conditions
        static readonly byte[] fromHexTable = new byte[] {
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 0, 1,
            2, 3, 4, 5, 6, 7, 8, 9, 255, 255,
            255, 255, 255, 255, 255, 10, 11, 12, 13, 14, 
            15, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 10, 11, 12,
            13, 14, 15
        };

        // same as above but valid values are multiplied by 16
        static readonly byte[] fromHexTable16 = new byte[] {
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 0, 16,
            32, 48, 64, 80, 96, 112, 128, 144, 255, 255,
            255, 255, 255, 255, 255, 160, 176, 192, 208, 224, 
            240, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 160, 176, 192,
            208, 224, 240
        };

        public static byte[] FromHexString(string source)
        {
            // return an empty array in case of null or empty source
            if (string.IsNullOrEmpty(source))
                return new byte[0]; // you may change it to return null
            if (source.Length % 2 == 1) // source length must be even
                throw new ArgumentException();
            int
                index = 0, // start position for parsing source
                len = source.Length >> 1; // initial length of result
            // take the first character address of source
            fixed (char* sourceRef = source)
            {
                if (*(int*)sourceRef == 7864368) // source starts with "0x"
                {
                    if (source.Length == 2) // source must not be just a "0x")
                        throw new ArgumentException();
                    index += 2; // start position (bypass "0x")
                    len -= 1; // result length (exclude "0x")
                }
                byte add = 0; // keeps a fromHexTable value
                byte[] result = new byte[len]; // initialization of result for known length
                // freeze fromHexTable16 position in memory
                fixed (byte* hiRef = fromHexTable16)
                // freeze fromHexTable position in memory
                fixed (byte* lowRef = fromHexTable)
                // take the first byte address of result
                fixed (byte* resultRef = result)
                {
                    // take first parsing position of source - allow inremental memory position
                    char* s = (char*)&sourceRef[index];
                    // take first byte position of result - allow incremental memory position
                    byte* r = resultRef;
                    // source has more characters to parse
                    while (*s != 0)
                    {
                        // check for non valid characters in pairs
                        // you may split it if you don't like its readbility
                        if (
                            // check for character > 'f'
                            *s > 102 ||
                            // assign source value to current result position and increment source position
                            // and check if is a valid character
                            (*r = hiRef[*s++]) == 255 ||
                            // check for character > 'f'
                            *s > 102 ||
                            // assign source value to "add" parameter and increment source position
                            // and check if is a valid character
                            (add = lowRef[*s++]) == 255
                            )
                            throw new ArgumentException();
                        // set final value of current result byte and move pointer to next byte
                        *r++ += add;
                    }
                    return result;
                }
            }
        }
        #endregion
    }
}

Performance  

CLR provides a method for generating a hex string from a byte array that I’ve met in many sources:

C#
string hex = BitConverter.ToString(myByteArray).Replace("-", "");

This is probably the worst choice performance wise. Anyway; my implementation is more than 10 times (10x or 1000%) faster and consumes 5 times less memory.

I was looking for a very recent published implementation of hex string to byte array and I found this one at MSDN blogs. I know that this is not the best implementation ever, but it’s one of the best samples for what I've written above. This time, my implementation is about 5 times (5x or 500%) faster.

The code (console app) I used for performance testing is the following:

C#
// a byte array from a file (downloaded from http://hystad.com/words)
byte[] exp = File.ReadAllBytes(@"F:\words.txt");
System.Diagnostics.Stopwatch clock;
long memory = 0;
 
// repeat the test 10 times (I actually repeat it 1000 times)
for (int n = 0; n < 10; n++)
{
	// test for my implementation of toHexString
	clock = Stopwatch.StartNew(); // syntax compatible with older CLR versions
	memory = GC.GetTotalMemory(true);
	string s1 = DRDigit.Fast.ToHexString(exp, false);
	clock.Stop();
	memory = GC.GetTotalMemory(false) - memory;
	Console.Write("{0} [{1}] vs ", clock.Elapsed.TotalMilliseconds, memory);
 
	// test for bit converter
	clock = Stopwatch.StartNew();
	m = GC.GetTotalMemory(true);
	string s2 = BitConverter.ToString(exp).Replace("-", "");
	clock.Stop();
	memory = GC.GetTotalMemory(false) - memory;
	Console.WriteLine("{0} [{1}] -> {2} ", 
	        clock.Elapsed.TotalMilliseconds, memory, s1 == s2);


	// test for my implementation of fromHexString
	clock = Stopwatch.StartNew();
	byte[] b1 = DRDigit.Fast.FromHexString(s1);
	clock.Stop();
	Console.Write("fromHex: {0} vs ", clock.Elapsed.TotalMilliseconds);

	// test for MSDN blog peeked implementation
	clock = Stopwatch.StartNew();
	byte[] b2 = Utils.ConvertToByteArray(s1);
	clock.Stop(); Console.WriteLine(clock.Elapsed.TotalMilliseconds);

	Console.WriteLine("");	
}
Console.ReadLine();

These are the results of running the above code in release mode:

toHex
memory consumption
toHex
execution time in ms
fromHex
execution time in ms
toHexStringBitConverter toHexStringBitConverter fromHexStringMSDN blog peeked
implementation
9,956,99239,795,168 53.5169266.2852 22.757392.5146
9,948,80039,795,168 14.0968254.9983 18.237586.3444
9,948,80039,795,168 21.0198274.6193 18.1744110.9573
9,948,80039,795,168 21.1050275.9161 18.1831122.8697
9,948,80039,795,168 19.6050244.4094 16.7560 110.7480 
9,948,80039,795,168 21.0470275.6667 18.245290.8027
9,948,80039,795,168 20.9721275.5815 18.2375110.6746
9,948,80039,795,168 20.8356276.2732 18.1651122.4027
9,948,80039,795,168 21.2687273.4108 18.188790.1535
9,948,80039,795,168 15.5049254.4780 18.171886.2561

I'm not so sure that I'm allowed to paste the code of BitConverter.ToString as a result of decompiling using reflector. What I can do is to describe what it does affecting the performance. 

Assume a byte array filled with 100 zeros. This is translated to 100 divisions (0 / 16), 100 modulos (0 % 16) and 200 conversions of 0 to '0' requiring 200 if conditions (0 < 10) and 200 summations (0 + 48 to generate char '0').

Using the code  

This is what you expected, isn't it?

C#
// byte array to hex string with "0x" prefix 
string myHexString = DRDigit.Fast.ToHexString(myByteArray, true);  
 
// byte array to hex string without "0x" prefix 
string myHexString = DRDigit.Fast.ToHexString(myByteArray);  
 
// hex string to byte array
byte[] myByteArray = DRDigit.Fast.FromHexString(myHexString);  

License

This article, along with any associated source code and files, is licensed under The Microsoft Public License (Ms-PL)



Comments and Discussions

 
GeneralMy vote of 5 Pin
Ragheed Al-Tayeb28-Sep-21 2:14
Ragheed Al-Tayeb28-Sep-21 2:14 
GeneralMy vote of 4 Pin
X t r eme r t X24-Jan-17 3:01
professionalX t r eme r t X24-Jan-17 3:01 
QuestionGreat Pin
irneb27-Oct-16 21:12
irneb27-Oct-16 21:12 
QuestionMy vote of 5 Pin
Zhou Tony10-Jul-15 23:36
Zhou Tony10-Jul-15 23:36 
QuestionBenchmark is wrong Pin
suq madiq28-Aug-12 7:38
suq madiq28-Aug-12 7:38 
AnswerRe: Benchmark is wrong Pin
the retired28-Aug-12 8:30
the retired28-Aug-12 8:30 
GeneralMy vote of 4 Pin
Andreas Gieriet27-Aug-12 17:56
professionalAndreas Gieriet27-Aug-12 17:56 
AnswerRe: My vote of 4 Pin
the retired27-Aug-12 21:07
the retired27-Aug-12 21:07 
GeneralRe: My vote of 4 Pin
Andreas Gieriet27-Aug-12 23:18
professionalAndreas Gieriet27-Aug-12 23:18 
GeneralMy vote of 3 Pin
Ed Nutting27-Aug-12 10:24
Ed Nutting27-Aug-12 10:24 
QuestionHow this is faster? Pin
pramod.hegde27-Aug-12 9:07
professionalpramod.hegde27-Aug-12 9:07 
AnswerRe: How this is faster? Pin
the retired27-Aug-12 9:38
the retired27-Aug-12 9:38 
SuggestionRe: How this is faster? Pin
Ed Nutting27-Aug-12 10:24
Ed Nutting27-Aug-12 10:24 
Hi there,

From the outset I'll agree with everything you say except the following bit of your argument:

"...for anyone who has seen the implementation of BitConverter.ToString (using reflector) [it] is more than obvious that my implementation is by far faster..."

That doesn't wash. CP is for teaching people and you cannot assume that someone reading this has seen the inside workings of the .Net BitConverter - I've written my own networking library using it and never seen inside it, why should anyone else!? You cannot just assume knowledge that is the crux of your statement (that your code is faster). Please update your tip/trick to include at least a link to the code that is behind the BitConverter class (if that's possible. If not a sample of it would be good).

After that issue, it's more about what your article doesn't say that bothers me.

Overall, your tip/trick is good, but you probably should have posted the above requested info and run your own tests and posted the results of those tests. In the really good articles/tip/tricks that I've seen on CP, that having anything to do with performance, the author has done their own extensive tests and posted a table of results. It is also notable that you haven't included any comparison to other versions of this operation that are available on the internet.

My final word is that nowhere in your tip/trick did you mention that your comparison was with the BitConverter class. It was only in reading your message above that I noticed that that was what you were claiming to be faster than. Your code looks good, your article doesn't look as good though. Please improve this.

For now I give you a vote of 3. If the article improves, message me back and I'll up-vote probably to a 5 Smile | :)
Ed
AnswerRe: How this is faster? Pin
the retired27-Aug-12 10:37
the retired27-Aug-12 10:37 
GeneralRe: How this is faster? Pin
Andreas Gieriet27-Aug-12 10:51
professionalAndreas Gieriet27-Aug-12 10:51 
GeneralRe: How this is faster? Pin
Ed Nutting27-Aug-12 10:53
Ed Nutting27-Aug-12 10:53 
AnswerRe: How this is faster? Pin
the retired27-Aug-12 12:51
the retired27-Aug-12 12:51 
GeneralRe: How this is faster? Pin
Ragheed Al-Tayeb28-Sep-21 2:08
Ragheed Al-Tayeb28-Sep-21 2:08 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.