Hello.

I'd like to search for specific hexadecimal numbers/values in a binary file. For example I want to search program.exe for "5c 6d 69 6e 67 77 33 32 5c" (just an example). If these values in the exact order they're in now are found in the binary, I want them put in a string so I can display them.

I currently don't have any code because it's quite confusing for me. I apologize for that.

Thanks in advance!
Comments
PIEBALDconsult 5-Apr-15 16:34pm    
There are no hex values in a binary file.
Member 11478018 5-Apr-15 16:54pm    
Could you explain yourself please?
Sergey Alexandrovich Kryukov 5-Apr-15 21:59pm    
I just tried to explain it for you in my answer. You should become very, very clear and confident on these very basic subjects, otherwise you cannot successfully learn computing.
—SA

1 solution

Yes, you are very much confused about the basics. No need to apologize; this has happened to many people in the past. But it is a really big misconception, and without getting rid of it you cannot seriously go forward with computing.

First of all, numbers cannot be hexadecimal or decimal. That is a property of the strings representing numbers, not of the numbers themselves. The numbers are all "binary". More exactly, most technology that uses numbers is kept agnostic to the exact computer representation of numbers, and this is really good, because it makes software more portable. Each time you assume a particular representation of numbers, you compromise portability.
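To make the point concrete, here is a minimal sketch in Python (chosen only for illustration; the variable names are made up). It writes one integer with three different literals and shows that "decimal" or "hexadecimal" only appears when the value is converted to a string:

```python
# One and the same integer, written with three different source-code literals.
a = 1234567890
b = 0x499602D2
c = 0b01001001100101100000001011010010

# All three literals produce the same number.
same_value = (a == b == c)

# "Decimal", "hexadecimal" and "binary" are properties of these strings only.
as_decimal = str(a)            # decimal digits
as_hex = format(a, "x")        # hexadecimal digits
as_binary = format(a, "032b")  # binary digits, padded to 32 characters

print(same_value, as_decimal, as_hex, as_binary)
```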

Nevertheless, there are many cases when you need to assume a particular binary representation of numbers. This is how signed integer values are represented (on almost all systems): http://en.wikipedia.org/wiki/Two%27s_complement (not as simple as you thought, probably, but it makes deep practical sense).
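As a rough illustration of two's complement, assuming a 32-bit width and little-endian byte order (both are assumptions for this sketch, and twos_complement_bits is a hypothetical helper), the following Python snippet shows the bit pattern of -1 and how the same four bytes read back as an unsigned value:

```python
def twos_complement_bits(value: int, width: int = 32) -> str:
    """Return the two's-complement bit pattern of value as a string of 0s and 1s."""
    mask = (1 << width) - 1
    # Masking a negative Python int yields its two's-complement pattern.
    return format(value & mask, "0{}b".format(width))

# -1 is "all bits set" in two's complement.
minus_one = twos_complement_bits(-1)

# The same value serialized to 4 bytes (little-endian assumed here).
minus_one_bytes = (-1).to_bytes(4, byteorder="little", signed=True)

# Reinterpreting the very same bytes as unsigned gives a different number.
unsigned_view = int.from_bytes(minus_one_bytes, byteorder="little", signed=False)

print(minus_one, minus_one_bytes.hex(), unsigned_view)
```

The bytes are identical in both cases; only the interpretation (signed or unsigned) differs.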

See also my past answer: "what is different between signed and unsigned in disassembly code?".

The representation of floating-point numbers is defined in the IEEE 754 standard:
http://en.wikipedia.org/wiki/Floating_point,
http://en.wikipedia.org/wiki/IEEE_floating_point.
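A small sketch, again in Python and assuming little-endian byte order, of why the type matters when you search for a number: the value 1 stored as a 32-bit IEEE 754 float and the value 1 stored as a 32-bit integer are completely different byte sequences.

```python
import struct

# 1.0 as a 32-bit IEEE 754 float, little-endian (byte order is an assumption).
one_as_float32 = struct.pack("<f", 1.0)

# 1 as a 32-bit signed integer, little-endian.
one_as_int32 = struct.pack("<i", 1)

# Bit pattern of the float: 1 sign bit, 8 exponent bits, 23 fraction bits.
float_bits = format(int.from_bytes(one_as_float32, "little"), "032b")

print(one_as_float32.hex(), one_as_int32.hex(), float_bits)
```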

Now, you probably tend to think wrongly about binary files. Some people think that, if the concept of a "binary file" exists, some other concepts should exist too, such as "text files", and some people I knew even fantasized about the existence of "decimal files" or "hexadecimal files". It could hardly be more wrong than that. Essentially, all files are "binary". There are no "non-binary" files. When someone says "binary file", it is not used as a strictly defined notion, but to convey the idea that string representation of numbers is not used. Typically, "binary file" means "not very readable by a human using just a text editor". Such files are written and read by programs other than customary text editors.

In other words, "hexadecimal numbers in a binary file" is an absurd notion. If the file is meant to be "binary", you are supposed to write numbers bit by bit, without converting them to characters/strings.

Let's consider one example. If you use a 32-bit int type, every object of this type occupies exactly 32 bits. Take some number, for example 1234567890. In memory, it will fill the following bits: 01001001100101100000001011010010 (don't confuse this with the string "01001001100101100000001011010010"; understand it as bits in memory, least significant bit on the right, most significant bit on the left). Its hexadecimal representation will be "499602d2". If you consider this representation as a string, then, reading from the right, it consists of bytes equal to 50 ('2'), then 100 ('d') or 68 ('D'), then 50 again, and so on. Note that the number of characters required for a string representation of a number may depend on the value, not only on the type: written in decimal, the code 100 of 'd' takes three characters and the code 50 of '2' takes two. In hexadecimal, exactly two digits per byte are used, which takes twice as much room in 1-byte-per-character encodings. In other words, string representations of numbers take considerably more memory.
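The difference between the number and its string form can be checked with a short Python sketch. The little-endian byte order is an assumption (it is what x86 machines use); the variable names are only for illustration.

```python
import struct

number = 1234567890

# The number itself as a 32-bit integer: exactly 4 bytes (little-endian assumed).
raw = struct.pack("<i", number)

# The hexadecimal *string* for the same number: 8 characters.
hex_text = format(number, "x")

# That string stored in ASCII: 8 bytes, one per character.
text_bytes = hex_text.encode("ascii")

# The character codes of '4', '9', '9', '6', '0', '2', 'd', '2'.
codes = list(text_bytes)

print(raw.hex(), hex_text, codes)
```

Searching a file for the bytes in raw and searching it for the bytes in text_bytes are two entirely different searches.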

The representation of characters was defined a long time ago by the ASCII standard, which is now mapped to a part of Unicode: http://en.wikipedia.org/wiki/ASCII.

Note that text files can use the ASCII or UTF-8 encodings, which take 2 bytes per byte represented in hexadecimal digits and up to 3 bytes per byte represented in decimal digits. But if UTF-16 is used (the internal representation of all characters in memory used by .NET is UTF-16LE), every digit character occupies 2 bytes, so a single byte represented as a string can take up to 6 bytes.
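These sizes are easy to verify. The following is a minimal Python sketch; the dictionary keys are just labels chosen for this example.

```python
hex_text = "499602d2"  # string form of a 4-byte number

sizes = {
    "raw int32": 4,                                  # the number itself
    "ascii": len(hex_text.encode("ascii")),          # 1 byte per digit
    "utf-8": len(hex_text.encode("utf-8")),          # also 1 byte per digit
    "utf-16-le": len(hex_text.encode("utf-16-le")),  # 2 bytes per digit
}

# Worst case for a single byte written in decimal: "255" needs 3 characters.
one_byte_decimal_utf16 = len("255".encode("utf-16-le"))

print(sizes, one_byte_decimal_utf16)
```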

See also:
http://en.wikipedia.org/wiki/Unicode,
http://www.unicode.org,
https://msdn.microsoft.com/en-us/library/9b1s4yhz%28v=vs.90%29.aspx.

To search for a number in a file, you need to know the size of the number. You need to convert the number into a sequence of bytes and then search for this sequence in the file. You need to understand that those bytes are not the code points of decimal or hexadecimal digits; they are the actual bytes of the binary representation of the number. The order of those bytes depends on the byte order (endianness) used by whatever wrote the file. You can perform this serialization using the class System.BitConverter:
https://msdn.microsoft.com/en-us/library/system.bitconverter%28v=vs.110%29.aspx.
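The search itself can be sketched as follows. This is Python rather than .NET, so it is only an illustration of the idea, not of System.BitConverter. The helpers find_all and search_file are hypothetical names, and the sketch assumes the file is small enough to read into memory at once. The pattern is the one from the question; a number would be converted to bytes first, for example with int.to_bytes.

```python
from pathlib import Path
from typing import List


def find_all(data: bytes, pattern: bytes) -> List[int]:
    """Return the offsets of every occurrence of pattern in data."""
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        # Continue one byte later so overlapping matches are found too.
        start = data.find(pattern, start + 1)
    return offsets


def search_file(path: str, hex_pattern: str) -> List[int]:
    """Search a file for the bytes written as hex digits in hex_pattern."""
    pattern = bytes.fromhex(hex_pattern)  # "5c 6d" -> b"\x5c\x6d"
    data = Path(path).read_bytes()        # whole file in memory (assumption)
    return find_all(data, pattern)


# The byte sequence from the question, and the same bytes viewed as ASCII text.
pattern = bytes.fromhex("5c 6d 69 6e 67 77 33 32 5c")
pattern_as_text = pattern.decode("ascii")
print(pattern_as_text)
```

A call such as search_file("program.exe", "5c 6d 69 6e 67 77 33 32 5c") would then return the list of offsets where the sequence occurs, or an empty list if it does not. For very large files, reading in chunks (with an overlap of len(pattern) - 1 bytes) would be the safer design.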

See also my past answer on binary I/O: vb.net binary file handling.

—SA
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


