Continuing our antidebugging study, in this article we examine one of the modern antidebugging methods, which is based on nanomites. It is also an effective way to prevent process dumping.
This approach was first introduced in the Armadillo protector for Windows applications. Here we consider how it can be improved and implemented in a code protection solution for Linux applications.
The Armadillo Protector
Armadillo (also known as SoftwarePassport) is a commercial protector with a rich development history. It was first released on January 15, 1999. It was developed by Silicon Realms, with Chad Nelson as the lead developer.
One of the methods used by Armadillo is nanomites.
The Description of the Nanomites Implementation in Armadillo
Armadillo relies on protection by a parent process. On Windows, only one ring-3 debugger can be attached to a process at a time, and Armadillo exploits this. The protector creates a child process and attaches to it as a debugger. When the protected program is started under a debugger, only the parent process can be debugged. This technique is called Debug Blocker.
The developer of an application protected by Armadillo marks some code segments in the program sources. The compiler leaves these marks intact, which allows the protector to locate the code that requires protection. After that, the code is cut out of the program for packing. During unpacking, the cut-out code segments are obfuscated and written into allocated memory, with jumps to them left in their place.
The marks in the program are used to identify the procedures that must be “processed”. Processing means building a table of conditional and unconditional jumps. Before packing, each of these jumps is replaced with a 0xCC byte (the int3 instruction), and the remaining bytes of the jump are replaced with junk. What obstructs restoring the table is its “completeness”: before packing, every 0xCC byte already present in the code is also included in the table, with a non-existent jump assigned to it. For example, for a push 0CCh instruction, the address of its 0CCh byte is present in the table as well, and an attempt to restore this table element will corrupt the code.
While the code executes, debugging exceptions constantly occur at nanomites (int3 is basically a breakpoint). However, the parent process doesn’t restore the original bytes at all. It examines three tables: the table of flags, the table of sizes, and the table of offsets. The flags determine where control must be transferred: forward by the size value (no jump), or forward by the size value plus the offset value (the jump is taken).
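The table lookup described above can be sketched in C. This is a minimal illustration, not Armadillo's actual data layout: the names nm_tables and nm_resolve, the extra table of nanomite addresses, and the assumption that a flag entry encodes a jump condition checked against the ZF bit of EFLAGS are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical condition codes stored in the table of flags. */
enum nm_cond { NM_ALWAYS, NM_IF_ZERO, NM_IF_NOT_ZERO };

#define X86_ZF (1u << 6) /* zero flag bit in EFLAGS */

/* Assumed layout: parallel tables indexed by nanomite number. */
typedef struct {
    const uint32_t *addrs;   /* address of each 0xCC byte (assumption) */
    const uint8_t  *flags;   /* jump condition of the original instruction */
    const uint8_t  *sizes;   /* length of the original instruction */
    const int32_t  *offsets; /* relative displacement of the original jump */
    size_t count;
} nm_tables;

static int nm_taken(uint8_t cond, uint32_t eflags)
{
    switch (cond) {
    case NM_ALWAYS:      return 1;
    case NM_IF_ZERO:     return (eflags & X86_ZF) != 0;
    case NM_IF_NOT_ZERO: return (eflags & X86_ZF) == 0;
    default:             return 0;
    }
}

/* trap_eip is the instruction pointer reported after int3, which is one
   byte past the 0xCC. Returns 0 and stores the address to resume at, or
   -1 if the trap did not come from a known nanomite. */
int nm_resolve(const nm_tables *t, uint32_t trap_eip, uint32_t eflags,
               uint32_t *next_eip)
{
    assert(t != NULL && next_eip != NULL);

    uint32_t addr = trap_eip - 1;
    for (size_t i = 0; i < t->count; i++) {
        if (t->addrs[i] != addr)
            continue;
        uint32_t next = addr + t->sizes[i];      /* no jump: size only */
        if (nm_taken(t->flags[i], eflags))
            next += (uint32_t)t->offsets[i];     /* jump: size + offset */
        *next_eip = next;
        return 0;
    }
    return -1;
}
```

A debugger process would call such a function on every breakpoint exception and write the result back into the child's instruction pointer, so the original jump bytes never have to appear in the child's memory.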
This method is the most powerful weapon of the protector as there is no means of fighting nanomites that would be 100% reliable.
There are three main ways of fighting nanomites.
The first is to attach a debugging process of one’s own, or to enable VEH (vectored exception handling), and then to include the table of nanomites in that debugger’s code. Basically, the protection debugger is replaced with the cracker’s debugger. Researching such a program isn’t the most pleasant work, as junk bytes and the absence of jumps break the work of almost any analyzer.
The second way is to start the program under an unpacker or under a debugger embedded in the unpacked program. While the program is running, the hacker must use as many program features as possible and try out different combinations of program options. This way, the hacker can gradually collect information on the “real” nanomites and restore them. As it is unlikely that all nanomites will be restored this way, this method is combined with the first one.
The third way, which is the most difficult to implement, is an analyzer. An analyzer must “execute” the code and go through all its branches, and when it encounters nanomites, it must restore them according to the table. The complication here is the processing of switch tables, table calls (for example, class method calls and high-level language constructor calls look like call ds:[edx+410h] at a low level), and API functions with
The Difficulties of Reversing a Product Protected by Nanomites
- The Debug Blocker approach, like the one in Armadillo. We cannot attach a debugger to a process that is already being debugged at ring 3. The only option is ring-0 debugging, and this isn’t very convenient. Detaching the parent process is possible only after restoring all nanomites.
[Figure: the result of an attempt to attach a debugger to a process protected by nanomites]
- Due to the absence of jumps, the application turns into one solid piece of code in disassemblers (e.g., IDA Pro), and this prevents any analysis.
[Figure: the protected code as shown in a disassembler]
- The major disadvantage is performance. Each time a process encounters a nanomite, a context switch occurs and control is transferred to the parent process. This procedure is very slow, so it is best to use nanomites in key code segments that are not performance-critical.
- This mechanism may conflict with other antidebugging methods.
- Implementing this mechanism requires changes to the makefiles of the protected project, and this is not always convenient.
Linux Project Description
The project was written for 32-bit Linux applications, but the principles can easily be applied to other operating systems, so further development is planned.
First, we will take a look at creating a custom debugger for Linux. After that, we will move on to the implementation of nanomites. Binutils and Perl are used for the compilation of the project.
We apply the combination of two techniques: Nanomites and Debug Blocker.
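The single-debugger rule behind Debug Blocker also holds for ptrace on Linux: a process that already has a tracer rejects a second attach. The sketch below demonstrates this under stated assumptions. The function name nm_second_attach_errno is hypothetical, and for brevity the parent itself plays the role of the second debugger.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the errno of a second attach attempt against a child that is
   already traced, 0 if the attach unexpectedly succeeded, or -1 if the
   setup itself failed. */
int nm_second_attach_errno(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: ask to be traced by the parent, then stop. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);
        _exit(0);
    }

    int status;
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;
    assert(WSTOPSIG(status) == SIGSTOP);

    /* The child already has a tracer, so this attach is refused. */
    errno = 0;
    long rc = ptrace(PTRACE_ATTACH, child, NULL, NULL);
    int result = (rc == -1) ? errno : 0;

    /* Clean up the stopped child. */
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return result;
}
```

On a typical Linux system the function returns EPERM, which is the same error an external debugger receives when it tries to attach to the protected child.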
Linux Code Protection includes two main components:
- Nanomites: a static library that contains the debugger process logic.
- Nanomites Debugger: a debugger executable file that is compiled with the Nanomites library.
There’s also a script collection for adding the nanomites to an application and for creating nanomites tables.
Protected Application Creation Sequence
- The application is compiled with the -S key to produce an assembler listing.
- The assembler listing is analyzed with a Perl script. All jump and call instructions (e.g., call) are processed and replaced with instructionOffsetLabel(N): int 3.
- After that, the user application, which now consists of the modified assembler listings, is compiled.
- With the help of a Perl script, the compiled application is parsed and the table of nanomites is built.
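The listing rewrite from the second step can be illustrated with a short C sketch. This is not the project's Perl script: the function nm_rewrite_line is hypothetical, it assumes one lowercase instruction per line as a compiler typically emits, and it writes int3 as a single mnemonic.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* If the listing line holds a jump or call instruction, writes
   "instructionOffsetLabel<n>: int3" into out and returns 1.
   Otherwise copies the line unchanged and returns 0. */
int nm_rewrite_line(const char *line, unsigned n, char *out, size_t out_size)
{
    assert(line != NULL && out != NULL && out_size > 0);

    /* Skip leading whitespace to reach the first token. */
    const char *p = line;
    while (*p != '\0' && isspace((unsigned char)*p))
        p++;

    /* Measure the first token: a mnemonic, a directive, or a label. */
    size_t len = 0;
    while (p[len] != '\0' && !isspace((unsigned char)p[len]))
        len++;

    int is_label = len > 0 && p[len - 1] == ':';
    int is_jump = len > 1 && p[0] == 'j';   /* jmp, jne, jz, ... */
    int is_call = len >= 4 && strncmp(p, "call", 4) == 0;

    if (!is_label && (is_jump || is_call)) {
        snprintf(out, out_size, "instructionOffsetLabel%u: int3", n);
        return 1;
    }
    snprintf(out, out_size, "%s", line);
    return 0;
}
```

The label in front of each int3 is what makes the last step possible: once the application is compiled, the label addresses can be read back from the binary to build the table of nanomites. The original instruction and its operand would have to be recorded at rewrite time, which this sketch leaves out.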
Debugger Library Description
Our debugger is based on the ptrace (process trace) system call, which exists in some Unix-like systems (including Linux, FreeBSD, Mac OS X). It allows tracing or debugging a selected process. We can say that ptrace provides full control over a process: we may change the application execution flow, and display or change values in memory and in registers. It should be mentioned that ptrace gives us no additional permissions: the possible actions are limited by the permissions of the started process. Moreover, when a program with the setuid bit is traced, this bit has no effect, as the privileges are not escalated.
After the demo application is processed with the scripts, it is no longer independent: if it is started without the debugger, a “segmentation fault” appears at once. From now on, the debugger starts the demo application. For this purpose, the debugger creates a child process, and the parent process attaches to it. All debugging events from the child process are handled in a loop. This includes all jump events: the parent process consults the nanomite table and the flag table to perform the correct action.
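The event loop described above can be sketched with ptrace as follows. This is a simplified illustration, not the project's debugger: the names nm_trap and nm_debug_loop_demo are hypothetical, the child runs in the same program instead of being started as a separate executable, and the parent only swallows each trap, whereas a real nanomite debugger would also rewrite the child's instruction pointer from its tables.

```c
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a nanomite: a breakpoint that kills an untraced process. */
static void nm_trap(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile("int3");
#else
    raise(SIGTRAP);
#endif
}

/* Runs a child containing two traps under a tracing parent. Returns the
   number of traps handled, or -1 on error. The child's exit code is
   stored in *exit_code (-1 if it was killed by a signal). */
int nm_debug_loop_demo(int *exit_code)
{
    assert(exit_code != NULL);

    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        nm_trap();
        nm_trap();
        _exit(42);
    }

    int traps = 0;
    for (;;) {
        int status;
        if (waitpid(child, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status)) {
            *exit_code = WEXITSTATUS(status);
            return traps;
        }
        if (WIFSIGNALED(status)) {
            *exit_code = -1;
            return traps;
        }
        if (WIFSTOPPED(status)) {
            int sig = WSTOPSIG(status);
            if (sig == SIGTRAP) {
                /* A real debugger would look up the nanomite here and
                   set the new instruction pointer with PTRACE_SETREGS. */
                traps++;
                sig = 0; /* do not deliver the trap to the child */
            }
            if (ptrace(PTRACE_CONT, child, NULL, (void *)(long)sig) == -1)
                return -1;
        }
    }
}
```

Without the tracing parent, the first trap would terminate the child. With it, every trap is intercepted and suppressed, and the child runs to its normal exit.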
The Advantages of the Solution Compared to Armadillo
In Armadillo, the binary code is modified. That’s why, when a jump instruction 2-5 bytes long is replaced with the shorter 1-byte int 3 (0xcc) instruction, some free space remains. Correspondingly, to restore a nanomite, we need to write the original jump instruction over the int 3.
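The byte-level picture can be made concrete with a small sketch. The names nm_saved_jump, nm_plant, and nm_restore are hypothetical, and the junk byte value is chosen by the caller. The code only illustrates why a binary-level nanomite leaves enough room for the original jump to be written back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NM_MAX_JUMP 6 /* generous upper bound for one x86 jump encoding */

typedef struct {
    size_t  offset;                 /* where the jump starts in the code */
    size_t  length;                 /* length of the original jump */
    uint8_t original[NM_MAX_JUMP];  /* saved original bytes */
} nm_saved_jump;

/* Replaces a jump of the given length with 0xCC followed by junk bytes
   and remembers the original bytes. */
void nm_plant(uint8_t *code, size_t offset, size_t length, uint8_t junk,
              nm_saved_jump *saved)
{
    assert(code != NULL && saved != NULL);
    assert(length >= 1 && length <= NM_MAX_JUMP);

    saved->offset = offset;
    saved->length = length;
    memcpy(saved->original, code + offset, length);

    code[offset] = 0xCC;                         /* int3 */
    memset(code + offset + 1, junk, length - 1); /* leftover space */
}

/* Writes the original jump back over the int3 and the junk after it. */
void nm_restore(uint8_t *code, const nm_saved_jump *saved)
{
    assert(code != NULL && saved != NULL);
    memcpy(code + saved->offset, saved->original, saved->length);
}
```

For a 5-byte jmp rel32, for example, nm_plant leaves one 0xCC byte and four junk bytes, so nm_restore has exactly the space it needs.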
In our approach, we change the code at the source level, so a nanomite is exactly 1 byte long. Correspondingly, we cannot restore the nanomite by writing the original instruction over it, and we cannot extend the code in the place of the nanomite, as all relative jumps would be broken. There is still a way to restore our nanomites, for example the following.
A Way to Recover Linux Nanomites
A hacker can create an additional section in the executable file, then find the nanomite and obtain its jump instruction and jump address.
Then the restoration goes as follows:
Such a solution is complex to implement. Firstly, a disassembler engine is required for automation. Secondly, the moved instructions may contain jump instructions with relative jumps, which will require corrections.
Learn more about the Linux Code Protection project at its official page.