I want to decompile a .dll written in C.
Is there any tool available for this purpose?
Comments
enhzflep 24-Oct-13 7:42am    
There is at least one tool I can think of, however it's rather expensive - Hex-Rays' IDA Pro.
https://hex-rays.com/products/decompiler/ The decompiler used to be a plugin for IDA; the last time I used it, it was included. Now, I'm unsure.

Anything that will decompile an exe would, I expect, also handle a dll.
pasztorpisti 25-Oct-13 6:09am    
To my knowledge that is the only suitable tool. OP should know that decompiling an optimized binary with lots of inlined functions can produce code that looks very different from the original source (even though it does the same thing). But at least it will be much more readable than an assembly listing.
enhzflep 25-Oct-13 6:22am    
Thanks, I figured it likely was. There's got to be a reason for its expense and the total lack of expense (i.e. FOSS or freeware) others have.

Great points you make regarding optimized binaries and, particularly, inlined functions. My virtual vote of +5. The discussion would certainly be incomplete without their addition. Arguably, those points are more important than which tool to try to use (and certainly harder to uncover in a Google search than Hex-Rays is). :two thumbs-up:
pasztorpisti 25-Oct-13 6:26am    
Thank you! Let's share the virtual five, the tip is yours after all! :-)

EDIT: And we forgot to mention that a 3rd party module can have antidebug/compressor protection so the user may have to use IDA to fight against these before decompiling...
enhzflep 25-Oct-13 6:39am    
Done! It's a deal. You take 3, I'll have 2.
Anyone feel like Armadillo or nanomites? :laugh:

I'm off home. I'm as spent as an amusement-park token.

1 solution

It's not possible to decompile a binary. You can, however, use objdump from GNU Binutils (the toolset that ships alongside GCC in distributions such as MinGW) to disassemble it.

objdump -D yourdll.dll > code.s
 
Comments
enhzflep 24-Oct-13 7:35am    
Nonsense!
Ghosuwa Wogomon 24-Oct-13 13:01pm    
No, it's not, for several reasons.

1. In assembly, arguments are pushed onto the stack by address to be passed to functions. There is no way to detect the type, size, or name of any of these arguments.

2. The above also applies to any variable in general. You have no way of knowing if esi is an integer, string, struct, function pointer, etc.

3. Even if you could get 1 and 2 out of the way (which you can't), compiler optimization would then get in your way. E.g.
int i = 0, j = 0;
might be compiled to
xor ebx, ebx
xor ecx, ecx
pinning 'i' and 'j' to 'ebx' and 'ecx' because they're commonly used, and there'd be no way for a decompiler to know whether those are actual variables. Another example:
for (int i = 0; i < 3; i++)
j++;
Since this loop only has 3 iterations, the compiler might unroll it and fold the result, so instead of
xor ebx, ebx ; i = 0
.forLoop:
cmp ebx, 3 ; i < 3 ?
jge .endForLoop
inc ecx ; j++
inc ebx ; i++
jmp .forLoop
.endForLoop:
it would become
add ecx, 3
There would be no way at all of knowing that the above was originally a for loop.

-----------

So, I reiterate, decompiling = impossible; disassembling = possible.
enhzflep 24-Oct-13 13:56pm    
I take it you've not used the tool I left a link to earlier?
I understand and am conversant with all of the points you mention. In fact, I learnt a huge amount 20 years ago equipped only with TASM (Turbo Assembler), TD (Turbo Debugger), Sourcer (a disassembler) and anything my friend would bring me that he'd downloaded from his BBS.

1. You know the address of a variable that's been passed onto the stack. You can then examine the way that said address is used and from that, infer the type of data. It's not bomb-proof, but it works in the stunning majority of cases I've encountered. The remainder of cases simply need the decomp engine to be provided with a few hints - hints you can give it from reading the disasm.

2. See answer #1

3. If you unroll a loop, you repeat the same instructions N times, where N is the level of loop unrolling. A compiler has no trouble unrolling it. I'm curious as to why you think that a person/program couldn't re-roll a loop.
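
To make the re-rolling point concrete, here is a minimal sketch in portable C. The function names are hypothetical and not taken from any tool; it only shows that the loop from the comment above, its unrolled form, and its folded form are the same computation, so a decompiler can legitimately print any of them.

```c
#include <assert.h>

/* The loop as written in the source under discussion. */
static int rolled(int j)
{
    for (int i = 0; i < 3; i++)
        j++;
    return j;
}

/* What unrolling produces: the loop body repeated three times. */
static int unrolled(int j)
{
    j++;
    j++;
    j++;
    return j;
}

/* What constant folding reduces the unrolled body to. */
static int folded(int j)
{
    return j + 3;
}

/* Asserts that all three forms agree for every j in [lo, hi]. */
static void check_forms(int lo, int hi)
{
    for (int j = lo; j <= hi; j++) {
        assert(rolled(j) == unrolled(j));
        assert(unrolled(j) == folded(j));
    }
}
```

Calling check_forms(-1000, 1000) passes silently. Which of the three forms a decompiler prints is a presentation choice; the behaviour is the same.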


Anyway, here's a trivial example, using MinGW g++ 4.7.1 - Decompiled using Hex-Rays IDA Pro v 5.5.0.925t (32-bit) 2009 edition.

IDA was provided with precisely 2 pieces of information by me.
1) The DLL file
2) The fact that it was to analyse a DLL (rather than a PE executable, .ocx PE ActiveX control, PE/LE/NE Device Driver, COFF/OMF object file, COFF/OMF Static Library)

Furthermore, the DLL was compiled in release-mode, with the compiler option "Strip all symbols from binary (minimizes size) [-s]" checked.

--------------------------------------------------------------------------------------------
Input
--------------------------------------------------------------------------------------------
#include "main.h"
#include <stdio.h>

// a sample exported function
void DLL_EXPORT SomeFunction(const LPCSTR sometext)
{
    MessageBoxA(0, sometext, "DLL Message", MB_OK | MB_ICONINFORMATION);
}

extern "C" DLL_EXPORT BOOL APIENTRY DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpvReserved)
{
    switch (fdwReason)
    {
        case DLL_PROCESS_ATTACH:
            // attach to process
            // return FALSE to fail DLL load
            break;

        case DLL_PROCESS_DETACH:
            // detach from process
            break;

        case DLL_THREAD_ATTACH:
            // attach to thread
            break;

        case DLL_THREAD_DETACH:
            // detach from thread
            break;
    }
    return TRUE; // successful
}

extern "C" DLL_EXPORT void ShowMouseLoc(HDC hdc, LPARAM lParam)
{
    char str[80];

    wsprintf(str, "Button is down at %d, %d",
             LOWORD(lParam), HIWORD(lParam));
    TextOut(hdc, LOWORD(lParam), HIWORD(lParam),
            str, strlen(str));
}

extern "C" DLL_EXPORT void ShowMouseRoll(HDC hdc, WPARAM wParam)
{
    char str[80];
    signed short Roll;

    Roll = HIWORD(wParam);

    sprintf(str, "Mouse Rolled: %4i", Roll);
    TextOut(hdc, 0, 0, str, strlen(str));
}

--------------------------------------------------------------------------------------------
Output
--------------------------------------------------------------------------------------------
signed int __stdcall DllMain(int a1, int a2, int a3)
{
    return 1;
}

BOOL __cdecl ShowMouseRoll(HDC a1, int a2)
{
    char v3; // [sp+20h] [bp-5Ch]@1

    sprintf(&v3, "Mouse Rolled: %4i", SHIWORD(a2));
    return TextOutA(a1, 0, 0, &v3, strlen(&v3) - 1);
}

BOOL __cdecl ShowMouseLoc(HDC a1, unsigned int a2)
{
    char v3; // [sp+20h] [bp-6Ch]@1

    wsprintfA(&v3, "Button is down at %d, %d", (unsigned __int16)a2, a2 >> 16);
    return TextOutA(a1, (unsigned __int16)a2, a2 >> 16, &v3, strlen(&v3) - 1);
}

int __cdecl SomeFunction(const CHAR *a1)
{
    return MessageBoxA(0, a1, "DLL Message", 0x40u);
}


Perhaps you and I have different definitions of decompile? Certainly, the code doesn't retain all of the features of the initial source code. A perfect case in point is DllMain.
Ghosuwa Wogomon 24-Oct-13 18:10pm    
Looks like a fail to me. None of the functions have the correct return type, ShowMouseRoll and ShowMouseLoc are sprintf'ing to a char which will throw a memory exception error, and DllMain is just returning 1 instead of performing the conditional checking of the switch/case.

All of those were extremely simple functions, and yet your decompiler output complete trash that not only fails to do what the functions were originally intended to do, but will crash as well.

So I stick to my statement, decompiling is not possible.
enhzflep 24-Oct-13 19:41pm    
Indeed, the output is not a perfect facsimile of the original input - I never claimed it was. I'm not a perfect driver on the roads either, but that doesn't mean that I can't drive.

I suggest you look more closely at the two DllMain functions. Notice how in this example, the original doesn't actually do anything? So in essence, they are the same. The disassembly of each is identical too. Trivial examples almost always fail to properly indicate performance on real-world examples.

Any reasonable bounds-checker will catch the fact that we're trying to sprintf to a single char, rather than an array of them - so this mistake will be caught by all but the least competent of programmers.

A quick check of the disassembly of both the original version and the decompiled version shows them to be almost identical. After changing char v3; to char v3[80]; the disassembly is identical. As you're no doubt aware, the return value is found in EAX. So, whether or not the program uses this value, it is still actually returned. Syntactically perfect? Nope. Binary identical? Yup.
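
As a rough illustration of that one-line repair, here is a portable sketch of the decompiled ShowMouseRoll body with the buffer widened to char v3[80]. Everything here is an assumption made so it compiles without the Windows headers: original_roll and decompiled_roll are hypothetical stand-ins for HIWORD plus the cast to signed short, and for Hex-Rays' SHIWORD helper, and the TextOutA call is left out because it needs a device context.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the original source: Roll = HIWORD(wParam), stored in a
 * signed short. The narrowing cast assumes the usual two's complement
 * wrap-around, which mainstream compilers provide. */
static int original_roll(uint32_t wParam)
{
    uint16_t hi = (uint16_t)((wParam >> 16) & 0xFFFFu);
    return (int16_t)hi;
}

/* Stand-in for the decompiler's SHIWORD(a2): the high 16 bits, read as signed. */
static int decompiled_roll(int a2)
{
    return (int16_t)(((uint32_t)a2 >> 16) & 0xFFFFu);
}

/* The decompiled body with `char v3;` changed to `char v3[80];`.
 * Copies the text into dst and returns its length. */
static size_t format_roll(char *dst, size_t cap, int a2)
{
    char v3[80];

    snprintf(v3, sizeof v3, "Mouse Rolled: %4i", decompiled_roll(a2));
    snprintf(dst, cap, "%s", v3);
    return strlen(v3);
}

/* Asserts that the source expression and the decompiled one agree. */
static void check_roll(int a2)
{
    assert(original_roll((uint32_t)a2) == decompiled_roll(a2));
}
```

For example, format_roll(buf, sizeof buf, -120 * 65536) writes "Mouse Rolled: -120" into buf, and check_roll passes for the same argument. This only checks behaviour; it says nothing about whether the two versions compile to the same bytes.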

As for the return type of the functions, don't forget that sizeof(BOOL) == sizeof(int), so the fact that we're putting 1 into an int, rather than TRUE into a BOOL, doesn't change the binary output. It does muddy the water somewhat, but it is a perfectly functional equivalent - in some cases, it's actually better.
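
A small sketch of why the two signatures are interchangeable. The BOOL typedef and TRUE macro are re-declared locally as assumptions so the snippet builds without the Windows headers (the SDK defines BOOL as a typedef for int), and the function names are hypothetical.

```c
#include <assert.h>

/* Local stand-ins for the Windows SDK definitions. */
typedef int BOOL;
#define TRUE 1

/* Checked at compile time: BOOL is just an int under another name. */
_Static_assert(sizeof(BOOL) == sizeof(int), "BOOL and int differ in size");

/* Return statement as written in the original source. */
static BOOL as_written(void)
{
    return TRUE;
}

/* Return statement as printed by the decompiler. */
static signed int as_decompiled(void)
{
    return 1;
}

/* Asserts that a caller sees the same value either way. */
static void check_same(void)
{
    assert(as_written() == as_decompiled());
}
```

Since both functions hand back the same int value, a caller cannot tell them apart; only the spelling of the type differs.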

Take, for example, DialogProc - the dialog procedure for a dialog box. The common wisdom (& MS stipulation) is to make this function return a BOOL. It doesn't need to. If it returns an LRESULT, you actually get more functionality, not a reduction or an error in execution. A perfect example is handling a WM_CTLCOLORSTATIC message. Looking at the docs, one sees that handling the message should result in the function returning an HBRUSH. Yet, DialogProc 'should' return a BOOL.
This appears to indicate that handling this message should not be possible. (And indeed the compiler moans bitterly if you try to return an HBRUSH from a function marked as returning a BOOL.) However, if one instead tells the compiler that DialogProc returns an LRESULT, then guess what? It works as expected - Windows uses the returned HBRUSH to do the painting.

Anyway, thanks for the discussion. Ten years ago, my answer may have been more closely aligned with yours. However, some years of reverse-engineering experience have taught me that (!perfect) != (!possible).
As I said earlier, I think the largest difference in our positions comes from the working definition that we each have of "decompile". I can't throw a perfect 180 in darts, but I can still play successfully.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


