Introduction
This happened to me the other day. I was fixing an app that was implemented
as a horribly tangled ball of pointers. The app was walking a linked list when
it fell over on bad data pointed to from within one of the nodes. The type of
object that we were supposed to point to was simply the virtual base class for
half the objects in the system. The first question I asked was: was there ever
an object there? The pointer it fell on was divisible by 4 and not NULL, so it
could have been a valid pointer at one time. Investigation with Visual Studio's
memory viewer (View->Debug Windows->Memory) showed that the data pointed to by
this pointer was filled with FE EE FE EE FE EE... This usually indicates memory
that was once allocated but has since been freed. Something, somewhere, had
deallocated my data. I needed a way to figure out what had happened to it.
Background
I ultimately ended up finding my lost data by overloading the global new and delete operators. When a function is
called on 32-bit x86, the arguments are pushed onto the stack first, and then
the return address is pushed on top of them. The new and delete operators can
extract this return address from the stack and record it to help in debugging.
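If you would rather not compute the stack slot by hand, most compilers can hand you the return address directly. The sketch below is an alternative to the stack arithmetic used in this article, not the method the article's listing uses. It relies on the MSVC intrinsic _ReturnAddress and the GCC/Clang builtin __builtin_return_address; the macro CALLER_ADDRESS and the function whoCalledMe are hypothetical names made up for illustration.

```cpp
#if defined(_MSC_VER)
#include <intrin.h>
#pragma intrinsic(_ReturnAddress)
// MSVC intrinsic: the address execution resumes at in the caller.
#define CALLER_ADDRESS() _ReturnAddress()
#define NO_INLINE __declspec(noinline)
#else
// GCC and Clang builtin; level 0 is the return address of the current function.
#define CALLER_ADDRESS() __builtin_return_address(0)
#define NO_INLINE __attribute__((noinline))
#endif

// Hypothetical helper: returns the address it will return to.
// It is kept out of line so that it has a real call frame and
// therefore a return address of its own.
NO_INLINE void *whoCalledMe()
{
    return CALLER_ADDRESS();
}
```

Inside a replacement operator new or delete, the same macro would take the place of the midpoint calculation, at the cost of tying the code to these particular compilers.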
Using the code
After several wrong guesses about where my pointer went, I resorted to
overloading the operators new and delete, as shown in the listing at the end
of this section. This implementation of the new operator extracts
the return address from the stack. The return address is found between the
address of the function argument and the address of the first automatic
variable. In a typical unoptimized 32-bit frame the local sits at EBP-4 and the
argument at EBP+8, so the midpoint of the two plus two bytes lands on EBP+4,
where the return address is stored. The compiler settings, calling convention,
and machine architecture may affect where the return address is found, so you
may need to tweak this slightly for your environment. Once it has the return
address, new allocates an extra sixteen bytes, stores the return address and
the requested size at the front of the buffer, and returns a pointer to the
byte just past this sixteen-byte header.
The delete operator, as you can see, no longer
deletes. Instead, it extracts its own return address in the same fashion, stores
it in the header right after the size, writes DE AD BE EF into the last four
bytes of the header, then fills the rest of the buffer with a repeating 77
pattern.
Now, when the app falls over in the debugger on the bad pointer, I simply
open the memory window, find where my pointer points, and go back 16 bytes. The
first four bytes are where new was called from. The
next four bytes are the allocated size. The third group of four bytes is where
delete was called from. The last group of four bytes should read
DE AD BE EF, followed by the rest of the allocated buffer filled
with 77 77 77 77.
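The header layout just described can also be decoded in code rather than by eye. This is a minimal sketch, assuming the 32-bit build the article targets (4-byte pointers and a 4-byte size_t). The names AllocHeader, readHeader, and wasDeleted are hypothetical and do not appear in the article's listing.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical view of the 16 bytes in front of each buffer handed out by
// the debugging operator new (32-bit layout assumed).
struct AllocHeader {
    std::uint32_t newCaller;     // bytes 0-3: return address recorded by new
    std::uint32_t size;          // bytes 4-7: size requested from new
    std::uint32_t deleteCaller;  // bytes 8-11: return address recorded by delete, 0 while live
    std::uint32_t marker;        // bytes 12-15: DE AD BE EF once deleted, 0 while live
};
static_assert(sizeof(AllocHeader) == 16, "header must match the 16-byte layout");

// Copies the header that sits 16 bytes before a pointer returned by new.
AllocHeader readHeader(const void *userPtr)
{
    AllocHeader h;
    std::memcpy(&h, static_cast<const unsigned char *>(userPtr) - 16, sizeof(h));
    return h;
}

// True if delete has stamped the block. The marker is compared byte by byte,
// in the same order the debugging operator delete writes it.
bool wasDeleted(const AllocHeader &h)
{
    static const unsigned char tag[4] = { 0xDE, 0xAD, 0xBE, 0xEF };
    return std::memcmp(&h.marker, tag, sizeof(tag)) == 0;
}
```

Reading the fields through memcpy into integers also takes care of the byte order, so newCaller and deleteCaller come out as ordinary addresses.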
To map these return addresses for new and
delete back to points in the source code, first reverse their byte
order. This is necessary because x86 processors are little-endian: they store
the least significant byte first. Next,
right click on the source and select Go To Disassembly. The leftmost column
contains the memory address of each machine instruction. Press Ctrl-G or select
(Edit->Go To...) and type in one of your extracted addresses. It should then
scroll to the instruction immediately after the call to new or delete, since
that is where the call returns to. To get back
to the source file, right click again, and select Go To Source. You should then
see a call to new or delete.
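The byte reversal is easy to get wrong by hand, so here is a small sketch of it. The function name addressFromMemoryBytes is hypothetical; it takes four bytes in the order the memory window displays them and assembles the 32-bit address they represent.

```cpp
#include <cstdint>

// Assembles four bytes, given in the order they appear in memory on a
// little-endian machine, into the 32-bit value they encode.
std::uint32_t addressFromMemoryBytes(const unsigned char bytes[4])
{
    return  static_cast<std::uint32_t>(bytes[0])
         | (static_cast<std::uint32_t>(bytes[1]) << 8)
         | (static_cast<std::uint32_t>(bytes[2]) << 16)
         | (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

For example, if the memory window shows 34 12 40 00, the address to type into Go To is 0x00401234.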
Now you can quickly figure out where your lost data went. As for figuring out
why delete was called on your data when you
still needed it, well, you're on your own.
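#include <malloc.h>
#include <stdlib.h>   // malloc
#include <string.h>   // memcpy, memset

// Debug-only replacements for 32-bit x86 builds (4-byte pointers and size_t).
// Header layout in front of each user buffer:
//   bytes 0-3    return address of the call to new
//   bytes 4-7    requested size
//   bytes 8-11   return address of the call to delete
//   bytes 12-15  DE AD BE EF once deleted
void *operator new(size_t size)
{
    int stackVar;
    unsigned long stackVarAddr = (unsigned long)&stackVar;
    unsigned long argAddr = (unsigned long)&size;
    // The return address sits between the first local and the argument.
    void **retAddrAddr = (void **)(stackVarAddr/2 + argAddr/2 + 2);
    void *retAddr = *retAddrAddr;
    unsigned char *retBuffer = (unsigned char *)malloc(size + 16);
    memset(retBuffer, 0, 16);
    memcpy(retBuffer, &retAddr, sizeof(retAddr));
    memcpy(retBuffer + 4, &size, sizeof(size));
    return retBuffer + 16;
}

void operator delete(void *buf)
{
    int stackVar;
    if (!buf)
        return;
    unsigned long stackVarAddr = (unsigned long)&stackVar;
    unsigned long argAddr = (unsigned long)&buf;
    void **retAddrAddr = (void **)(stackVarAddr/2 + argAddr/2 + 2);
    void *retAddr = *retAddrAddr;
    unsigned char *buf2 = (unsigned char *)buf;
    buf2 -= 8;                          // header bytes 8-11
    memcpy(buf2, &retAddr, sizeof(retAddr));
    size_t size;
    buf2 -= 4;                          // header bytes 4-7
    memcpy(&size, buf2, sizeof(size));
    buf2 += 8;                          // header bytes 12-15
    buf2[0] = 0xde;
    buf2[1] = 0xad;
    buf2[2] = 0xbe;
    buf2[3] = 0xef;
    buf2 += 4;                          // start of the user buffer
    memset(buf2, 0x77, size);           // the memory is deliberately never freed
}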
Points of Interest
This code can also be used to detect memory leaks. Simply fix the delete operator so that it actually deallocates the memory, by passing the start of the underlying buffer (16 bytes before the pointer it receives) to free.
Then, just before the app exits, use _heapwalk to step through the
blocks that are still allocated and extract the addresses where new was called.
This will give you a list of calls to new which have not been
matched with a call to delete.
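_heapwalk is specific to the Microsoft C runtime. Where it is not available, one alternative is to have the allocator keep its own list of live blocks. The sketch below shows that alternative, not the _heapwalk approach itself. It uses a different header from the 16-byte one above, and LiveBlock, trackedAlloc, trackedFree, and reportLeaks are hypothetical names. The caller address is passed in explicitly here; in a real operator new it would be the extracted return address.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical header that links every live block into a circular list.
struct LiveBlock {
    LiveBlock  *prev;
    LiveBlock  *next;
    void       *caller;  // where the allocation was requested from
    std::size_t size;    // size requested by the caller
};

// Sentinel node: an empty list points back at itself.
static LiveBlock g_liveHead = { &g_liveHead, &g_liveHead, nullptr, 0 };

// Allocates size bytes and records the block at the front of the live list.
void *trackedAlloc(std::size_t size, void *caller)
{
    LiveBlock *b = static_cast<LiveBlock *>(std::malloc(sizeof(LiveBlock) + size));
    if (!b)
        return nullptr;
    b->caller = caller;
    b->size = size;
    b->prev = &g_liveHead;
    b->next = g_liveHead.next;
    g_liveHead.next->prev = b;
    g_liveHead.next = b;
    return b + 1;  // user data starts right after the header
}

// Unlinks the block from the live list and really frees it.
void trackedFree(void *p)
{
    if (!p)
        return;
    LiveBlock *b = static_cast<LiveBlock *>(p) - 1;
    b->prev->next = b->next;
    b->next->prev = b->prev;
    std::free(b);
}

// Prints every block that is still allocated and returns how many there are.
std::size_t reportLeaks()
{
    std::size_t count = 0;
    for (LiveBlock *b = g_liveHead.next; b != &g_liveHead; b = b->next) {
        std::printf("leak: %lu bytes allocated from %p\n",
                    static_cast<unsigned long>(b->size), b->caller);
        ++count;
    }
    return count;
}
```

Calling reportLeaks just before the app exits lists the allocation sites that never saw a matching free. The sketch is not thread-safe; a multithreaded app would need a lock around the list updates.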
Also, this code is for debugging only. If you put it into a production app,
it will run out of memory fairly quickly, because delete never frees anything.