I like virtual machines. I use them very often to run suspicious software, to run different operating systems, or to isolate programs from each other. One day I wanted to write a program which must run inside a VM and communicate with software placed outside the VM. I started to study possible ways to do this and found the following methods:
- Network connection. It is the most obvious way.
- Via files in shared folders.
- There is a special API for VMWare.
But I was not satisfied with any of these methods:
The 1st and the 2nd methods could be easily detected by rootkits and anti-viruses. Of course, if you know what to search for, you can detect many other things, but network activity and file I/O are the most obvious things to check.
The 3rd method is VM-dependent.
Also, the 2nd and the 3rd methods require guest additions to be installed, although this was not critical for me.
I wanted something which is very hard or even impossible to detect and which is not tied to any specific VM product. I did not find a ready-made solution, so I decided to build one myself.
How This Problem Could be Solved
So my goal was to create an undetectable communication engine between host and guest which is not tied to any particular VM. I should add that I develop for Windows, so I will talk about the implementation for this family of operating systems, but I think the idea will work for other OSes as well.
How can this goal be achieved? The first thing that came to my mind was that I can read and write the VM's memory. I can reserve a region of memory inside the virtual machine and perform communication via this region. Inside the guest, everything is rather simple: I just need to reserve a memory region. The most difficult part is to arrange access to this memory region from the host and to synchronize access properly. To implement this method successfully, you have to handle several technical issues, which I describe in this article.
What Should You Know to Successfully Solve the Problem?
There are several things which you should know to understand my solution to the problem. Here is the list:
- Knowledge of C/C++
- General knowledge of WinAPI and DLL injection
- How virtual memory works
- Understanding of multi-threaded programming and synchronization
There is plenty of information on the internet on the above mentioned topics. In addition, I can recommend the book “Windows via C/C++” by Jeffrey Richter and Christophe Nasarre.
When you become familiar with all these things, we can move to the first technical issue:
1. Where Can You Find VM’s Memory and How Can You Read it from the Host OS?
From the process point of view, many desktop VM solutions (VMWare Workstation, VirtualBox, VirtualPC, Parallels Desktop) are structured as follows:
- There are one or more GUI and service processes
- For each running virtual machine, there is exactly one process which executes the VM and holds its physical memory. This process is called “vmware-vmx.exe” for VMWare, “VirtualBox.exe” for VirtualBox, “Virtual PC.exe” for VirtualPC and “prl_vm_app.exe” for Parallels Workstation. In this article, I will call this process the “VM execution process”.
For example, here is a screenshot of task manager with 2 running VMWare instances:
“vmware.exe” and “vmware-tray.exe” are service processes which are responsible for GUI and some other auxiliary things. And two “vmware-vmx.exe” instances are processes which actually perform execution of virtual machines.
And here is another obstacle…
2. How Can You Find Memory Region of Interest Inside the VM Execution Process?
The VM's physical memory is stored inside one process, but its internal layout is completely unknown. That's why our program inside the VM should not only reserve memory for communication, but also mark the reserved memory with some signature.
Here I want to emphasize one important aspect of virtual memory. The virtual memory of each process is divided into pages (in 32-bit Windows the page size is 4096 bytes). Pages with consecutive virtual addresses are not guaranteed to lie consecutively in physical memory. Moreover, we do not know how the VM execution process arranges the physical memory of the virtual machine inside its own virtual memory. The following picture demonstrates the problem:
So we need to mark EACH page involved in communication with a distinctive signature.
Another issue is that the guest OS can move pages or swap them out to disk. We do not want this to happen, because then all our effort spent locating the pages would be in vain. That's why we need to lock the pages with the VirtualLock function before we start the communication.
So our program inside the VM should do the following:
- Reserve memory region which will be used for communication.
- Lock this memory region.
- Write some distinctive signature into each page of the region.
The program on the host should then scan the entire virtual memory of the VM execution process to find the pages marked with the signature.
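The marking and scanning steps can be sketched in portable C++ as follows. This is only an illustration under stated assumptions: the signature text, the `PageHeader` layout, and the function names `MarkPages` and `FindPages` are hypothetical, the region is assumed to be already reserved and locked (on Windows, for example, with VirtualAlloc and VirtualLock), and guest pages are assumed to appear at page-aligned offsets in the scanned memory. Storing a page index next to the signature lets the host restore the logical order of pages that are scattered in physical memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kPageSize = 4096;         // page size mentioned in the article
constexpr char kMagic[16] = "VMCHAN-PAGE-SIG";  // hypothetical signature (15 chars + NUL)

// Hypothetical header written at the start of every communication page.
struct PageHeader {
    char magic[16];       // distinctive byte pattern
    std::uint32_t index;  // position of this page inside the logical region
};

// Guest side: stamp every page of an already reserved and locked region.
void MarkPages(std::uint8_t* region, std::size_t pageCount) {
    assert(region != nullptr);
    for (std::size_t i = 0; i < pageCount; ++i) {
        PageHeader h{};
        std::memcpy(h.magic, kMagic, sizeof(kMagic));
        h.index = static_cast<std::uint32_t>(i);
        std::memcpy(region + i * kPageSize, &h, sizeof(h));
    }
}

// Host side: walk a block of memory page by page and return, for each logical
// page index, the offset at which that page was found (SIZE_MAX if not found).
std::vector<std::size_t> FindPages(const std::uint8_t* mem, std::size_t size,
                                   std::size_t pageCount) {
    std::vector<std::size_t> offsets(pageCount, SIZE_MAX);
    for (std::size_t off = 0; off + kPageSize <= size; off += kPageSize) {
        PageHeader h;
        std::memcpy(&h, mem + off, sizeof(h));
        if (std::memcmp(h.magic, kMagic, sizeof(kMagic)) == 0 && h.index < pageCount)
            offsets[h.index] = off;
    }
    return offsets;
}
```

In a quick experiment, one buffer can play the role of the guest region and a larger buffer the role of the VM execution process memory: copy the marked pages into the larger buffer in shuffled order, and `FindPages` reports where each logical page ended up.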
Here we come to the final question…
3. How Can We Read and Write Target Memory Region and How Can We Synchronize Guest and Host Read-write Operations?
The answer to the first part of the question is rather simple. There are two common ways to do it:
- We can use the ReadProcessMemory and WriteProcessMemory WinAPI functions.
- We can inject our own code into target process, so our injected code will be able to directly read and write memory.
Of course, if you implement everything I mentioned above, you’ll be able to arrange some kind of shared memory between host and guest. But it is not enough. You also need to implement some kind of synchronization primitives.
My method of synchronization is based on the assumption that each VM actually executes (rather than interprets) the code inside it, at least for non-privileged instructions. This assumption holds for the VMs mentioned above (VMWare, VirtualBox, …). The second idea is that some CPU instructions execute atomically even on multi-core and multi-CPU systems. In particular, the WinAPI function InterlockedCompareExchange is based on the “lock cmpxchg” instruction, which atomically compares a memory cell with an expected value and, if they match, replaces it with a new value. The virtual machine executes this “lock cmpxchg” instruction natively.
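I did not write assembler code; I just used the InterlockedCompareExchange function to create a simple mutex:
    bool LockMutex(DWORD* mutexAddr)
    {
        DWORD startTime = GetTickCount();
        DWORD sleepTime = 1; // wait step in ms (assumed; the original value is not shown)
        LONG result = InterlockedCompareExchange((LONG*)mutexAddr,
            MUTEX_STATE_LOCK, MUTEX_STATE_FREE); // FREE -> LOCK, returns the previous state
        DWORD curTime = GetTickCount();
        while (result != MUTEX_STATE_FREE && (curTime - startTime < MAX_WAIT_TIME))
        {
            if (sleepTime > startTime + MAX_WAIT_TIME - curTime) // never sleep past the deadline
                sleepTime = startTime + MAX_WAIT_TIME - curTime;
            Sleep(sleepTime);
            result = InterlockedCompareExchange((LONG*)mutexAddr,
                MUTEX_STATE_LOCK, MUTEX_STATE_FREE);
            curTime = GetTickCount();
        }
        return result == MUTEX_STATE_FREE; // true if we took the mutex before the timeout
    }

    bool UnlockMutex(DWORD* mutexAddr)
    {
        return InterlockedCompareExchange((LONG*)mutexAddr, // LOCK -> FREE
            MUTEX_STATE_FREE, MUTEX_STATE_LOCK) == MUTEX_STATE_LOCK;
    }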
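A very important aspect of this mutex implementation is that, on the host side, the code which locks and unlocks the mutex must run in the address space of the VM execution process. ReadProcessMemory and WriteProcessMemory only copy bytes and cannot perform an atomic compare-and-exchange on the target memory. That's why you have to use DLL injection to access the memory of the virtual machine.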
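To experiment with the locking logic outside of Windows, the same compare-and-exchange pattern can be sketched with `std::atomic`. This is a portable stand-in, not the WinAPI-based code itself: the names `LockWord` and `UnlockWord`, the state values 0 and 1, and the 1 ms retry interval are assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

constexpr std::uint32_t kFree = 0;    // assumed value of MUTEX_STATE_FREE
constexpr std::uint32_t kLocked = 1;  // assumed value of MUTEX_STATE_LOCK

// Try to take the lock until the timeout expires; returns true on success.
bool LockWord(std::atomic<std::uint32_t>& word, std::chrono::milliseconds timeout) {
    assert(timeout.count() >= 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        std::uint32_t expected = kFree;
        // Atomically: if word == kFree, store kLocked and report success.
        if (word.compare_exchange_strong(expected, kLocked)) return true;
        if (std::chrono::steady_clock::now() >= deadline) return false;
        std::this_thread::sleep_for(std::chrono::milliseconds(1));  // assumed retry step
    }
}

// Release the lock; returns false if the word was not in the locked state.
bool UnlockWord(std::atomic<std::uint32_t>& word) {
    std::uint32_t expected = kLocked;
    return word.compare_exchange_strong(expected, kFree);
}
```

A second `LockWord` call on an already locked word fails after the timeout, and a second `UnlockWord` call returns false, which mirrors the return values of `LockMutex` and `UnlockMutex`.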
Scheme of Communication Engine
Combining all the above, we obtain the following scheme of communication engine:
- The program inside the guest OS should do the following:
  - Allocate and lock a memory region
  - Put a special signature into each page which will be used for communication
  - Wait for a connection from the host
  - Communicate, using InterlockedCompareExchange-based mutexes for synchronization
- The program inside the host OS should do the following:
  - Find the target VM process
  - Inject code which will perform the actual communication into the target VM process
  - The injected code should periodically scan the VM memory for the signatures of pages which will be involved in communication
  - When these pages are found, communication with the guest can be performed, and InterlockedCompareExchange-based mutexes should be used for synchronization.
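Once both sides share a page and a lock word, the exchange itself could look like the following one-slot mailbox. This is a minimal sketch under assumptions, not the layout used by the engine described here: the `Mailbox` structure, the `Post` and `Take` functions, and the 256-byte payload limit are all hypothetical, and `std::atomic` again stands in for InterlockedCompareExchange.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

constexpr std::uint32_t kFree = 0;    // assumed lock states
constexpr std::uint32_t kLocked = 1;
constexpr std::size_t kPayloadMax = 256;  // hypothetical payload limit

// Hypothetical layout that would sit in a signed page after the signature.
struct Mailbox {
    std::atomic<std::uint32_t> lock{kFree};
    std::uint32_t length{0};  // 0 means "no message pending"
    char payload[kPayloadMax]{};
};

// Single compare-and-exchange attempt: FREE -> LOCKED.
bool TryLock(Mailbox& m) {
    std::uint32_t expected = kFree;
    return m.lock.compare_exchange_strong(expected, kLocked);
}

void Unlock(Mailbox& m) { m.lock.store(kFree); }

// Sender: store a message if the slot is empty; returns false otherwise.
bool Post(Mailbox& m, const std::string& msg) {
    if (msg.empty() || msg.size() > kPayloadMax || !TryLock(m)) return false;
    const bool slotEmpty = (m.length == 0);
    if (slotEmpty) {
        std::memcpy(m.payload, msg.data(), msg.size());
        m.length = static_cast<std::uint32_t>(msg.size());
    }
    Unlock(m);
    return slotEmpty;
}

// Receiver: fetch and clear the pending message, if there is one.
bool Take(Mailbox& m, std::string& out) {
    if (!TryLock(m)) return false;
    assert(m.length <= kPayloadMax);
    const bool pending = (m.length != 0);
    if (pending) {
        out.assign(m.payload, m.length);
        m.length = 0;
    }
    Unlock(m);
    return pending;
}
```

Here `Post` would run on one side (for example, in the guest program) and `Take` on the other (for example, in the code injected into the VM execution process); both return false when the lock is busy, so the caller decides when to retry.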
In this article, I described the main ideas behind the communication engine. In addition, I’m going to write several more articles which will cover implementation details with source code of the solution.
To be continued…