
The virtual bool bug

2 Sep 2003
Describes the virtual bool bug that exists in mixed mode Managed C++ programs that access unmanaged classes

Introduction

This bug was first reported by Jochen Kalmbach on April 12th 2002 (no links available to original posting), when VS.NET 7.0 was doing its initial rounds, and it's quite inconceivable that the bug still exists in VS.NET 2003. Just about every week, at least two people report issues related to this bug, so I thought it might be a good idea to have an article on it here on CodeProject. What's really annoying is that a developer might spend several hours or even a full day on the problem before realizing that it is not a problem with their own code.

The bug

The most common scenario where the bug is reported is when someone has a mixed mode C++ program with a managed class that accesses an unmanaged class in an unmanaged DLL. If the unmanaged class has a virtual function that returns a bool, then irrespective of what value it returns, the managed caller *always* gets back true. The code does not have to be split across two separate binaries (the EXE and the DLL), though. The bug also occurs if the unmanaged class is defined in a #pragma unmanaged block within a single mixed mode EXE or DLL.

Minimal code to reproduce bug

#using <mscorlib.dll>
#include <tchar.h>

using namespace System;

// Compiled to native code
#pragma unmanaged
class Unmanaged
{
public:
    virtual bool IsAlive()
    {
        return false;
    }
};

// Compiled to MSIL
#pragma managed
__gc class Managed
{
public:
    void Test()
    {
        Unmanaged* um = new Unmanaged();
        if(um->IsAlive())
        {
            //Always executes
            Console::WriteLine("Function returned true. BUG!!!");
        }
        else
        {
            //Never executes
            Console::WriteLine("Function returned false. No Bug :-)");
        }
        delete um;
    }
};

int _tmain()
{
    Managed* mg = new Managed();
    mg->Test();
    return 0;
}

Trying to figure it out

Let's examine the disassembly for the IsAlive function :-

;virtual bool IsAlive()

004010B0 push ebp 
004010B1 mov ebp,esp 
004010B3 push ecx 
004010B4 mov dword ptr [ebp-4],ecx 

;return false

004010B7 xor al,al ; Notice how AL is made 0 (false)

004010B9 mov esp,ebp 
004010BB pop ebp 
004010BC ret 

As you can see, the function returns its result in the AL register. This is what my registers looked like at this point, just after the XOR. Note that AL (the low byte of EAX) is 0, but the upper three bytes of EAX still hold leftover data :-

EAX = 00401000 EBX = 0012EFB4 ECX = 06C42C88 EDX = 00425410 
ESI = 00168930 EDI = 00000000 EIP = 004010B9 ESP = 0012EFA8 
EBP = 0012EFAC EFL = 00000246

Now let's see the disassembly for the caller code :-

;if(um->IsAlive())

00000065 mov eax,dword ptr [ebp-18h] 
00000068 mov eax,dword ptr [eax] 
0000006a mov esi,dword ptr [eax] 
0000006c mov ecx,dword ptr [ebp-18h] 
0000006f mov eax,esi 
00000071 push 1692D0h 
00000076 call F9759F50 ; The call to the function

0000007b movzx esi,al ; Copying the return value to ESI

0000007e test esi,esi ; Checking for true 

00000080 je 0000009A ; If false then jump to 9A  

The return value is obtained from the AL register. Let's see the contents of the registers now :-

EAX = 00000001 EBX = 0012F0C8 ECX = 00000004 EDX = 00000000 
ESI = 00000001 EDI = 04A719C8 EBP = 0012F070 ESP = 0012F044 

Horror of horrors! AL is now 1 (more precisely EAX has been set to 1). I had stepped through the disassembly and AL was 0 at the time the RET instruction was executed; therefore the register corruption must have occurred during the managed-unmanaged transition.

Workarounds

The simple workaround is to use a BOOL (typedef for an int) instead of a bool.

class Unmanaged
{
public:
    virtual int IsAlive()
    {
        return false;
    }
};

The conversion between an int and a bool is implicit in both directions, so we don't really have to do anything extra.
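If the bool-returning class cannot be changed (a third-party library, say), the same idea can be applied from the outside. The following is only a sketch under assumptions: IsAliveAsInt is a hypothetical helper that is not part of the original code, and it assumes that a plain function returning int crosses the managed-unmanaged boundary intact, as the int workaround above suggests. I have not verified this against the bug itself. In a mixed mode project the helper would have to sit inside a #pragma unmanaged block, so that the bool never crosses the boundary on its own.

```cpp
#include <cassert>

// Stand-in for an unmanaged class whose bool-returning
// interface cannot be modified.
class Unmanaged
{
public:
    virtual ~Unmanaged() {}
    virtual bool IsAlive()
    {
        return false;
    }
};

// Hypothetical helper (not from the article). In a mixed mode
// build, place it under "#pragma unmanaged". It widens the bool
// to an int, so the whole return register is written rather
// than just its low byte.
int IsAliveAsInt(Unmanaged* um)
{
    return um->IsAlive() ? 1 : 0;
}
```

The managed caller would then test IsAliveAsInt(um) != 0 instead of calling um->IsAlive() directly.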

A slightly bizarre-looking workaround (see the section titled "More info" for the reasoning), suggested by someone (possibly Microsoft Support), is to set EAX to a value no greater than 255 before returning from the unmanaged function, so that the upper 24 bits of the register are zero.

class Unmanaged
{
public:
    virtual bool IsAlive()
    {
        __asm mov eax,100
        return false;
    }
};

More info

I got some more information regarding this issue from Tom Archer (my friend, fellow-CPian and co-author), who got it from a friend of his on the VC++ compiler team. It seems this bug occurs when any of the upper 24 bits of the EAX register is non-zero. They have a hot-fix for this bug for both VC++.NET 7 and VC++.NET Everett, but it might be a better idea to wait for the next service pack.

Still more info (Thanks Jochen)

Jochen's post gave me a few links which provided even more info on this bug. The bug occurs due to the way the CLR marshals boolean values. The marshaller treats a boolean as 4 bytes (the size of a Win32 BOOL), but the C++ bool type is only a single byte (so much for efficiency and the hassles it brings about). What happens during marshalling is that the CLR also examines the upper three bytes, and if they contain any data, it assumes that the boolean value being passed is true.

As far as I understood from the postings made by MS support, there was a sort of vague argument between the CLR team and the VC++ compiler team. The VC++ compiler team believed (and rightly so in my opinion) that the issue was with the CLR's marshalling code, but it seems the CLR team wanted the VC++ team to emit a custom MarshalAs attribute for the method that returns a bool. But obviously you cannot apply .NET attributes to an unmanaged function, as methods compiled as unmanaged don't appear in the meta-data. Anyway, now we know why it's so important to clear the upper 3 bytes of the EAX register.
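To make the mechanism concrete, here is a small model of it in plain C++. It is only a sketch of the behaviour described above, not the CLR's actual marshalling code. All four function names are made up for illustration, and the register is represented by an ordinary 32-bit integer.

```cpp
#include <cassert>
#include <cstdint>

// Model of the callee's "xor al,al": only the low byte of EAX
// is cleared, the upper three bytes keep whatever was there.
std::uint32_t ReturnFalseInAL(std::uint32_t eaxBefore)
{
    return eaxBefore & 0xFFFFFF00u;
}

// Model of the faulty transition: the register is read as a
// 4-byte boolean, so any non-zero bit anywhere means true.
std::uint32_t MarshalAsFourByteBool(std::uint32_t eax)
{
    return eax != 0 ? 1u : 0u;
}

// Model of what a transition aware of the 1-byte C++ bool
// would do: look at AL only.
std::uint32_t MarshalAsOneByteBool(std::uint32_t eax)
{
    return (eax & 0xFFu) != 0 ? 1u : 0u;
}

// Model of the managed caller: "movzx esi,al" then "test esi,esi".
bool CallerSeesTrue(std::uint32_t eaxAfterTransition)
{
    return (eaxAfterTransition & 0xFFu) != 0;
}

// Walks the three cases discussed in the article.
void CheckModel()
{
    // EAX value taken from the callee's register dump.
    const std::uint32_t stale = 0x00401000u;

    // The bug: AL is 0, yet the caller ends up seeing true.
    assert(CallerSeesTrue(MarshalAsFourByteBool(ReturnFalseInAL(stale))));

    // Reading AL alone would have given the expected false.
    assert(!CallerSeesTrue(MarshalAsOneByteBool(ReturnFalseInAL(stale))));

    // The "mov eax,100" workaround: the upper bytes are already
    // zero, so clearing AL leaves the whole register at 0.
    assert(!CallerSeesTrue(MarshalAsFourByteBool(ReturnFalseInAL(100u))));
}
```

Calling CheckModel() from any test program runs through the bug, the expected behaviour and the inline assembly workaround in turn. The model also shows why a return type of int avoids the problem: the callee then writes all four bytes of EAX itself.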

Conclusion

What's really dangerous about this bug is that it's quite easy not to see it. Most functions that return bool return false to indicate an error, so by getting true all the time, we never realize that anything is amiss, and the bug can go unnoticed until really late in the software development cycle. I have been working with mixed mode programs for quite a while now, especially since I began my book with Tom (Extending MFC Applications With the .NET Framework), so this is an issue in which I am quite naturally interested, and I would like to hear more intelligent analysis than mine from some of the gurus that frequent CP.

History

  • Aug 27 2003 - First published
  • Aug 30 2003 - Updated with more info and related KB link
  • Sep 03 2003 - Updated with more info provided by Jochen

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
