
The virtual bool bug

By Nish Sivakumar, 2 Sep 2003
 

Introduction

This bug was first reported by Jochen Kalmbach on April 12th 2002 (no links are available to the original posting), when VS.NET 7.0 was doing its initial rounds, and it is hard to understand why the bug still exists in VS.NET 2003. Just about every week, at least two people report issues related to this bug, so I thought it might be a good idea to have an article on it here on CodeProject. What's really annoying is that a developer might spend several hours, or even a full day, on the problem before realizing that it is not a problem with their code.

The bug

The most commonly reported scenario is a mixed mode C++ program in which a managed class accesses an unmanaged class that lives in an unmanaged DLL. If the unmanaged class has a virtual function that returns a bool, then irrespective of what value it returns, the managed caller *always* gets back true. The code does not have to be split across two separate binaries (the EXE and the DLL), though: the bug also occurs if the unmanaged class is defined in a #pragma unmanaged block inside a mixed mode EXE or DLL.

Minimal code to reproduce bug

#using <mscorlib.dll>
#include <tchar.h>

using namespace System;

#pragma unmanaged
class Unmanaged
{
public:
    // Unmanaged virtual function that returns a one-byte C++ bool
    virtual bool IsAlive()
    {
        return false;
    }
};

#pragma managed
__gc class Managed
{
public:
    void Test()
    {
        Unmanaged* um = new Unmanaged();
        // The call below crosses the managed/unmanaged boundary
        if(um->IsAlive())
        {
            //Always executes
            Console::WriteLine("Function returned true. BUG!!!");
        }
        else
        {
            //Never executes
            Console::WriteLine("Function returned false. No Bug :-)");
        }
        delete um;
    }
};

int _tmain()
{
    Managed* mg = new Managed();
    mg->Test();
    return 0;
}

Trying to figure it out

Let's examine the disassembly for the IsAlive function :-

;virtual bool IsAlive()
004010B0 push ebp 
004010B1 mov ebp,esp 
004010B3 push ecx 
004010B4 mov dword ptr [ebp-4],ecx 

;return false
004010B7 xor al,al ; Notice how AL is made 0 (false)
004010B9 mov esp,ebp 
004010BB pop ebp 
004010BC ret 

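As you can see, the result of the function is returned in the AL register. This is what the contents of my registers looked like just after the xor instruction. Note that AL, the low byte of EAX, is 0, while the upper 24 bits of EAX still hold a leftover non-zero value :-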

EAX = 00401000 EBX = 0012EFB4 ECX = 06C42C88 EDX = 00425410 
ESI = 00168930 EDI = 00000000 EIP = 004010B9 ESP = 0012EFA8 
EBP = 0012EFAC EFL = 00000246

Now let's see the disassembly for the caller code :-

;if(um->IsAlive())
00000065 mov eax,dword ptr [ebp-18h] 
00000068 mov eax,dword ptr [eax] 
0000006a mov esi,dword ptr [eax] 
0000006c mov ecx,dword ptr [ebp-18h] 
0000006f mov eax,esi 
00000071 push 1692D0h 
00000076 call F9759F50 ; The call to the function
0000007b movzx esi,al ; Copying the return value to ESI
0000007e test esi,esi ; Checking for true 
00000080 je 0000009A ; If false then jump to 9A  

The return value is obtained from the AL register. Let's see the contents of the registers now :-

EAX = 00000001 EBX = 0012F0C8 ECX = 00000004 EDX = 00000000 
ESI = 00000001 EDI = 04A719C8 EBP = 0012F070 ESP = 0012F044 

Horror of horrors! AL is now 1 (more precisely EAX has been set to 1). I had stepped through the disassembly and AL was 0 at the time the RET instruction was executed; therefore the register corruption must have occurred during the managed-unmanaged transition.

Workarounds

The simple workaround is to use a BOOL (typedef for an int) instead of a bool.

class Unmanaged
{
public:
    virtual int IsAlive()
    {
        return false;
    }
};

The conversion from an int to a bool is implicit, so we don't really have to do anything extra in the caller.
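If changing the visible return type of IsAlive is not desirable, the same idea can be wrapped up. The following is only a minimal sketch in plain standard C++, not something from the original discussion: IsAliveRaw, the free function IsAlive and the alive_ member are hypothetical names introduced for illustration. The virtual function that would cross the boundary returns a full int, and a small helper on the caller's side turns it back into a bool.

```cpp
// Hypothetical variant of the article's Unmanaged class. The virtual
// function returns int, so all 32 bits of the return value are defined.
class Unmanaged
{
public:
    explicit Unmanaged(bool alive) : alive_(alive) {}
    virtual ~Unmanaged() {}

    // This is the function that would be called across the
    // managed/unmanaged boundary.
    virtual int IsAliveRaw()
    {
        return alive_ ? 1 : 0;
    }

private:
    bool alive_; // assumed state, only here to make the sketch testable
};

// Hypothetical helper for the caller's side: converts the int back
// to a bool, so calling code can keep using a bool-style check.
inline bool IsAlive(Unmanaged& um)
{
    return um.IsAliveRaw() != 0;
}
```

Calling code would then write if(IsAlive(*um)) instead of if(um->IsAlive()); whether this is preferable to simply changing the return type is a matter of taste.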

A slightly bizarre looking workaround [see the section titled "More info" for, heheh, more info] suggested by someone (possibly Microsoft Support) is to set EAX to a value no greater than 255 before returning from the unmanaged function, so that the upper 24 bits of the register are zero.

class Unmanaged
{
public:
    virtual bool IsAlive()
    {
        __asm mov eax,100
        return false;
    }
};
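To see why this odd-looking instruction helps, here is a rough sketch in plain standard C++ that models the two instructions involved as operations on a 32-bit value. It is an illustration only: mov_eax, xor_al_al and upper24 are hypothetical helper names, and the stale value 0x00401000 is the EAX content from the register dump shown earlier.

```cpp
#include <cstdint>

// Models "mov eax, imm32": the whole 32-bit register is overwritten.
inline std::uint32_t mov_eax(std::uint32_t value)
{
    return value;
}

// Models "xor al, al": only the low byte is cleared, the upper
// 24 bits of the register keep whatever they held before.
inline std::uint32_t xor_al_al(std::uint32_t eax)
{
    return eax & 0xFFFFFF00u;
}

// Returns the upper 24 bits of the register, shifted down.
inline std::uint32_t upper24(std::uint32_t eax)
{
    return eax >> 8;
}
```

With the stale value, xor_al_al(0x00401000u) still has non-zero upper bits, so the register as a whole is not zero even though AL is. With the workaround, xor_al_al(mov_eax(100)) is exactly 0, because the mov has already cleared the upper 24 bits.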

More info

I got some more information regarding this issue from Tom Archer (my friend, fellow-CPian and co-author) who got this information from a friend of his, who is in the VC++ compiler team. It seems this bug occurs when one of the upper 24 bits of the EAX register is non-zero. They have a hot-fix for this bug for both VC++.NET 7 and for VC++.NET Everett, but it might be a better idea to wait for the next service pack.

Still more info (Thanks Jochen)

Jochen's post gave me a few links which provided even more info on this bug. The bug occurs due to the way the CLR marshals boolean values. The CLR thinks that a boolean is 4 bytes (as it is under .NET), but the C++ bool type is only a single byte (so much for efficiency and the hassles it brings about). What happens during marshalling is that the CLR examines the higher three bytes, and if they contain any data, it assumes that the boolean value being passed is true. As far as I understood from the postings made by MS support, there was a sort of vague argument between the CLR team and the VC++ compiler team. The VC++ compiler team believed (and rightly so, in my opinion) that the issue was with the CLR's marshalling code, but it seems the CLR team wanted the VC++ team to emit a custom MarshalAs attribute for the method that returns a bool. But obviously you cannot apply .NET attributes to an unmanaged function, as methods compiled as unmanaged don't appear in the meta-data. Anyway, now we know why it's so important to clear the upper 3 bytes of the EAX register.
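The size mismatch described above can be sketched in a few lines of plain standard C++. This is an assumed model of the behaviour, not the CLR's actual marshalling code, and the two function names are hypothetical. One function reads the returned register as a 4-byte boolean, the other looks only at the low byte, which is all a C++ bool occupies.

```cpp
#include <cstdint>

// Assumed model of the buggy path: the return value is treated as a
// 4-byte boolean, so any non-zero bit anywhere in EAX means true.
inline bool read_as_4_byte_bool(std::uint32_t eax)
{
    return eax != 0;
}

// What the unmanaged function actually produced: a 1-byte bool in AL,
// so only the low 8 bits carry meaning.
inline bool read_as_1_byte_bool(std::uint32_t eax)
{
    return (eax & 0xFFu) != 0;
}
```

For the EAX value seen in the debugger when IsAlive returned (0x00401000), the 1-byte reading gives false while the 4-byte reading gives true, which matches the symptom in the article.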

Related Microsoft KB links

Conclusion

What's really dangerous about this bug is that it's quite easy not to see it: most functions that return bool return false to indicate an error, so by getting true all the time, we never realize that anything is amiss. Thus it's quite easy to miss the bug until really late in the software development cycle. I have been working with mixed mode programs for quite a while now, especially since I began my book with Tom (Extending MFC Applications With the .NET Framework), so this is an issue in which I am quite naturally interested, and I would like to hear more intelligent analysis than mine from some of the gurus that frequent CP.

History

  • Aug 27 2003 - First published
  • Aug 30 2003 - Updated with more info and related KB link
  • Sep 03 2003 - Updated with more info provided by Jochen

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Nish Sivakumar

United States
Nish is a real nice guy who has been writing code since 1990 when he first got his hands on an 8088 with 640 KB RAM. Originally from sunny Trivandrum in India, he has been living in various places over the past few years and often thinks it’s time he settled down somewhere.
 
Nish has been a Microsoft Visual C++ MVP since October, 2002 - awfully nice of Microsoft, he thinks. He maintains an MVP tips and tricks web site - www.voidnish.com where you can find a consolidated list of his articles, writings and ideas on VC++, MFC, .NET and C++/CLI. Oh, and you might want to check out his blog on C++/CLI, MFC, .NET and a lot of other stuff - blog.voidnish.com.
 
Nish loves reading Science Fiction, P G Wodehouse and Agatha Christie, and also fancies himself to be a decent writer of sorts. He has authored a romantic comedy Summer Love and Some more Cricket as well as a programming book – Extending MFC applications with the .NET Framework.
 
Nish's latest book C++/CLI in Action published by Manning Publications is now available for purchase. You can read more about the book on his blog.
 
Despite his wife's attempts to get him into cooking, his best effort so far has been a badly done omelette. Some day, he hopes to be a good cook, and to cook a tasty dinner for his wife.


Comments and Discussions

 
News: Still broken in Compact Framework 3.5 (btj-agilent, 29-Sep-09 7:45)
Just a heads-up in case anyone working with Compact Framework stumbles on this thread...
 
Turns out this defect fix didn't get propagated to Compact Framework. It is much more pervasive than what this article describes. Pretty much any unmanaged function returning an 8-bit bool (whether to C++/CLI or to C#) is at risk, even if virtual functions aren't involved.
 
See "return: MarshalAs(UnmangedType.U1) is ignored for bool return type of a DllImport function on x86 CE 6.0".
 
My thanks to Nishant, this thread saved me a lot of time in isolating what turned out to be a very obscure defect. The nature of the defect meant that doing a Debug C# build or adding printf statements to the C++ code to try and narrow it down would cause the symptoms to disappear, so having a hint of what to look for really helped.
 
Byron

General: Unmanaged calling Unmanaged (Jim999, 16-Aug-07 8:22)
I have hit the same issue with .Net 2003 where a function in an unmanaged class is calling a function in another unmanaged class. The second class is built as a static library, the first is an exe which also contains mixed mode classes. The fault, which causes fields to appear invalid and invokes web services unnecessarily only occurs in the debug build. Release works fine. As does debug now I have put the asm code in...
 
Thanks for the info.

Question: Visual Studio 2005? (ptulou, 15-Aug-07 4:15)
Can anyone confirm for me that this bug still occurs in VS 2005? I have run up against EXACTLY these symptoms. All of the documentation indicates that this was a 2002/2003 bug and has been fixed in 2005... maybe it has re-emerged?
 
By the way, I was able to work around this by doing the following with PInvoke:
 

[DllImport("FDLib.dll", CharSet = CharSet.Auto, EntryPoint = "FD_SaveAlarm_WPM")]
[return: MarshalAs(UnmanagedType.U1)]
public static extern bool SaveAlarm_WPM();

 
The MarshalAs(UnmanagedType.U1) forces the return value to look only at the 1st byte that is used by the unmanaged bool. Is this proper, or is it going to cause me problems down the road?

Question: managed/unmanaged boundary (ensdorf, 13-Nov-06 12:34)
I observed this exact behavior in a mixed-mode application, but it was not a virtual function, nor was the call between managed and unmanaged code - it was a call from unmanaged to managed.   Essentially it occurred inside the std c++ sort function, where the custom predicate (my code) returned a value of false that was being interpreted as true.   I am wondering if anyone can confirm that this is a possible scenario in which this bug can occur.   Incidentally, applying vs 2003 sp1 did fix the issue.   Thanks!
 
-Ken

Answer: Re: managed/unmanaged boundary (Paul Tulou, 15-Aug-07 4:53)

General: Still fixed in VC2005-Express Beta 1 (Jochen Kalmbach, 29-Jun-04 9:53)
Example code looks a little bit different (due to new syntax):
 
#pragma unmanaged
class Unmanaged
{
public:
    virtual bool IsAlive()
    {
        return false;
    }
};

#pragma managed
ref class Managed
{
public:
    void Test()
    {
        Unmanaged *um = new Unmanaged();
        if(um->IsAlive())
        {
            //Always executes
            System::Console::WriteLine("Function returned true. BUG!!!");
        }
        else
        {
            //Never executes
            System::Console::WriteLine("Function returned false. No Bug :)");
        }
        delete um;
    }
};

int _tmain()
{
    Managed ^mg = gcnew Managed();
    mg->Test();

    return 0;
}

General: Re: Still fixed in VC2005-Express Beta 1 (Nishant S, 5-Apr-05 21:22)

Question: How did this bug make it into a release? (Anonymous, 16-Jun-04 16:36)
Does Microsoft even test their software? "Managed" C++ is completely unusable because of this (and other) marshalling bugs. I can't imagine how something this broken could have ever shipped, or how it could go on for YEARS afterward without being fixed.
Answer: Re: How did this bug make it into a release? (Nishant S, 5-Apr-05 21:21)

General: Fixed in Whidbey alpha.... (Jochen Kalmbach, 16-Jan-04 8:00)
It seems that this bug is fixed in Whidbey alpha (PDC-Version).
 
Greetings
Jochen
General: Re: Fixed in Whidbey alpha.... (Nishant S, 5-Apr-05 21:20)

General: Reminder: Avoid Mixed DLLs (Roy Muller, 4-Sep-03 5:29)
This bug explains the problems I saw when getting rid of our mixed mode DLLs -- these are DLL files that contain both managed and unmanaged code. The "Minimal code to reproduce bug" example above will generate a mixed mode dll.
 
See Mixed DLL Loading Bug for a description of the problem.
 
Thanks for shedding light on why bools had to be upcasted to an INT32 when marshalled across the managed/unmanaged boundary. FWIW, I don't think the 'virtual' keyword is required to cause this to happen.
 

 
-Roy
General: Re: Reminder: Avoid Mixed DLLs (Jochen Kalmbach, 5-Sep-03 0:25)
General: Re: Reminder: Avoid Mixed DLLs (Roy Muller, 5-Sep-03 5:07)

General: I am not the first... (Jochen Kalmbach, 3-Sep-03 4:55)
Hello Nishant S,
 
just a little correction...
I was not the first that found the bug... maybe I was the first that reported it "officially" to MS via MS-support...
 
See original first thread:
bool transition bug (proof and refinement) (2002-04-09 08:46:25)
 
Next posting:
bool transition bug (proof and refinement) - correction (2002-04-09 09:08:27)
 
bool transition bug (solution) (2002-04-10 01:19:30)
 
Greetings
Jochen

General: Re: I am not the first... (Nishant S, 3-Sep-03 5:04)
General: Re: I am not the first... (Jochen Kalmbach, 3-Sep-03 6:54)
General: Re: I am not the first... (Jochen Kalmbach, 3-Sep-03 6:55)
General: Re: I am not the first... (Nishant S, 3-Sep-03 15:28)
General: Re: I am not the first... (Nishant S, 3-Sep-03 5:06)

General: Isnt this the same as.... (leppie, 30-Aug-03 2:55)
that compiler warning that goes along the lines of:
 
Don't:
 
BOOL res = FALSE;
return res; //bad
 
Do:
BOOL res = FALSE;
return (res) != 0; //good
 

 
leppie::AllocCPArticle("Zee blog");
General: Re: Isnt this the same as.... (Gary R. Wheeler, 30-Aug-03 13:11)

General: A bit more detail (Tom Archer, 29-Aug-03 11:21)
This is covered in several KB articles, including http://support.microsoft.com/default.aspx?kbid=823071.
 
In addition, I've asked a friend on the C++ compiler team what is going on here and he had the following to say that I thought I'd share with the group. (This also explains why the EAX workaround works.)
 
We fixed this as a hot fix for VC 2002, but because of the timing of the VC 2003 release, we could not introduce the fix into 2003 without a risk of destabilizing the product, so another hot fix was needed.
 
It’s a bug when one of the upper 24 bits of the EAX register is non-zero that the return value is mishandled.

 
Cheers,
Tom Archer
Inside C#,
Extending MFC Applications with the .NET Framework
It's better to listen to others than to speak, because I already know what I'm going to say anyway. - friend of Jörgen Sigvardsson
General: Re: A bit more detail (John M. Drescher, 29-Aug-03 11:40)
General: Re: A bit more detail (Tom Archer, 29-Aug-03 11:50)
General: Re: A bit more detail (Nishant S, 29-Aug-03 19:34)
General: Re: A bit more detail (Nishant S, 29-Aug-03 19:24)
General: Re: A bit more detail (graham_k_2004, 28-Sep-04 8:28)

Question: Why mixed mode? (Anonymous, 27-Aug-03 12:09)
"The most common scenario where the bug is reported is when someone has a mixed mode C++ program that has a managed class, which accesses an unmanaged class in an unmanaged DLL."
 
Does the bug exist for code not in mixed mode? If not, avoiding mixed mode might be a good or better solution. Why use mixed mode in the first place? Is there any situation where only mixed mode code can solve the problem? Just curious.
Answer: Re: Why mixed mode? (Nishant S, 27-Aug-03 15:10)
Answer: Re: Why mixed mode? (Tom Archer, 29-Aug-03 11:18)

General: Thanks for pointing that out... (John M. Drescher, 27-Aug-03 9:23)
You would think MS would fix a bug as serious as this...
 
John
General: Re: Thanks for pointing that out... (Nishant S, 27-Aug-03 9:28)

Last Updated 3 Sep 2003
Article Copyright 2003 by Nish Sivakumar
Everything else Copyright © CodeProject, 1999-2013