Hello,
I have a general question regarding how a C/C++ compiler will handle generating machine code with respect to a struct.

Let's compare the generated code for:
C++
int Var1;
int Var2;
bool Var3;

cout << Var1 << endl;


vs.

C++
struct MyStruct
{
int Var1;
int Var2;
bool Var3;
};
MyStruct my;
cout << my.Var1;


Will the compiler generate different machine code for those two blocks of code? I've been reading that it does not, but have never seen a clear-cut yes-or-no answer on it.

Also, typedef is only for internal purposes of the compiler and will not cause it to generate any machine code, correct?

Thank you!

The two pieces of code are fundamentally different, and the difference is quite clear.

Assuming they are declared outside any function, your three variables each statically occupy a slot in memory once the program is loaded. The declaration of the struct itself occupies no memory, but an instance of MyStruct does. The amount is not necessarily the same as for the three separate variables, because it depends on the memory layout, which can be controlled by compiler options and varies between compiler implementations. The code that initializes the MyStruct instance also depends on context. Code like yours is usually written inside a function, which means the instance is stored on the stack. The stack frame is popped on return, so the memory occupied by the instance can be reused by another stack frame.

In contrast, the individual variables are initialized in memory that is allocated for static data at the very beginning, when the program is loaded. You could simulate this more closely if your struct had static members. How similar the two cases end up being again depends on the particular compiler and its options: structure layout, alignment in memory, and things like that.
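To make the storage point concrete, here is a minimal sketch. The names `g_var1`, `g_var2`, `g_my`, `sum_local`, `kStructSize` and `kMembersSize` are made up for illustration and are not from the question. It assumes a typical implementation where namespace-scope objects live in a static data area and local objects live in the function's stack frame; the language itself only talks about storage duration.

```cpp
#include <cassert>
#include <cstddef>

// Namespace-scope variables: static storage duration, zero-initialized
// before main() starts.
int g_var1;
int g_var2;

// Declaring the type reserves no memory by itself.
struct MyStruct {
    int Var1;
    int Var2;
    bool Var3;
};

// A namespace-scope instance also has static storage duration.
MyStruct g_my;

// A local instance has automatic storage duration (typically the stack)
// and is initialized every time the function runs.
int sum_local() {
    MyStruct my{1, 2, true};
    return my.Var1 + my.Var2;
}

// The struct may be larger than the sum of its members because the
// compiler is allowed to insert padding for alignment.
constexpr std::size_t kStructSize = sizeof(MyStruct);
constexpr std::size_t kMembersSize = sizeof(int) + sizeof(int) + sizeof(bool);
```

Calling `sum_local()` from `main` and printing `kStructSize` next to `kMembersSize` shows how much padding a given compiler and set of options adds; the exact numbers are implementation-specific.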

—SA
 
Comments
Ron Beyer 12-Jun-13 8:05am    
Yup, +5
Sergey Alexandrovich Kryukov 12-Jun-13 12:54pm    
Thank you, Ron.
—SA
CPallini 12-Jun-13 13:17pm    
My 5.
Sergey Alexandrovich Kryukov 12-Jun-13 14:44pm    
Thank you, Carlo.
—SA
Try checking the assembly output for yourself. Going through the assembly generated by a compiler can be quite interesting.

For example, with gcc you can use the -S command-line option.
With Visual Studio, go to Project Properties -> Configuration Properties -> C/C++ -> Output Files and set the 'Assembler Output' option.

For instance, for code like:
C++
int a;
int b;

struct c
{
	int a;
	int b;
};

.
.
.

struct c cs;
a = 1;
b = 2;
cs.a = 3;
cs.b = 4;


My copy of Visual Studio 2008 Express generated the following assembly code:

ASM
; 18   : 	struct c cs;
; 19   : 	a = 1;
	mov	DWORD PTR ?a@@3HA, 1			; a
; 20   : 	b = 2;
	mov	DWORD PTR ?b@@3HA, 2			; b
; 21   : 	cs.a = 3;
	mov	DWORD PTR _cs$[ebp], 3
; 22   : 	cs.b = 4;
	mov	DWORD PTR _cs$[ebp+4], 4


One obvious difference is that the addressing mode used for the variables 'a' and 'b' is different from the one used for the struct variable 'cs'.
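As a hedged illustration of where that difference comes from, the sketch below puts the same struct once at namespace scope and once inside a function. The names `g_cs`, `write_global` and `write_local` are hypothetical. On a typical compiler the members of `g_cs` are addressed through a fixed symbol, while the members of the local `cs` are addressed relative to the frame or stack pointer (or kept in registers when optimizing), so the addressing mode follows from where the object lives rather than from the fact that it is a struct. The `+4` in `_cs$[ebp+4]` in the listing above corresponds to the offset of member `b`.

```cpp
#include <cassert>
#include <cstddef>

struct c {
    int a;
    int b;
};

// Static storage: the address is fixed by the linker/loader.
c g_cs;

int write_global() {
    g_cs.a = 3;
    g_cs.b = 4;
    return g_cs.a + g_cs.b;
}

// Automatic storage: the object lives in this function's stack frame.
int write_local() {
    c cs;
    cs.a = 3;
    cs.b = 4;
    return cs.a + cs.b;
}

// Offset of b inside the struct; 4 when int is 4 bytes and no padding is added.
constexpr std::size_t kOffsetOfB = offsetof(c, b);
```

Compiling both functions with the assembler output enabled, as described above, lets you compare the two addressing forms for your own compiler and settings.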
 
Comments
Sergey Alexandrovich Kryukov 12-Jun-13 12:54pm    
There is no explanation of why this is so. It looks like you mechanically pasted disassembled code, so what?
—SA
parths 12-Jun-13 13:53pm    
I wanted to point out that the assembly listings generated by compilers can actually be viewed.
I learned what little I know about assembly by looking at compiler listings and checking Google (and also a lot by reading Art of Assembly).
Should I have given reference links?

Exactly what kind of explanation would you expect? I'll try to improve my answer.

The behaviour is compiler dependent and theoretically the compiler can actually implement both cases in a similar fashion, right?
For instance, _cs$[ebp] seems to indicate that the compiler is using an indexed addressing mode (similar to arrays), but it could just as well have computed the address, stored it in a register, and used register-indirect addressing. That is not optimal, but theoretically correct, right?

[Modified]I just noticed, nv3 posted a similar explanation. Is that the kind of explanation you are looking for?[/Modified]
Sergey Alexandrovich Kryukov 12-Jun-13 15:30pm    
Actually, viewing the assembly listing is itself a good point, OP might miss it, I don't know.

As to the essence of things, I tried to explain the fundamental difference in code without even looking at assembly listing, I think this is pretty obvious, please see.
—SA
Quote:
Will the compiler generate different machine code for those two blocks of code? I've been reading that it does not
Quite the opposite: it probably does, and you can check for yourself by looking at the machine code (or assembly) produced by the compiler.


Quote:
Also, typedef is only for internal purposes of the compiler and will not cause it to generate any machine code, correct?
Essentially correct: typedef only creates an alias for an existing type.
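A small sketch of the typedef point, using a hypothetical alias `Counter` and two made-up functions. Because `Counter` and `int` are the same type, the two functions have identical signatures as far as the compiler is concerned, and the typedef line itself is a declaration only.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical alias: introduces a second name for int, nothing more.
typedef int Counter;

// Compile-time check that the alias and the original are the same type.
static_assert(std::is_same<Counter, int>::value,
              "Counter is just another name for int");

int add_plain(int x, int y) { return x + y; }

Counter add_alias(Counter x, Counter y) { return x + y; }
```

Comparing the assembler output of `add_plain` and `add_alias` is an easy way to confirm this for a particular compiler.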
 
Comments
nv3 12-Jun-13 8:44am    
I fully agree. +5
It might be worth mentioning that as long as the memory location of the structure is known to the compiler at compile time, it can just produce a move instruction with the corresponding address. However, if the structure is referred to by a pointer, the compiler will generate code that first dereferences the pointer and then adds the offset of the structure member. Hence, in that case it will generate slightly slower code. The difference in speed is small, however. If someone argues for using separate variables instead of structures on these grounds, that is normally not a valid argument. I just mention it because the question sounded a little like that.
CPallini 12-Jun-13 10:51am    
Thank you and have my (virtual) 5.
zlogdan 12-Jun-13 12:15pm    
One also benefits from using structs while passing arguments to a function. My 5. Virtual 5 to nv3 too.
Sergey Alexandrovich Kryukov 12-Jun-13 12:59pm    
Correct about typedef, but as to the main part of the question: you could explain why the code is always different. I'm pretty sure you knew that. (I tried to explain it, please see. Do you agree?)
I voted 4 this time, OK?
—SA
CPallini 12-Jun-13 13:13pm    
It is OK, thank you.
About your answer:
(1) - I suppose (I hope at least) the OP knows the difference between declaration and actual instantiation.
(2) - You are correct about struct alignment, but, in my opinion, you should elaborate.
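nv3's comment above, about direct access versus access through a pointer, can be sketched roughly as follows. `Point`, `g_point` and the three functions are hypothetical names. `read_by_offset` spells out by hand what a member access through a pointer amounts to (base address plus member offset); it is for illustration only, not a recommended way to read a member.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Point {
    int x;
    int y;
};

Point g_point{10, 20};

// The object's address is known at link time, so the compiler can
// typically read the member with a single load from a fixed address.
int read_direct() {
    return g_point.y;
}

// Through a pointer the compiler needs the pointer value first and then
// reads from that address plus the offset of the member.
int read_via_pointer(const Point* p) {
    return p->y;
}

// The same access written out manually: base address + offsetof(Point, y).
int read_by_offset(const Point* p) {
    const char* base = reinterpret_cast<const char*>(p);
    int value;
    std::memcpy(&value, base + offsetof(Point, y), sizeof value);
    return value;
}
```

All three return the same value for the same object; any speed difference between the first two is usually small, as the comment says.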
In a release build, the compiler may break the struct up into individual variables.

This depends on the individual compiler and its optimization settings, such as loop unrolling.

Apple's LLVM-based compiler constructs extra loops with special register operations.

A good optimizer would check whether Var1 is assigned only once and then throw away everything that is not needed :-O
 
Comments
Richard MacCutchan 14-Jul-14 4:11am    
You are 13 months late!

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


