See more: C++ C compiler
Hello,
I have a general question regarding how a C/C++ compiler will handle generating machine code with respect to a struct.
 
Let's compare the generated code for:
int Var1;
int Var2;
bool Var3;
 
cout << Var1 << endl;
 
vs.
 
struct MyStruct
{
int Var1;
int Var2;
bool Var3;
};
MyStruct my;
cout << my.Var1;
 
Will the compiler generate different machine code for those two blocks of code? I've been reading that it does not, but have never seen a clear-cut yes-or-no answer on it.
 
Also, typedef is only for internal purposes of the compiler and will not cause it to generate any machine code, correct?
 
Thank you!
Posted 11-Jun-13 14:57pm
p4p4p4568

Solution 1

The two pieces of code are fundamentally different, and the difference is quite clear.
 
Your three variables (declared outside any function, as I assume) statically occupy three slots in memory once the code is loaded. The declaration of the struct itself occupies none; only an instance of MyStruct gets memory. The instance does not necessarily occupy exactly the same amount of memory as the three variables, because that depends on the memory layout, which can be a compiler option and depends on the compiler implementation. The code that initializes the instance of MyStruct also depends on context. Usually, code like yours is written inside a function, which means the instance is stored on the stack. The stack frame is popped on return, so the memory occupied by the structure instance can be reused by another stack frame.
 
In contrast, the individual variables are initialized in the memory allocated at the very beginning for static data. You could simulate this more closely if your struct had static members. How similar the two cases end up being depends, again, on the particular compiler and its options: structure layout, alignment in memory, things like that.
 
—SA
Comments
Ron Beyer at 12-Jun-13 8:05am
   
Yup, +5
Sergey Alexandrovich Kryukov at 12-Jun-13 12:54pm
   
Thank you, Ron.
—SA
CPallini at 12-Jun-13 13:17pm
   
My 5.
Sergey Alexandrovich Kryukov at 12-Jun-13 14:44pm
   
Thank you, Carlo.
—SA

Solution 2

Try checking the assembly output for yourself. Going through the assembly generated by a compiler can be quite interesting.
 
For example, in gcc you can use the -S command line option.
With Visual Studio, go to project properties->Configuration Properties->C/C++->Output Files and set the 'Assembler output' option.
 
For instance, for code like:
int a;
int b;
 
struct c
{
	int a;
	int b;
};
 
.
.
.
 
struct c cs;
a = 1;
b = 2;
cs.a = 3;
cs.b = 4;
 
My Visual Studio 2008 Express generated the following asm code:
 
; 18   : 	struct c cs;
; 19   : 	a = 1;
	mov	DWORD PTR ?a@@3HA, 1			; a
; 20   : 	b = 2;
	mov	DWORD PTR ?b@@3HA, 2			; b
; 21   : 	cs.a = 3;
	mov	DWORD PTR _cs$[ebp], 3
; 22   : 	cs.b = 4;
	mov	DWORD PTR _cs$[ebp+4], 4
 
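One obvious difference we can notice is that the addressing mode for variables 'a' and 'b' is different from the one used for the struct variable 'cs': the globals are addressed directly by their symbol, while the members of the local 'cs' are addressed relative to the frame pointer (ebp), at offsets 0 and 4.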
Comments
Sergey Alexandrovich Kryukov at 12-Jun-13 12:54pm
   
No explanation of why it is so. It looks like you mechanically pasted disassembled code, so what?
—SA
parths at 12-Jun-13 13:53pm
   
I wanted to point out that the assembly listings generated by compilers can actually be viewed.
I learned what little I know about assembly by looking at compiler listings and checking Google (also a lot by reading Art of Assembly).
Should I have given reference links?
 
Exactly what kind of explanation would you expect? I'll try to improve my answer.
 
The behaviour is compiler dependent and theoretically the compiler can actually implement both cases in a similar fashion, right?
For instance, _cs$[ebp] seems to indicate the compiler is using an indexed addressing mode (similar to arrays), but it could just as well have computed the address, stored it in a register and used register indirect addressing. It's not optimal, but theoretically correct, right?
 
[Modified]I just noticed, nv3 posted a similar explanation. Is that the kind of explanation you are looking for?[/Modified]
Sergey Alexandrovich Kryukov at 12-Jun-13 15:30pm
   
Actually, viewing the assembly listing is itself a good point; OP might have missed it, I don't know.
 
As to the essence of things, I tried to explain the fundamental difference in the code without even looking at an assembly listing. I think it is pretty obvious; please see my answer.
—SA

Solution 3

Quote:
Will the compiler generate different machine code for those two blocks of code? I've been reading that it does not
Quite the opposite: it probably does, and you can check for yourself by looking at the machine code (or assembly) produced by the compiler.
 

Quote:
Also, typedef is only for internal purposes of the compiler and will not cause it to generate any machine code, correct?
Pretty much correct: typedef creates an alias for an existing type, not a new type.
Comments
nv3 at 12-Jun-13 8:44am
   
I fully agree. +5
It might be worth mentioning that as long as the memory location of the structure is known to the compiler at compile time, it can just produce a move instruction with the corresponding address. However, if the structure is referred to by a pointer, the compiler will generate code to first dereference the pointer and then add the offset of the structure member. Hence, in that case it will generate slightly slower code. The difference in speed is, however, small. If someone argues for not using structures but separate variables instead, this is normally not a valid argument. I just mention that because the question sounded a little like that.
CPallini at 12-Jun-13 10:51am
   
Thank you and have my (virtual) 5.
zlogdan at 12-Jun-13 12:15pm
   
One also benefits from using structs while passing arguments to a function. My 5. Virtual 5 to nv3 too.
Sergey Alexandrovich Kryukov at 12-Jun-13 12:59pm
   
Correct about typedef, but as to the main part of the question: you could explain why the code is always different. I'm pretty sure you knew that. (I tried to explain it, please see. Do you agree?)
I voted 4 this time, OK?
—SA
CPallini at 12-Jun-13 13:13pm
   
It is OK, thank you.
About your answer:
(1) - I suppose (I hope at least) the OP knows the difference between declaration and actual instantiation.
(2) - You are correct about struct alignment, but, in my opinion, you should elaborate.
Sergey Alexandrovich Kryukov at 12-Jun-13 15:37pm
   
Thank you, Carlo.
Good points. I hope so, too. If not — our comments may give OP the idea to ask a follow-up question for clarification of this matter.
As to struct alignment, I did not elaborate only because I really consider it of secondary importance (though worth mentioning) compared to declaration vs. instantiation, static vs. instance, and stack storage vs. static memory. I think I can answer if OP expresses some interest in finer detail...
—SA

Solution 4

In a release build the compiler may break the struct up into separate variables.
 
Whether it does depends on the individual compiler and its optimization settings, such as loop unrolling.
 
Apple's LLVM constructs extra loops with special register operations.
 
A good optimizer would check whether Var1 is only assigned once and then throw all the unneeded stuff away.
Comments
Richard MacCutchan at 14-Jul-14 4:11am
   
You are 13 months late!

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 14 Jul 2014
Copyright © CodeProject, 1999-2014