See more: C++ C
Hi,
 
We can write the main function in several ways (of course, there can be only one main in a program):
1. int main()
2. int main(int argc, char *argv[])
3. int main(int argc, char *argv[], char *envp[])
 
How does the CRT startup code know which form of main it should call? Please note that I am not asking about the Unicode variants.
Posted 29-Apr-12 2:55am
Edited 29-Apr-12 3:12am
Comments
SAKryukov at 29-Apr-12 9:04am
   
Do you know a way to write more than one?
--SA
Pranit Kothari at 29-Apr-12 9:07am
   
No, there can be only one main in program.
SAKryukov at 29-Apr-12 9:08am
   
Come to think about it, this question is not clear.
 
Do you mean:
1) The executable module may have more than one function with the name main and a valid signature. Which one will be called, and how is this determined?
 
2) There is only one main, with one of the valid signatures. How does the runtime determine which parameters to pass, given that the signature is not known to it?
 
--SA
Pranit Kothari at 29-Apr-12 9:12am
   
Question edited
SAKryukov at 29-Apr-12 9:31am
   
Thank you. In fact, the question is pretty deep, so I voted 5 for it, and the answer is not very simple -- I tried to explain it, please see.
--SA
Pranit Kothari at 29-Apr-12 9:32am
   
Thanks SA.

Solution 1

I don't know how specific compilers do it but GCC and VC++ come with the source code to their startup routines if you want a look.
 
The trick I used a while ago is for the startup code not to worry about how main has been written. It just bungs two words (for a standard compiler) on the stack and calls the symbol named main. The two words are the same regardless of which form of the function the programmer has written. This doesn't require a lot of collusion with the compiler, as most C compilers rely on the calling function to clean up the stack afterwards, while other languages rely on the exit code of the called function to do it.
 
Cheers,
 
Ash
 
Just for SA:
 
The assembly language sequence for calling main on most compilers is this (assuming x86 and a compiler that doesn't add vendor specific parameters to the end of main):
push argp
push argc
call main
add  esp, 8
main returns just using a ret instruction.
 
As I said above, bung two words on the stack, call main (like I didn't edit what I said earlier from jump to main, ahem) and then rely on the calling function to clean up the stack.
 
While I'm on the subject there's nothing in the C standard that says a compiler couldn't either stick the words the other way around, AND/OR rely on main to clean up the stack. So there's no reason why a compiler couldn't use a Pascal or Fortran style calling convention (MS __stdcall parameters without the name decoration, the @8 or @12 on some compilers):
push argc
push argp
call main
and rely on main to do a ret 8.
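
The callee-cleanup variant above can be modelled in portable C. This is only a toy sketch: the "stack" is an ordinary array, there is no return address on it, and every name in it (stack_mem, main_pops_two, main_pops_three, imbalance_after_call) is made up for illustration. It shows that a main which pops its own arguments only leaves the stack balanced if it pops exactly the two words the startup code pushed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a downward-growing stack of machine words (not a real ABI). */
enum { STACK_WORDS = 8 };
static uintptr_t stack_mem[STACK_WORDS];
static size_t sp = STACK_WORDS;          /* index of the current top word */

static void push(uintptr_t word) {
    assert(sp > 0);                      /* the toy stack must not overflow */
    stack_mem[--sp] = word;
}

/* Hypothetical Pascal-style main: argc was pushed first, argp last.
   It pops both words itself, like "ret 8". */
static int main_pops_two(void) {
    int argc = (int)stack_mem[sp + 1];
    sp += 2;                             /* callee cleanup of two words */
    return argc;
}

/* Hypothetical main compiled for three parameters: it pops three words,
   like "ret 12", although the startup code below pushes only two. */
static int main_pops_three(void) {
    sp += 3;                             /* callee cleanup of three words */
    return 0;
}

/* Startup that always pushes exactly two words and does no cleanup.
   Returns how many words are left over (0 means balanced). */
int imbalance_after_call(int (*entry)(void), int argc, char **argp) {
    sp = STACK_WORDS;                    /* start from an empty stack */
    push((uintptr_t)argc);
    push((uintptr_t)argp);
    (void)entry();
    int imbalance = (int)STACK_WORDS - (int)sp;
    sp = STACK_WORDS;                    /* reset the model for the next run */
    return imbalance;
}

int demo_matching_cleanup(void) {
    static char *args[] = { "prog", "one", NULL };
    return imbalance_after_call(main_pops_two, 2, args);
}

int demo_mismatched_cleanup(void) {
    static char *args[] = { "prog", "one", NULL };
    return imbalance_after_call(main_pops_three, 2, args);
}
```

In this model demo_matching_cleanup() reports 0 words left over, while demo_mismatched_cleanup() reports -1, i.e. the callee removed one word more than was pushed. On real hardware that kind of mismatch corrupts the caller's stack, which is why a callee-cleanup scheme only works when startup code and main agree on a fixed argument count.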
Comments
Pranit Kothari at 29-Apr-12 9:15am
   
Thanks. 5!
SAKryukov at 29-Apr-12 9:32am
   
I don't think this is the answer. Please see mine; it explains the essence of it. This is specific to C, C++ and the parameter passing convention.
--SA

Solution 2

Thank you for clarifying the question at my request.
 
Yes, if you try to create more than one, the development environment should give you an error like "overloading of main is not allowed" or something similar.
 
So, how does the run-time system know what signature to use when calling the function? In the general case, no reflection is available (unlike in .NET), so how does it determine what parameters to pass?
 
The short answer: it is possible thanks to __cdecl. The caller passes the maximum number of parameters in every case and always expects a return value.
 
First, note that all the signatures are compatible: each more "advanced" signature only adds parameters at the end:
 
int main(void)
int main(int argc, char **argv)
int main(int argc, char *argv[]) //same as before
int main(int argc, char **argv, char **envp)
 
(http://en.wikipedia.org/wiki/Main_function#C_and_C.2B.2B)
 
The trick is in the argument passing convention, which for main is always __cdecl. Among the standard VC++ x86 conventions, this is the one where the stack cleanup is done by the caller, not the callee:
http://msdn.microsoft.com/en-us/library/984x0h58%28v=vs.71%29.aspx
 
In this case, the caller can pass all the parameters of the longest signature and always expect an int return value. As the return value comes back in a CPU register, nothing is disrupted if it is not used: it is simply ignored. All the parameters are available on the stack, so the called function uses only as many as its signature declares; the rest are ignored. As the stack is cleaned up by the caller (the run-time system), the caller does not need to know the actual signature of the callee: all parameters are passed and, accordingly, all are removed by the caller.
 
Try specifying the calling convention explicitly:
int __cdecl main(int argc, char **argv)
It will compile and link. Try __stdcall or any other convention and the linker will fail. You can only use __cdecl. It is designed this way for the reasons I explained above.
 
—SA
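
The caller-cleanup idea described above can be sketched in portable C. This is a toy model, not what any real CRT does: the "stack" is a plain array, and all names (stack_mem, start_caller_cleanup, main_no_params and so on) are invented for illustration. The startup always pushes envp, argv and argc, each stand-in for main reads only the words it "declares", and the startup restores the stack itself without knowing which form it called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a downward-growing stack of machine words (not a real ABI). */
enum { STACK_WORDS = 8 };
static uintptr_t stack_mem[STACK_WORDS];
static size_t sp = STACK_WORDS;          /* index of the current top word */

static void push(uintptr_t word) {
    assert(sp > 0);                      /* the toy stack must not overflow */
    stack_mem[--sp] = word;
}

/* Stand-ins for the three forms of main. Each looks only at the words
   its own signature would declare and never adjusts sp. */
static int main_no_params(void) {
    return 0;
}

static int main_two_params(void) {
    int argc = (int)stack_mem[sp];              /* first parameter */
    char **argv = (char **)stack_mem[sp + 1];   /* second parameter */
    return (argv[argc] == NULL) ? argc : -1;
}

static int main_three_params(void) {
    int argc = (int)stack_mem[sp];
    char **envp = (char **)stack_mem[sp + 2];   /* third parameter */
    return (envp != NULL) ? argc + 100 : -1;
}

/* Startup: always push the longest parameter list, right to left,
   call the entry point, then clean up (like "add esp, 12"). */
int start_caller_cleanup(int (*entry)(void), int argc, char **argv, char **envp) {
    size_t saved = sp;
    push((uintptr_t)envp);
    push((uintptr_t)argv);
    push((uintptr_t)argc);
    int rc = entry();
    sp = saved;                          /* caller cleanup */
    return rc;
}

/* Returns 1 if all three forms ran correctly and the stack stayed balanced. */
int demo_caller_cleanup(void) {
    static char *argv[] = { "prog", "one", NULL };
    static char *envp[] = { "HOME=/tmp", NULL };
    int ok = start_caller_cleanup(main_no_params, 2, argv, envp) == 0
          && start_caller_cleanup(main_two_params, 2, argv, envp) == 2
          && start_caller_cleanup(main_three_params, 2, argv, envp) == 102;
    return ok && sp == STACK_WORDS;
}
```

Because cleanup lives in start_caller_cleanup, the same startup works for all three stand-ins; none of them needs to know how many words were pushed.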
Comments
Pranit Kothari at 29-Apr-12 9:35am
   
Thanks SA for such comprehensive answer(as expected). 5!
SAKryukov at 29-Apr-12 16:47pm
   
You are very welcome, Pranit.
--SA
pwasser at 29-Apr-12 10:13am
   
I like this explanation.
SAKryukov at 29-Apr-12 16:47pm
   
Thank you.
--SA
Espen Harlinn at 29-Apr-12 17:18pm
   
Good answer Sergey :-D
SAKryukov at 29-Apr-12 17:20pm
   
Thank you, Espen.
--SA
Aescleal at 29-Apr-12 18:08pm
   
The linker fails when you specify main as __stdcall because MS compilers decorate the name of a __stdcall function with the number of bytes cleaned off the stack when the function returns. There's no reason why you couldn't have a compiler using a __stdcall style (as in parameter passing method, not name decoration). You can see this by using a binary editor to change the symbol for a stdcall main (from _main@12 to _main) or by using either GCC or Digital Mars that support aliasing through #pragmas. Your program will then link but crash when it returns from main or (probably) when you use the argv parameter.
 
Anyway you hit the salient points I was aiming for so have a 5!
 
Quick Edit: I've just knocked up two minimal startups for VC++, one expecting main as __cdecl and the other expecting main as __stdcall, and they both linked and worked. That was slightly surprising, as it wouldn't have surprised me if the compiler had done something a bit special with main. Then again, I was using the C compiler, which doesn't have things like implicit return 0 the way the C++ compiler does.
SAKryukov at 29-Apr-12 18:55pm
   
The main reason why __stdcall cannot be used is still the need for the caller to push the parameters and pop them, to allow for a variable number of parameters. In C, this kind of decoration is not used, but the problem is exactly the same.
How did you make __stdcall work? How about some detail? For me, the linker failed. I was sure __cdecl would work because I knew about the role of this calling convention in variable-length parameter lists. To me, failure to link with anything else is the expected, correct behavior; I don't care how it could be implemented.
Anyway, thank you very much.
--SA
Aescleal at 30-Apr-12 8:26am
   
To get the __stdcall version of main to link I used an assembler to create an ALIAS for _main@12 (in the VC++ case) to _main. That worked without mangling the run time but crashed if you accessed envp, argv was correct as it was the middle one of three arguments.
 
To get it working rather than crashing I modified the (actually wrote a small lump of) startup code to push the parameters in reverse order and then call _main@12 rather than mess about with an ALIAS for _main.
 
So as I said, it's the name mangling that makes the linker fail. Get rid of that with an ALIAS and it'll link, and probably fail horribly (because as you've pointed out, the parameters are in the right order).
 
There is a way of getting __stdcall main() to not only link and run but work properly with VC++ out of the box (if you've got one with an assembler in the box). The way I did it was populate a table of three thunks, each of which stacked the correct number of parameters and then called the appropriate __stdcall main (either _main@4, _main@8, _main@12). Using the order I specified object files to the linker I could make sure that the programmer written main was entered into a thunk while the other two were populated by stub functions from another object file. The final bit is working out which one to run and (this is the really disgusting bit, I'm not proud) you have all the stub functions overlaying each other - i.e. having the same address. Then you can compare a function address in the entry and call it if it's not the stub.
 
The scary thing was that the linker did most of the heavy lifting, it was only 20 lines of assembly (including the table definitions).
 
Anyway, this was the sort of trick I've learned over the years programming devices and using C in non-hosted environments. You get used to this sort of disgusting hack - and I sure as hell wouldn't advocate using it in a hosted environment.
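
The table-of-thunks idea can be modelled in plain C. This is a rough sketch, not the 20 lines of assembly described above: the real trick relies on the linker filling the table and on the stubs sharing one address, whereas here the table is filled by hand and each slot has its own stub. All names (entry_table, run_entry, stub1 and so on) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One function-pointer type per form of main (_main@4, _main@8, _main@12). */
typedef int (*main1_fn)(int);
typedef int (*main2_fn)(int, char **);
typedef int (*main3_fn)(int, char **, char **);

/* Stubs that fill the slots the programmer did not provide. */
static int stub1(int argc) { (void)argc; return -1; }
static int stub2(int argc, char **argv) { (void)argc; (void)argv; return -1; }
static int stub3(int argc, char **argv, char **envp) {
    (void)argc; (void)argv; (void)envp;
    return -1;
}

struct entry_table {
    main1_fn main1;
    main2_fn main2;
    main3_fn main3;
};

/* Startup: find the slot that is not a stub and call it with exactly
   the parameters that form expects. */
int run_entry(const struct entry_table *t, int argc, char **argv, char **envp) {
    assert(t != NULL);
    if (t->main3 != stub3) return t->main3(argc, argv, envp);
    if (t->main2 != stub2) return t->main2(argc, argv);
    if (t->main1 != stub1) return t->main1(argc);
    return -1;                           /* no user-written entry point */
}

/* The programmer-written entry point in this example takes two parameters. */
static int user_main(int argc, char **argv) {
    return (argv[argc] == NULL) ? argc : -1;
}

int demo_thunk_table(void) {
    static char *argv[] = { "prog", "one", NULL };
    static char *envp[] = { NULL };
    const struct entry_table table = { stub1, user_main, stub3 };
    return run_entry(&table, 2, argv, envp);
}
```

Here demo_thunk_table() ends up calling user_main through the two-parameter slot, so each form receives only the arguments it declares and a callee-cleanup convention would stay balanced.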
SAKryukov at 30-Apr-12 18:48pm
   
Yes, that I understand. Thank you for sharing the interesting detail. It could be a separate answer or a short article. (However, I do not know of any practical application of this information, even though it is good to know.)
--SA

Solution 3

The question is indeed a good one. If main were compiled as a regular C++ function, its name would be mangled and the runtime system would have no way of knowing which of the various signatures the user had provided.
 
Here is how VC++ does it: When looking at the linker map you see that the main function is listed as "_main", thus as a good old C function, without name mangling. That is how the runtime finds it. It just looks for a function called "main".
 
As for the parameters that are being passed, SA and Ash have already explained that part in depth. The runtime just pushes them on the stack or transfers them in registers and cleans up afterwards. If main just ignores them - fine.
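
As a small illustration of finding the entry point purely by name, here is a sketch in C. The names app_main and fake_startup are invented for the example; a real CRT does the same kind of thing with the symbol main (or _main in the VC++ map file). The "startup" side is written against one fixed prototype and knows nothing else about the user's function; the linker connects the two by symbol name alone.

```c
#include <assert.h>
#include <stddef.h>

/* Startup side: all it knows is an unmangled name and one fixed prototype. */
int app_main(int argc, char **argv);

int fake_startup(void) {
    static char *argv[] = { "prog", "one", NULL };
    int argc = 2;
    assert(argv[argc] == NULL);          /* argv is NULL-terminated */
    return app_main(argc, argv);         /* resolved by the linker, by name */
}

/* User side: this could live in a separate object file. */
int app_main(int argc, char **argv) {
    (void)argv;
    return argc - 1;                     /* number of real arguments */
}
```

In C++ an ordinary function with this name would be mangled according to its parameter types, so a startup written like this would need the function declared extern "C"; main itself is exempt from mangling, which is what the map file shows.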

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
