Introduction
Inline assembly differs between VC++ and GCC. VC++ uses Intel syntax, while GCC
uses AT&T syntax by default. This article describes the differences between
AT&T and Intel assembly syntax.
Source and Destination Ordering
In AT&T syntax the source is always on the left and the destination is
always on the right, which is the opposite of Intel syntax.
| | AT&T | Intel |
|---|---|---|
| Move ebx to eax | movl %ebx, %eax | mov eax, ebx |
| Move 100 to ebx | movl $100, %ebx | mov ebx, 100 |
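As a minimal sketch of the AT&T operand order in practice, the helper below copies a value with a single movl. The name copy_via_asm is hypothetical and chosen only for illustration. The sketch assumes GCC or a compatible compiler targeting x86 in the default AT&T dialect, and it falls back to plain C on other targets. It uses GCC's extended asm placeholders (%0 for the output, %1 for the input) instead of hard-coded register names, so the compiler picks the registers.

```c
#include <stdint.h>

/* Hypothetical helper: copy src into dst with one AT&T-syntax movl. */
uint32_t copy_via_asm(uint32_t src)
{
    uint32_t dst;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* AT&T order: source operand (%1) first, destination (%0) second. */
    __asm__ ("movl %1, %0" : "=r"(dst) : "r"(src));
#else
    /* Assumption: on non-x86 targets, plain C stands in for the asm. */
    dst = src;
#endif
    return dst;
}
```

Calling copy_via_asm(100) returns 100. In Intel syntax the same instruction would list the destination first.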
Prefixes/Suffixes for register naming and Immediate Values
Register names are prefixed with "%" in AT&T syntax, while in Intel syntax they
are written as is. In AT&T syntax every immediate value is prefixed with "$",
while Intel syntax requires no prefix. In Intel syntax, hexadecimal and binary
immediates are suffixed with 'h' and 'b' respectively, and a hexadecimal value
whose first digit is a letter must also be prefixed with '0'.
In AT&T syntax, instructions are suffixed with b, w, or l, depending on
whether the operand is a byte (8 bits), a word (16 bits), or a long (32 bits).
The suffix is not mandatory when the assembler can infer the operand size from a
register operand, but supplying it is recommended: it improves readability and
removes any chance of a wrong guess. The Intel equivalents are byte ptr, word
ptr, and dword ptr, and they are needed only when referencing memory.
| | AT&T | Intel |
|---|---|---|
| Registers | %eax %ebx %ecx %edx | eax ebx ecx edx |
| Move ebx to eax | movl %ebx, %eax | mov eax, ebx |
| Move FF in ebx | movl $0xff, %ebx | mov ebx, 0ffh |
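The immediate-value rules can be sketched as two small C formatters. The function names format_intel_hex and format_att_hex are hypothetical and exist only for this illustration. The sketch assumes 32-bit immediates and lowercase hex digits.

```c
#include <ctype.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Intel style: hex digits with an 'h' suffix, plus a leading '0'
 * when the first hex digit is a letter (0xff becomes 0ffh). */
int format_intel_hex(uint32_t value, char *out, size_t out_size)
{
    char digits[9]; /* up to 8 hex digits plus the terminator */
    snprintf(digits, sizeof digits, "%" PRIx32, value);
    const char *lead = isalpha((unsigned char)digits[0]) ? "0" : "";
    return snprintf(out, out_size, "%s%sh", lead, digits);
}

/* AT&T style: '$' prefix followed by a C-style 0x literal. */
int format_att_hex(uint32_t value, char *out, size_t out_size)
{
    return snprintf(out, out_size, "$0x%" PRIx32, value);
}
```

For the value 0xff, format_intel_hex produces 0ffh and format_att_hex produces $0xff, matching the last row of the table.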
Accessing variables from inside Inline ASM
To access a global variable from inline assembly, use its assembler-level
symbol name. On targets where the C compiler prepends an underscore to symbol
names (32-bit Windows, for example), that means writing _x for a variable x in
both syntaxes; ELF targets such as Linux use the name unchanged. Suppose x is a
global or static variable. In Intel format, _x gives a pointer to the variable
while [_x] gives its value. In AT&T format, $_x gives the pointer while _x
gives the value. This works only for global variables. Local variables are
accessed relative to the stack pointer or the frame pointer. In GCC, extended
inline assembly can preload variables into registers and store the results of
your assembly operations back into variables.
(Extended ASM in GCC will be discussed in my next article.)
| | AT&T | Intel |
|---|---|---|
| Load value of x in eax | movl _x, %eax | mov eax, [_x] |
| Load pointer to x in ebx | movl $_x, %ebx | mov ebx, _x |
Accessing Local Variables
Local variables and function parameters are both allocated on the stack. In
inline asm you can access them either through the frame pointer register (ebp)
or directly through the stack pointer (esp).
Local variables lie at negative offsets from the frame pointer, while function
parameters lie at positive offsets. In unoptimized code, space for local
variables is typically reserved on the stack in the order in which they are
declared.
Parameters are pushed onto the stack from right to left and are referenced
relative to the base pointer (ebp) at four-byte intervals, beginning with a
displacement of 8. The first 8 bytes above ebp hold the saved ebp and the
return address.
Let's look at an example. Consider the following C function:
void doNothing(int a, int b)
{
    int y = 5;
    int z = 9;
    y += b;
    z += a;
    return;
}
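The compiler generates the following assembly code (without optimizations):
_doNothing:
    pushl %ebp              # save the caller's frame pointer
    movl %esp, %ebp         # set up the new frame
    subl $8, %esp           # reserve 8 bytes for y and z
    movl $5, -4(%ebp)       # y = 5
    movl $9, -8(%ebp)       # z = 9
    movl 12(%ebp), %edx     # edx = b
    leal -4(%ebp), %eax     # eax = &y
    addl %edx, (%eax)       # y += b
    movl 8(%ebp), %edx      # edx = a
    leal -8(%ebp), %eax     # eax = &z
    addl %edx, (%eax)       # z += a
    leave                   # restore esp and ebp
    ret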
The stack looks something like this (byte offsets relative to ebp):
| Contents | Byte offsets |
|---|---|
| z | -8 -7 -6 -5 |
| y | -4 -3 -2 -1 |
| saved ebp | 0 (ebp) 1 2 3 |
| return address | 4 5 6 7 |
| a | 8 9 10 11 |
| b | 12 13 14 15 |
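The offsets in this layout follow a simple pattern, sketched below in C. The names param_offset and local_offset are hypothetical. The sketch assumes the 32-bit layout of this example only: 4-byte slots, the saved ebp at 0(%ebp), the return address at 4(%ebp), and locals stored in declaration order.

```c
/* Assumed layout constants for the 32-bit example in this article. */
enum { SLOT_SIZE = 4, FIRST_PARAM_OFFSET = 8 };

/* Displacement from ebp of the n-th parameter (0 = leftmost). */
int param_offset(int index)
{
    return FIRST_PARAM_OFFSET + SLOT_SIZE * index;
}

/* Displacement from ebp of the n-th 4-byte local (0 = first declared).
 * Assumption: locals are laid out in declaration order, as in the listing. */
int local_offset(int index)
{
    return -SLOT_SIZE * (index + 1);
}
```

For doNothing, param_offset(0) and param_offset(1) give 8 and 12 for a and b, and local_offset(0) and local_offset(1) give -4 and -8 for y and z. An optimizing compiler is free to choose a different layout, so treat these values as a property of this listing rather than a guarantee.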
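The following assembly code is generated if we tell the compiler not to use ebp
as a frame pointer (-fomit-frame-pointer):
_doNothing:
    subl $8, %esp           # reserve 8 bytes for y and z
    movl $5, 4(%esp)        # y = 5
    movl $9, (%esp)         # z = 9
    movl 16(%esp), %edx     # edx = b
    leal 4(%esp), %eax      # eax = &y
    addl %edx, (%eax)       # y += b
    movl 12(%esp), %edx     # edx = a
    movl %esp, %eax         # eax = &z
    addl %edx, (%eax)       # z += a
    addl $8, %esp           # release the space for y and z
    ret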
In this case the stack looks something like this (byte offsets relative to esp):
| Contents | Byte offsets |
|---|---|
| z | 0 (esp) 1 2 3 |
| y | 4 5 6 7 |
| return address | 8 9 10 11 |
| a | 12 13 14 15 |
| b | 16 17 18 19 |
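The relationship between the two listings can be written as a small translation function. This is only a sketch for this example: esp_offset_from_ebp_offset is a hypothetical name, and the arithmetic assumes 4-byte slots, no saved ebp on the stack, and that esp does not move after the initial subl.

```c
/* Translate an ebp-relative displacement from the framed listing into the
 * esp-relative displacement used when the frame pointer is omitted.
 * locals_size is the number of bytes subtracted from esp (8 in the example).
 * Valid for locals (negative offsets) and for offsets of 4 or more. */
int esp_offset_from_ebp_offset(int ebp_offset, int locals_size)
{
    if (ebp_offset < 0) {
        /* Local variable: esp sits locals_size bytes below the frame base. */
        return ebp_offset + locals_size;
    }
    /* Return address or parameter: there is no 4-byte saved ebp slot,
     * so everything above the locals is 4 bytes closer. */
    return ebp_offset + locals_size - 4;
}
```

With locals_size set to 8, the offsets -4, -8, 8, and 12 map to 4, 0, 12, and 16, which are the displacements used for y, z, a, and b in the second listing.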
Referencing memory
The format for 32-bit addressing in Intel syntax is segreg:[base + index*scale
+ immed32], while in AT&T syntax it is segreg:immed32(base, index, scale). In
both cases the address is computed as immed32 + base + index*scale within the
segment selected by the segment register, and scale is 1, 2, 4, or 8. In 386
protected mode you can usually ignore the segment register and use eax, ebx,
ecx, edx, edi, and esi as six general-purpose registers. ebp can also be used as
a general-purpose register if the code is compiled with -fomit-frame-pointer.
| | AT&T | Intel |
|---|---|---|
| Addressing a global variable | _x | [_x] |
| Dereferencing a pointer in a register | (%eax) | [eax] |
| Array of integers | _array(,%eax,4) | [eax*4 + _array] |
| Addressing a variable offset by a value in a register | _x(%eax) | [eax + _x] |
| Addressing an immediate offset from an address held in eax | 1(%eax) | [eax + 1] |
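To make the address formula concrete, here is a minimal C sketch that computes immed32 + base + index*scale and prints the same operand in both notations. All three function names are hypothetical, the register names are passed as plain strings, and segment registers are left out.

```c
#include <stdint.h>
#include <stdio.h>

/* Effective address: disp + base + index*scale, with 32-bit wraparound. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp)
{
    return base + index * scale + (uint32_t)disp;
}

/* AT&T form: disp(%base,%index,scale) */
int format_att_operand(const char *base, const char *index, int scale,
                       int disp, char *out, size_t out_size)
{
    return snprintf(out, out_size, "%d(%%%s,%%%s,%d)",
                    disp, base, index, scale);
}

/* Intel form: [base + index*scale + disp], or "- disp" when negative. */
int format_intel_operand(const char *base, const char *index, int scale,
                         int disp, char *out, size_t out_size)
{
    /* Take the magnitude in unsigned arithmetic to avoid overflow. */
    unsigned magnitude = disp < 0 ? 0u - (unsigned)disp : (unsigned)disp;
    return snprintf(out, out_size, "[%s + %s*%d %c %u]",
                    base, index, scale, disp < 0 ? '-' : '+', magnitude);
}
```

With base eax, index ebx, scale 4, and displacement 8, the formatters give 8(%eax,%ebx,4) and [eax + ebx*4 + 8]. If eax holds 0x1000 and ebx holds 3, effective_address returns 0x1014.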