An x86 assembler with register allocation






4.84/5 (22 votes)
It is made for a machine code generator, but it can be used separately.
Introduction
This article is about an x86 assembler I made for my programming language project. It consists of some separated libraries. A register allocator is integrated into it that can help much when writing bigger x86 assembly or making a code emitter. Currently it only supports the general instruction set, but my plan is to extend it soon with SSE instruction sets.
Writing x86 assembly difficulties
If you know x86 well, you may skip this section. The first problem is that only one operand can contain a memory operand. It is because the instruction encoding only allows one.
mov eax, dword[esp + 8]
mov dword[edx] eax
And the problem is increased by the fixed amount of registers (eax
,
edx
, ..) which is normal.
Register allocation means to assign registers to variables. It's not easy, because
if you choose x variables to be in registers, the others are in memory, and can cause
a temporary register need, so now you have less than x registers.
The second problem is that there are a lot of instruction with exceptions, some only allow register operands as first operand, some only memory etc. It really looks like that instruction set is obsolete, it has been being developed for a long time. Intel has tried to replace it several times, but it was unsuccessful.
The third problem is pre-coloring, it means that some instruction doesn't allow
selecting the operand, it will use only given registers for inputs or outputs. For
example a 32 bit div instruction takes the dividend from eax
and
edx
, and outputs
the result to eax
and the remainder is stored in edx
.
mov eax, (dividend)
mov edx, 0 ; unsigned extension to 8 bytes
div (divisor)
mov (quotient), eax
mov (remainder), edx
The command line tool
It is implemented in the x86Assembler project in the download file:
Bird x86 assembler
Usage: x86Assembler -i<input> -o<output>
i, input Required. Input file to be assembled.
s, asm_out Assembly created by register allocation output file.
o, obj_out Required. The generated object file name.
arch (Default: x86_32) Architecture mode for the assembly.
help Display this help screen.
Syntax
Basics
I have used a Flat assembler before, so the syntax builds on it. The elements have to be inside a
section. To enable the register allocation and other transformations to the code you need
to place the function between function
and endfunction
:
section "code" code executable readable
_Fib:
function
save ebp, ebx, edi, esi
def x, general, 4
def _1, general, 4
def _2, general, 4
fenter
mov x, eax
cmp x, 2
jge _3
mov eax, 1
fleave
_3:
mov _1, x
dec _1
mov eax, _1
read eax
call _Fib
write eax, edx, ecx
mov _2, eax
mov _1, x
sub _1, 2
mov eax, _1
read eax
call Fib
write eax, edx, ecx
add eax, _2
fleave
endfunction
The function starts with some definitions. It needs to know which registers need to be
saved by the fenter
and fleave
instructions. The def
creates a symbol that
can be used for data storage. First parameter is name, second type, third is size. The
general
means that it will be a general register (or a stack location).
The assembler does live range analysis, but some it needs some additional information for
call instructions. read
marks that the value will be used, write
means that the original value has been overwritten.
The previous code will be transformed to this:
_Fib:
function
push ebp
push ebx
mov ebp, eax
cmp ebp, 2
jge _3
mov eax, 1
pop ebx
pop ebp
ret
_3:
mov eax, ebp
dec eax
call _Fib
mov ebx, eax
mov eax, ebp
sub eax, 2
call _Fib
add eax, ebx
pop ebx
pop ebp
ret
endfunction
This is how exporting and importing symbols work, these definitions need to be placed in the beginning of the assembly file:
public _Func2
public _Fib
extern _Func
section "code" code executable readable
Using stack
A function has two type of stack space. The space for the locals and the parameters.
locals_size
and parameters_size
can specify their size.
For user defined symbols you don't need to create local stack, when a symbol doesn't fits
in registers, it'll automatically increase the local stack size to create a place for it.
If the function uses a calling convention that needs to clean the stack, the
callee_cleanup
keyword can be used.
The stack can be accessed by register aliases. It is needed, because sometimes it can
refer to different registers. eps
is for parameter stack, els
refers to the local stack. The e character means that it is 32 bit register. ps
would be 16 bit, rps
64 bit.
_Function:
function
locals_size 8
parameters_size 4
callee_cleanup
...
mov eax, dword[eps]
mov dword[els + 4]
...
endfunction
You may have seen that compilers copies esp
to ebp
. By default
my assembler will do this if there is a local stack. In my
experience it can boost performance.
There are some keywords that can be used to overwrite this:
frame_pointer_force
, frame_pointer_disable
,
frame_pointer_default
.
Stack alignment
Sometimes you need to store aligned data on the stack, e.g. SSE instructions requires 16 byte alignment:
_Func:
function
def x, general, 4
stack_align 16
parameters_size 4
fenter
mov x, dword[eps]
add x, 34
fleave
endfunction
After extecuting the assembler:
_Func:
function
push ebx
mov ebx, esp
and esp, 4294967280
mov eax, dword[ebx + 8]
add eax, 34
mov esp, ebx
pop ebx
ret
endfunction
The alignment is done by the and
instruction on the stack pointer. It
decreases the pointer, so it allocates some extra byte. The problem is that it is unknown
what will be the offset to parameters and the caller function stack. Thus it needs
to save that to the parameter pointer (I just call it that). It is always the
ebx
register. As well as for the frame pointer it can be forced or disabled:
parameter_pointer_force
, parameter_pointer_disable
,
parameter_pointer_default
.
Using the code
Encoding an instruction
The project Bird.Architecture.x86
handles the instruction encoding task.
I learnt machine code at
http://www.c-jump.com/CIS77/CPU/x86/lecture.html.
I will encode the add ax, cx
instruction. Firstly a context
should be created:
var context = new x86Context(ArchitectureMode.X86_32);
Next I create InstructionToEncode
structure. It selects the instruction
and operands. Mnemonic
, ConditionTest
and Register
are enum types. ConditionTest
can be only have a value if the
instruction is conditional (Mnemonic.Jc
, Mnemonic.Setcc
, etc.).
var instructionToEncode =
new InstructionToEncode(
Mnemonic.Add,
ConditionTest.Unknown,
new Operand[]
{
Operand.Register(Register.AX),
Operand.Register(Register.CX),
});
And I call the x86CodeGeneration.Encode
function:
EncodedInstruction encodedInstruction;
EncodingError error = x86CodeGeneration.Encode(
context,
instructionToEncode,
out encodedInstruction);
if (error != EncodingError.Succeeded)
throw new Exception("Failed to encode instruction");
It created a EncodedInstruction
structure which contains prefixes,
memory operand bytes, etc:
[Flags]
public enum EncodedInstructionPrefixes
{
AddressSizePrefix = 1,
OperandSizePrefix = 2
}
public struct EncodedInstruction
{
public EncodedInstructionPrefixes Prefixes { get; set; }
public REXPrefix? REXPrefix { get; set; }
public InstructionOpcode Opcode { get; set; }
public ModRegRMByte? ModRegRMByte { get; set; }
public SIBByte? SIBByte { get; set; }
public VariableSizeSigned? Displacement { get; set; }
public VariableSizeSigned? Immediate { get; set; }
}
By calling ToByteArray
method, it can be converted to machine code bytes:
var bytes = encodedInstruction.ToByteArray();
foreach (var b in bytes)
Console.WriteLine(b);
The x86Assembler class
The Operand.Immediate
only allows a number value, labels are not allowed.
That's because in machine code the assembler needs to emit relative or direct addresses.
Each instruction has an address, direct means the destionation address, relative means
the difference between destination and current address.
It needs to know all the instruction sizes, and the instruction can grow bigger as the
relative addresses are farther. The solution is to use multiple passes.
mov ecx, 0
_label:
inc ecx
cmp ecx, 10
jl _label
This is a small for loop assembly code without anything in it. I've created a context,
assembler and a symbol. There are 3 types of symbol: Local symbols only visible
in the current assembly, public symbols are exported so other obj
files
can use it, external will be used from other file.
var context = new x86Context(ArchitectureMode.X86_32);
var assembler = new x86Assembler(context);
var label = new x86AssemblerSymbol("_label", CodeObjectSymbolType.Local);
The x86Assembler
class functions to make members, I haven't made the
data structure public. It uses LabelledOperand
structure for operand,
the only difference is that it allows labels too.
assembler.BeginCodeSection();
assembler.AddInstruction(
Mnemonic.Mov,
ConditionTest.Unknown,
new LabelledOperand[]
{
LabelledOperand.Register(Register.ECX),
LabelledOperand.Immediate(0),
});
assembler.AddLabel(label);
assembler.AddInstruction(
Mnemonic.Inc,
ConditionTest.Unknown,
new LabelledOperand[]
{
LabelledOperand.Register(Register.ECX),
});
assembler.AddInstruction(
Mnemonic.Cmp,
ConditionTest.Unknown,
new LabelledOperand[]
{
LabelledOperand.Register(Register.ECX),
LabelledOperand.Immediate(10),
});
assembler.AddInstruction(
Mnemonic.Jcc,
ConditionTest.Less,
new LabelledOperand[]
{
LabelledOperand.Immediate(label),
});
assembler.EndSection();
Next I'll create a CodeObject
that contains the assembly binary.
It can be saved to file by CoffObjectSaver
class. I use
http://onlinedisassembler.com/ to
verify that the result is correct.
CodeObject codeObject;
var errorList = assembler.CreateCodeObject(out codeObject);
if (errorList.Count > 0)
throw new Exception("Failed to create code object");
CoffObjectSaver.Instance.Save("test.obj", codeObject);
The next layer
Bird.Architecture.x86.Assembly
allows register allocation, parsing etc.
It has a tree that contains an immutable class for every element in the assembly. I'll show
how to add two numbers that are stored in undefined storages. UndefinedStorage
,
ConstantImmediate
, RegisterStorage
, etc classes inherits the
DataStorage
class. TrackedString
structure is used for the parser,
its purpose is to store the offset to the original string while getting substrings. So
here I created the instructions:
var context = new x86Context(ArchitectureMode.X86_32);
// parameters: name, type, size, alignment, priority
var undefinedA = new UndefinedSymbol("a", RegisterType.General, 4, 4, 0);
var undefinedB = new UndefinedSymbol("b", RegisterType.General, 4, 4, 0);
var elements = new AssemblyElement[]
{
new AssemblyInstruction(
Mnemonic.Mov,
ConditionTest.Unknown,
ImmutableArray.Create<DataStorage>(
new UndefinedStorage(undefinedA, TrackedString.Invalid),
new ConstantImmediate(2, TrackedString.Invalid)),
TrackedString.Invalid),
new AssemblyInstruction(
Mnemonic.Mov,
ConditionTest.Unknown,
ImmutableArray.Create<DataStorage>(
new UndefinedStorage(undefinedB, TrackedString.Invalid),
new ConstantImmediate(3, TrackedString.Invalid)),
TrackedString.Invalid),
new AssemblyInstruction(
Mnemonic.Add,
ConditionTest.Unknown,
ImmutableArray.Create<DataStorage>(
new UndefinedStorage(undefinedA, TrackedString.Invalid),
new UndefinedStorage(undefinedB, TrackedString.Invalid)),
TrackedString.Invalid),
};
I use ImmutableArray
because System.Collections.Immutable
only
has an ImmutableList
class and it hasn't got as good indexer performance.
Because of design issues they removed ImmutableArray
for now.
Here is how to construct an AssemblyFunction
. It has elements and local
symbols. The FunctionProperties
class can be used to specify the saved
registers, local stack size etc. I just leave it empty now.
var function = new AssemblyFunction(
ImmutableArray.CreateRange(elements),
ImmutableArray.Create<AssemblySymbol>(undefinedA, undefinedB),
new FunctionProperties(),
TrackedString.Invalid);
Next I create the section. It has the same
CodeObjectSectionProperties
structure as created by
x86Assembler.BeginCodeSection
.
var section = new AssemblySection(
ImmutableArray.Create<AssemblyElement>(function),
"code",
new CodeObjectSectionProperties(
CodeObjectSectionType.Code,
CodeObjectSectionFlags.Readable |
CodeObjectSectionFlags.Writeable),
TrackedString.Invalid);
The AssemblyTree
class holds all sections and symbols that
are outside of the function.
var tree = new AssemblyTree(
ImmutableArray.Create<AssemblySection>(section),
ImmutableArray.Create<AssemblySymbol>(),
TrackedString.Invalid);
And some transformations should be run and the CodeObject
can be created
similarly as before:
var error = AssemblyActions.DoDefaultTransform(context, ref tree);
if (!error.IsSucceeded)
throw new Exception("Failed to do default transformation");
CodeObject codeObject;
var errorList = AssemblyActions.CreateCodeObject(
context, tree, out codeObject);
if (errorList.Count > 0)
throw new Exception("Failed to create code object");
CoffObjectSaver.Instance.Save("test2.obj", codeObject);