To use microcontrollers, it’s important to understand what happens to the code you write. Most beginners will write code in C and compile it using a provided toolchain.
Since the toolchain is often set up by the vendor, little knowledge is needed to build a program and run it. However, when things go wrong, an understanding of how the whole system
works is very helpful for debugging. Let's look at the components involved and see how they work.
The microcontroller is what you want to run your code on. It has a processor core and peripherals. Your code runs on the core, and it can use the peripherals
to do things like control digital inputs and outputs, send and receive serial data, and perform analogue-to-digital conversion. We’re just interested in the core
for now. My goal is to provide a (very) high-level overview of how this core works.
A processor core has a list of supported instructions. These are actions such as add, subtract, and compare. Each of these instructions is represented
by a specific binary value (a number) called an opcode.
Inside the processor core are registers, each of which can store a value. The instructions that the core executes read from and write to these registers.
For example, the instruction ADD R0, R1, R2 adds R1 to R2 and puts the result in R0 (in ARM assembly).
There are also special registers. One of these is the program counter (PC). This register stores the memory address of the instruction that is currently being executed.
In this simplified model, it starts at 0 and usually increments by 1 each time an instruction is executed; however, certain instructions can cause it to change. One of these is the jump instruction,
which sets the program counter to a specified value.
The processor has access to several sets of memory, depending on the design. The program memory is where all the instructions are stored. The processor reads an instruction
from the location in program memory specified by the PC, executes it, and increments the program counter.
Here are the important bits. The microcontroller’s processor core has memory that the program is loaded into. When the microcontroller is turned on, the PC is set to zero.
The core will load the instruction at address 0, execute it, and then increment the PC to 1. Then the core will load the instruction at 1, and so on. The value of the PC
can be changed by instructions such as jump. Real devices differ in the details (the reset address and the size of each increment vary), but the principle is the same.
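The fetch, execute, increment cycle described above can be sketched as a tiny simulator. This is only an illustration: the four opcodes, the `Instr` layout, and the names `Core`, `run`, and `demo_r0` are invented for this example and do not match the encoding of any real processor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy instruction set, invented for illustration only. */
enum { OP_HALT = 0, OP_LOADI = 1, OP_ADD = 2, OP_JUMP = 3 };

typedef struct {
    uint8_t op;      /* which action to perform */
    uint8_t a, b, c; /* operands: register numbers, a small constant, or an address */
} Instr;

typedef struct {
    uint32_t pc;     /* program counter */
    int32_t r[4];    /* general-purpose registers R0..R3 */
} Core;

/* Runs from address 0 until HALT, the end of the program, or max_steps
   instructions. Returns the number of instructions executed. */
int run(Core *core, const Instr *program, uint32_t length, int max_steps)
{
    int steps = 0;
    assert(core != NULL && program != NULL);
    core->pc = 0;                              /* reset: start at address 0 */
    while (steps < max_steps && core->pc < length) {
        Instr in = program[core->pc];          /* fetch the instruction at PC */
        core->pc += 1;                         /* default: move to the next one */
        steps++;
        switch (in.op) {                       /* execute */
        case OP_LOADI: core->r[in.a & 3] = in.b; break;
        case OP_ADD:   core->r[in.a & 3] = core->r[in.b & 3] + core->r[in.c & 3]; break;
        case OP_JUMP:  core->pc = in.a; break; /* overwrite the PC */
        default:       return steps;           /* HALT or unknown opcode */
        }
    }
    return steps;
}

/* Loads 2 and 3, jumps over one instruction, adds, and returns R0. */
int32_t demo_r0(void)
{
    static const Instr program[] = {
        { OP_LOADI, 1, 2, 0 }, /* 0: R1 = 2 */
        { OP_LOADI, 2, 3, 0 }, /* 1: R2 = 3 */
        { OP_JUMP,  4, 0, 0 }, /* 2: PC = 4, skipping address 3 */
        { OP_LOADI, 1, 9, 0 }, /* 3: never executed */
        { OP_ADD,   0, 1, 2 }, /* 4: R0 = R1 + R2 */
        { OP_HALT,  0, 0, 0 }, /* 5: stop */
    };
    Core core = { 0 };
    run(&core, program, 6, 100);
    return core.r[0];
}
```

Calling `demo_r0()` returns 5. The simulator bumps the PC right after the fetch so that a jump can simply overwrite it, which is one common way to model the behaviour. The instruction at address 3 is skipped because the jump changed the PC.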
The basic principle of the compiler is simple. It takes code that’s written in some language and converts it to assembly. The most commonly used language for microcontrollers is C.
C is a standard language, but assembly language varies depending on the microcontroller. This is because each assembly instruction corresponds to an opcode,
and the opcodes are different on different devices. Once you learn C, you can write code for any device with a C compiler, but to write assembly you must learn the assembly
instructions for each device.
The details of how a compiler works are beyond the scope of this article. The important part to remember is that the compiler turns C (or another language) into assembly.
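To see this step for yourself, it helps to start with a function small enough that the generated assembly is easy to read. The file name `add.c` and the function below are just an example. With GCC-based toolchains, the `-S` option stops after compilation and writes the assembly to a `.s` file; other vendors' compilers usually have an equivalent option, so check your toolchain's manual.

```c
/* add.c: a function small enough that its generated assembly is easy to read.
 *
 * With a GCC-based toolchain, something like
 *     gcc -S add.c
 * writes the assembly to add.s instead of producing a program. For a
 * microcontroller you would use the cross compiler from your toolchain
 * in place of plain gcc. */
int add(int a, int b)
{
    return a + b;
}
```

The assembly you get depends on the target device and the optimization settings, which is exactly the point made above: the same C source turns into different instructions on different devices.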
Now you have assembly code for your specific device. It’s still human readable, but very verbose. The processor cannot understand this directly, as it needs opcodes.
This is where the assembler takes over. It converts each instruction from readable text to a binary opcode that the processor core will understand. For example,
in PIC18 assembly, the instruction RETURN 0 will become the hex value 0x0012.
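At its simplest, this translation is a table lookup. The sketch below is a toy, not how a real assembler is written: it matches whole lines of text, and the names `Encoding`, `ENCODINGS`, and `assemble_line` are made up for this example. The RETURN 0 encoding comes from the text above; the NOP and SLEEP values are quoted from the PIC18 instruction set as I remember it, so check them against your device's datasheet.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    const char *text;   /* instruction exactly as written in the source */
    uint16_t opcode;    /* 16-bit value the core actually executes */
} Encoding;

/* A few PIC18 instructions that take no variable operands. */
static const Encoding ENCODINGS[] = {
    { "NOP",      0x0000 },
    { "SLEEP",    0x0003 },
    { "RETURN 0", 0x0012 },
};

/* Looks up one line of assembly. On success stores the opcode in
   *opcode_out and returns 1; returns 0 if the line is not in the table. */
int assemble_line(const char *line, uint16_t *opcode_out)
{
    size_t count = sizeof ENCODINGS / sizeof ENCODINGS[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(line, ENCODINGS[i].text) == 0) {
            *opcode_out = ENCODINGS[i].opcode;
            return 1;
        }
    }
    return 0;
}
```

A real assembler also has to parse operands and pack them into the opcode's bit fields, but the core idea is the same: text in, numbers out.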
When the assembler completes, you have a bunch of opcodes that represent your instructions. You’re almost done, but there’s one more critical step.
The linker is responsible for taking all of these opcodes and putting them in order. For example, it will put the first instruction of the program at address 0.
When writing C and assembly, you can call functions or jump to locations that you have named with a label. You do not need to know what memory address these locations
will eventually be located at. This is a very good thing, since otherwise you would need to know exactly how long each function is and calculate the offsets yourself. The linker does this for you.
It resolves your labels into actual addresses and places opcodes at the correct offsets.
Let's say you had this pseudo assembly code:
In this case, you can figure out where my_label goes in memory. If each opcode is one word (two bytes), my_label will start in the second word. You could write:
Now you update your code to look like this:
Now the location of my_label changes to the 5th word, and you would have to update the rest of your code with the new memory address. This would suck, so linkers are good.
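Here is a rough sketch of the bookkeeping the linker saves you from. It is a toy under stated assumptions: every instruction takes exactly one word, a label line takes no space, and the `Line` structure and function names are invented for this example. Real linkers work on object files with symbol tables and relocation entries, and the demo program is my own stand-in rather than the listing discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *label;  /* non-NULL: this line defines a label and emits no opcode */
    const char *target; /* non-NULL: this instruction refers to a label */
    int resolved;       /* filled in by resolve(): word address of target, or -1 */
} Line;

/* Word address of a label, counting one word per instruction. -1 if undefined. */
static int find_label(const Line *lines, size_t n, const char *name)
{
    int address = 0;
    for (size_t i = 0; i < n; i++) {
        if (lines[i].label != NULL) {
            if (strcmp(lines[i].label, name) == 0) return address;
        } else {
            address++;  /* only instructions occupy memory */
        }
    }
    return -1;
}

/* Replaces every label reference with an address. Returns the number of
   references that could not be resolved. */
int resolve(Line *lines, size_t n)
{
    int unresolved = 0;
    for (size_t i = 0; i < n; i++) {
        lines[i].resolved = -1;
        if (lines[i].target != NULL) {
            lines[i].resolved = find_label(lines, n, lines[i].target);
            if (lines[i].resolved < 0) unresolved++;
        }
    }
    return unresolved;
}

/* Builds "jump my_label", then `extra` plain instructions, then my_label
   itself, and returns the address the jump was resolved to. */
int demo_label_address(int extra)
{
    Line lines[16];
    size_t n = 0;
    assert(extra >= 0 && extra <= 12);
    lines[n++] = (Line){ NULL, "my_label", -1 };  /* jump my_label */
    for (int i = 0; i < extra; i++)
        lines[n++] = (Line){ NULL, NULL, -1 };    /* some other instruction */
    lines[n++] = (Line){ "my_label", NULL, -1 };  /* my_label: */
    lines[n++] = (Line){ NULL, NULL, -1 };        /* first instruction at my_label */
    resolve(lines, n);
    return lines[0].resolved;
}
```

`demo_label_address(0)` returns 1 (the second word), and `demo_label_address(3)` returns 4 (the fifth word). The jump never had to be edited by hand; the address was simply recomputed.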
You can also control the linker and tell it to put specific code at a specific location. Linker scripts allow you to specify where you would like code to go. Different linkers have varying syntaxes for the scripts, so you’ll have to read the manual.
Output and Programming
Once the linker is done, it will generate a file that maps each opcode to its address. This file is often converted to a format that the programming software can understand. The programmer is a hardware tool used to take the opcodes and load them into the device’s program memory. Some examples of programming tools are the USBasp for AVR microcontrollers and the PICkit 3 for PIC microcontrollers.
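To make "a format that the programming software can understand" a bit more concrete, here is a sketch that writes one record in Intel HEX, a text format many programming tools accept. Choosing Intel HEX is my assumption, as the format isn't named above, and the function names are invented. Each data record is a colon, a byte count, a 16-bit address, a record type (00 for data), the data bytes, and a checksum chosen so that all the bytes in the record sum to zero modulo 256.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Formats one Intel HEX data record (type 00) into `out`.
   Returns the number of characters written (not counting the NUL),
   or -1 if len is over 255 or the buffer is too small. */
int ihex_data_record(char *out, size_t out_size, uint16_t address,
                     const uint8_t *data, size_t len)
{
    size_t needed = 9 + 2 * len + 2 + 1;  /* ":LLAAAATT" + data + "CC" + NUL */
    if (len > 255 || out_size < needed) return -1;

    /* Running sum of every byte in the record; the type byte 00 adds nothing. */
    uint8_t sum = (uint8_t)(len + (address >> 8) + (address & 0xFF));
    int pos = snprintf(out, out_size, ":%02X%04X00",
                       (unsigned)len, (unsigned)address);
    for (size_t i = 0; i < len; i++) {
        pos += snprintf(out + pos, out_size - (size_t)pos, "%02X",
                        (unsigned)data[i]);
        sum = (uint8_t)(sum + data[i]);
    }
    /* Checksum: two's complement of the sum. */
    pos += snprintf(out + pos, out_size - (size_t)pos, "%02X",
                    (unsigned)(uint8_t)(0x100u - sum));
    return pos;
}

/* Encodes the opcode 0x0012 at address 0, assuming the low byte is stored
   first, and checks the result. Returns 1 if the record matches. */
int demo_record_matches(void)
{
    const uint8_t opcode_bytes[2] = { 0x12, 0x00 };
    char record[32];
    int n = ihex_data_record(record, sizeof record, 0x0000, opcode_bytes, 2);
    return n == 15 && strcmp(record, ":020000001200EC") == 0;
}
```

A complete file would contain one such line per chunk of program memory, followed by the end-of-file record `:00000001FF`. Your toolchain normally produces this file for you, so treat this as a way to read the file rather than something you need to write.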
Once the program is loaded into program memory, the device is reset and execution starts at memory address 0. Your application runs and all is well. If it isn’t, start looking at where things could have gone wrong. You probably made an error in your C code, but maybe the assembly code that the compiler generated isn’t what you expected, or maybe something is wrong with your linker settings.
Hopefully this explains a bit about how C gets from your editor to running in hardware. Feel free to leave any questions or comments.