Makeshift co-routines in C

Elias Bachaalany

Rate me:

1.17/5 (14 votes)

28 Jun 2007CPOL6 min read

32K

121

How threads work, and how to create threads without using any API.

Download source - 2.49 KB

Introduction

The original title of this article was "Threads without threads". In effect, I should have called it "makeshift co-routines in C" all along!

The idea of this article is to demonstrate how to make functions that can return at any point and then resume back. If you used Python or C# before, the mechanism described in the article is like the "yield" keyword.

Because a method can return intermediate result and resume back, we can actually call two or more methods in sequence thus achieving what co-routines would achieve.

Background about threads

In the Windows OS, threads are basic execution units, and an execution unit is a set of instructions that run in a certain order (or flow). A process is a host of one or more threads that are executing (seemingly) in parallel. The reason I say seemingly is because if you have a single processor machine, then only one instruction can run at a time; however, if we run an instruction belonging to a thread at time T, then another belonging to another thread at time T+1, and when done so fast, it will appear as if both threads are running at the same time.

Consider the case of a single threaded process:

C++

int main()
{
    char name[100];
    printf("Enter your name:");
    gets(name)
    printf("You entered: %s\n", name);
    return 0;
}

This program will get executed from the first instruction to the last instruction (since there are no branching) without interruptions since it is a single threaded process.

Now say, you have two methods that you want to run in "parallel"; for this, you need to create a thread for each method, and here is what generally happens:

Thread #1	Thread #2
C++ int method1() { printf("method1: instruction 1"); printf("method1: instruction 2"); printf("method1: instruction 3"); printf("method1: instruction 4"); return 0; }	C++ int method2() { printf("method2: instruction 1"); printf("method2: instruction 2"); printf("method2: instruction 3"); printf("method2: instruction 4"); printf("method2: instruction 5"); printf("method2: instruction 6"); return 0; }

Thread #1

Thread #2

C++

int method1()
{
    printf("method1: instruction 1");
    printf("method1: instruction 2");
    printf("method1: instruction 3");
    printf("method1: instruction 4");
    return 0;
}

C++

int method2()
{
    printf("method2: instruction 1");
    printf("method2: instruction 2");
    printf("method2: instruction 3");
    printf("method2: instruction 4");
    printf("method2: instruction 5");
    printf("method2: instruction 6");
    return 0;
}

The OS selects a thread to start with: it can be thread#1, thread#2, or thread #n.
The OS (based on thread priorities and other conditions) gives some time to the thread in order for it to execute some of its instructions.
The OS decides that it is time to suspend the thread and yield execution to another thread, and that generally involves:
1. Saving the current execution context: the OS needs to save the state and all the variables (registers) of the current executing thread
2. Suspends the current thread
3. Selects the next thread to resume
4. Restores that thread's previously saved state
5. Once the next thread's state is restored, the OS resumes the thread execution
Steps 2 and 3 repeat as long as there are threads running.

For the sake of illustration only, when we run thread#1 and thread#2 in parallel, an output like this may show:

method2: instruction 1
method2: instruction 2
method1: instruction 1
method2: instruction 3
method2: instruction 4
method1: instruction 2
.
.
.

Notice how the execution of thread#1 and thread#2 happens somehow in an intermixed sequence (between the two threads). If you have more than a CPU, then you can have true parallel execution since each CPU can run a thread, and there would be no need to interrupt a thread in order to run another thread.

It is important to note that execution interruption does not happen at the level of a high-level code instruction (such as C++'s cout or C's printf()); rather it happens on the machine instruction level. That means a printf() statement can translate to many machine instructions, and the printf() itself may be interrupted many times before it is fully executed.

In the coming section, I will be using the word "atomic" simply to mean the smallest instruction that we can interrupt. In the case of high-level languages, statements can be considered as atomic statements (or smallest executing instructions).

Makeshift co-routines

Now that we have introduced how multithreading works, it is time to show you how to imitate such a system.

Our main objective is to be able to run instruction #1 inside method#1, then be able to run instruction #1 in method#2, then resume execution inside method#1 at instruction #2, going back and forth between two or more methods until all methods finish executing.

To achieve this, we need to be able to save the current executing instruction # and not execute it again, and for that, we need to devise our own system of "program counter" or "instruction pointer" variable that will save at which instruction we are and a way to number each instruction (so that we simulate the addressing scheme). Let us look at this:

C++

int method1()
{
    static int pc = 1; // get's initialized once
    
    if (pc == 1)
        printf("method1: instruction 1\n");
    if (pc == 2)
        printf("method1: instruction 2\n");
    if (pc == 3)
        printf("method1: instruction 3\n");
    if (pc == 4)
        printf("method1: instruction 4\n");
    if (pc == 5)
    {
        pc = 1;
        return -1; // mark end of execution
    }   
    pc++;
    
    return 0;
}

If you look at that code, then calling method1() like that:

C++

while (method1() != -1) { }

is something very intuitive! And as you will notice, using this simple method, we managed to run an instruction each time we call the method. But I bet you feel it is not very practical for many reasons:

we need to number each instruction by hard-coding its sequential number
we can't really reset the execution of a method from the beginning
any non static variable will lose its state every time the function is called
etc...

We are not going to address all the problems; instead, we will solve problem (1) using C++ macros to automate the instruction addressing and make the code above a little elegant.

Now, instead of hard coding an instruction number, we are going to introduce a local variable that will count the instructions for us, so the code becomes something like that:

C++

int method1()
{
    static int pc = 1; // get's initialized once
    int pci = 1; // local variable that counts the instructions
    
    if (pc == pci++) // pci is ONE
        printf("method1: instruction 1\n");
    if (pc == pci++) // pci++ is TWO
        printf("method1: instruction 2\n");
    if (pc == pci++) //
        printf("method1: instruction 3\n");
    if (pc == pci++)
        printf("method1: instruction 4\n");
        
    if (pc == pci++)
    {
        pc = 1;
        return -1; // mark end of execution
    }   
    pc++;
    
    return 0;
}

That way, no more hard coding anything, and the problem is solved for us; all we need to do is create some macros:

C++

// We use it in the start of the method so that it declares
// the needed address tracking and instruction pointer tracking
#define ATOMIC_METHOD_BEGIN(name) \
  static int _counter = 1; \
  int _pci = 1;

// We use it to enclose the smallest instruction that we can interrupt
#define ATOMIC_STATEMENT(x) \
  if (_pci++ == _counter) \
  { \
    x; \
  }
  
// We use this macro to denote the end of the function.
// It returns -1 so that we know it is over with execution  
#define ATOMIC_METHOD_END \
  _counter++; \
  ATOMIC_STATEMENT(_counter = 1; return -1)

If we implement those macros, the code becomes cleaner:

C++

int method1()
{
  ATOMIC_METHOD_BEGIN(method1);

  ATOMIC_STATEMENT(printf("m1:step1\n"));
  ATOMIC_STATEMENT(printf("m1:step2\n"));
  ATOMIC_STATEMENT(printf("m1:step3\n"));
  ATOMIC_STATEMENT(printf("m1:step4\n"));

  ATOMIC_METHOD_END;
  return 0;
}

Cool, huh?! Now let us create another method and then run both methods in an intermixed manner to simulate multithreading:

C++

int method2()
{
  ATOMIC_METHOD_BEGIN(method2);

  ATOMIC_STATEMENT(printf("m2:step1\n"));
  ATOMIC_STATEMENT(printf("m2:step2\n"));
  ATOMIC_STATEMENT(printf("m2:step3\n"));
  ATOMIC_STATEMENT(printf("m2:step4\n"));

  ATOMIC_METHOD_END;
  return 0;
}

And now, let us write a small and simple thread execution dispatcher; this algorithm will call one instruction in each method until all the methods return -1:

C++

int simple_thread_controller(int count, ...)
{
  typedef int (*atomic_method_t)(void);
  va_list args;
  int i;

  atomic_method_t *methods = new atomic_method_t[count];
  int *done = new int[count];

  for (i=0;i<count;i++)
    done[i] = 0;

  va_start(args, count);

  for (i=0;i<count;i++)
    methods[i] = va_arg(args, atomic_method_t);

  va_end(args);

  do 
  {
    int exec_something = 0;
    for (i=0;i<count;i++)
    {
      // skip done statement
      if (done[i])
        continue;

      if (methods[i]() == -1)
        done[i] = 1;
      else
        exec_something = 1;
    }
    if (!exec_something)
      break;
  } while(true);

  delete [] methods;
  delete [] done;

    return 0;
}

And we test it as:

C++

int main()
{
  simple_thread_controller(2, method1, method2);
  return 0;
}

Interruptible for-loops?

These cool tricks are far from being a practical solution to imitating threads, but they really show how we can implement pseudo-threads using high-level facilities only. Now, what about control and repetition structures? The answer is that you can enclose anything inside an ATOMIC_STATEMENT macro, for example:

C++

ATOMIC_STATEMENT(for (int i=0;i<10;i++) { printf("Hello world!\n") });

However, I was not satisfied with only creating atomic statements, I wanted to create interruptible for loops that can save their states and resume them each time the method is entered or exited.

Let us define the basic structure needed to define a for-loop:

counter variable
loop start/end values
counter incrementing process

So if we are able to save those state variables, and in addition to our previous interruptible method calls, we can achieve atomic for-loops in a breath.

Here's one proposed solution:

C++

int test_method3()
{
  //ATOMIC_METHOD_BEGIN(method3);
  static int _counter = 1; 
  int _pci = 1;

  // DECLARE FOR
  static int _i, _i_counter_start;
  int _i_start = 1, _i_end = 4;

  // BEGIN FOR
  if (_pci++ == _counter)
  {
    _i_counter_start = _counter;
    _i = _i_start;
  }

  ATOMIC_STATEMENT(printf("i=%d\n", _i));

  // FOR END
  if (_pci++ == _counter)
  {
    if (_i < _i_end)
    {
      _counter = _i_counter_start;
      _i++;
    }
  }

  //ATOMIC_METHOD_END 
  if (_pci++ == _counter) 
  { 
    _counter = 1; 
    return -1; 
  }  
  
  _counter++;

  return 0;
}

Notice how we needed first to declare the for-loop state variables, then a way to mark the loop's start, and finally a way to tell whether we should loop again and to what starting point. Now, collapsing this code into elegant macros, we can get:

C++

#define ATOMIC_DECLARE_FOR(v, s, e) \
  static int v, _##v##_counter_start ; \
  int _##v##_start = s, _##v##_end = e;

#define ATOMIC_FOR_BEGIN(v) if (_pci++ == _counter) \
  { \
    _##v##_counter_start = _counter; \
    v = _##v##_start; \
  } \

#define ATOMIC_FOR_END(v) \
  ATOMIC_STATEMENT( if (v < _##v##_end) { _counter = _##v##_counter_start; v++; } )

#define ATOMIC_FOR_START(v, s, e) \
    ATOMIC_DECLARE_FOR(v, s, e); ATOMIC_FOR_BEGIN(v);

And the code becomes:

C++

int method3()
{
  ATOMIC_METHOD_BEGIN(method3);

  ATOMIC_FOR_START(i, 1, 10);
  {
      ATOMIC_STATEMENT(printf("i=%d\n", i));
  }
  ATOMIC_FOR_END(i);

  ATOMIC_METHOD_END;

  return 0;
}

You wouldn't be surprised to know that this same code allows nesting for-loops, would you?

C++

int method3()
{
  ATOMIC_METHOD_BEGIN(method3);

  ATOMIC_FOR_START(i, 1, 10);
  {
    ATOMIC_FOR_START(j, 1, i);
    {
      ATOMIC_STATEMENT(printf("*"));
    }
    ATOMIC_FOR_END(j);
    ATOMIC_STATEMENT(printf("\n"));
  }
  ATOMIC_FOR_END(i);

  ATOMIC_METHOD_END;

  return 0;
}

Can you guess what the output would be?

If we run all of the three methods through simple_thread_controller(), the following output will be produced:

m1:step1
m2:step1
m1:step2
m2:step2
m1:step3
m2:step3
*m1:step4
m2:step4

**
***
****
*****
******
*******
********
*********
**********

And if you run these same methods using Windows thread APIs, you will somehow get a similar output.

Conclusion

This article is written for educational purposes and to explore a certain idea. Hope it challenged your mind and you learned something out of it. It was highly inspired by the COMPEL scripting tool. For avid readers, I leave writing more macros and more advanced thread controllers as an exercise :)

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

Elias Bachaalany

Web Developer

United States

Elias (aka lallousx86, @0xeb) has always been interested in the making of things and their inner workings.

His computer interests include system programming, reverse engineering, writing libraries, tutorials and articles.

In his free time, and apart from researching, his favorite reading topics include: dreams, metaphysics, philosophy, psychology and any other human/mystical science.

Former employee of Microsoft and Hex-Rays (the creators of IDA Pro), was responsible about many debugger plugins, IDAPython project ownership and what not.

Elias currently works as an Anticheat engineer in Blizzard Entertainment.

Elias co-authored 2 books and authored one book:

- Practical Reverse Engineering
- The Antivirus Hacker's Handbook
- The Art of Batch Files Programming

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.