Click here to Skip to main content
Click here to Skip to main content

Threads without threads

, 28 Jun 2007
Rate this:
Please Sign up or sign in to vote.
How threads work, and how to create threads without using any API.

Introduction

Let me start by talking about this paradoxical title: "Threads" without threads. I have deliberately chosen such a confusing article title so that you don't think something like "Oh, I already know how to create threads in Windows and .NET, why would I want to read this article?!". This article is totally different, and does not teach you how to use APIs to create threads; instead, it will attempt to show you how the Operating System achieves multithreading with the help of the CPU, and how to create your own "pseudo" (if I may call it so) thread management APIs.

Already excited and want to know how? Well, hang on a bit until we talk a little about threads and how the Operating System achieves multithreading.

Background

In the Windows OS, threads are basic execution units, and an execution unit is a set of instructions that run in a certain order (or flow). A process is a host of one or more threads that are executing (seemingly) in parallel. The reason I say seemingly is because if you have a single processor machine, then only one instruction can run at a time; however, if we run an instruction belonging to a thread at time T, then another belonging to another thread at time T+1, and when done so fast, it will appear as if both threads are running at the same time.

Consider the case of a single threaded process:

int main()
{
    char name[100];
    printf("Enter your name:");
    gets(name)
    printf("You entered: %s\n", name);
    return 0;
}

This program will get executed from the first instruction to the last instruction (since there are no branching) without interruptions since it is a single threaded process.

Now say, you have two methods that you want to run in "parallel"; for this, you need to create a thread for each method, and here is what generally happens:

Thread #1 Thread #2
int method1()
{
    printf("method1: instruction 1");
    printf("method1: instruction 2");
    printf("method1: instruction 3");
    printf("method1: instruction 4");
    return 0;
}


int method2()
{
    printf("method2: instruction 1");
    printf("method2: instruction 2");
    printf("method2: instruction 3");
    printf("method2: instruction 4");
    printf("method2: instruction 5");
    printf("method2: instruction 6");
    return 0;
}
  1. The OS selects a thread to start with: it can be thread#1, thread#2, or thread #n.
  2. The OS (based on thread priorities and other conditions) gives some time to the thread in order for it to execute some of its instructions.
  3. The OS decides that it is time to suspend the thread and yield execution to another thread, and that generally involves:
    1. Saving the current execution context: the OS needs to save the state and all the variables (registers) of the current executing thread
    2. Suspends the current thread
    3. Selects the next thread to resume
    4. Restores that thread's previously saved state
    5. Once the next thread's state is restored, the OS resumes the thread execution
  4. Steps 2 and 3 repeat as long as there are threads running.

For the sake of illustration only, when we run thread#1 and thread#2 in parallel, an output like this may show:

method2: instruction 1
method2: instruction 2
method1: instruction 1
method2: instruction 3
method2: instruction 4
method1: instruction 2
.
.
.

Notice how the execution of thread#1 and thread#2 happens somehow in an intermixed sequence (between the two threads). If you have more than a CPU, then you can have true parallel execution since each CPU can run a thread, and there would be no need to interrupt a thread in order to run another thread.

It is important to note that execution interruption does not happen at the level of a high-level code instruction (such as C++'s cout or C's printf()); rather it happens on the machine instruction level. That means a printf() statement can translate to many machine instructions, and the printf() itself may be interrupted many times before it is fully executed.

In the coming section, I will be using the word "atomic" simply to mean the smallest instruction that we can interrupt. In the case of high-level languages, statements can be considered as atomic statements (or smallest executing instructions).

Emulating multithreading

Now that we have introduced how multithreading works, it is time to show you how to imitate such a system.

Our main objective is to be able to run instruction #1 inside method#1, then be able to run instruction #1 in method#2, then resume execution inside method#1 at instruction #2, going back and forth between two or more methods until all methods finish executing.

To achieve this, we need to be able to save the current executing instruction # and not execute it again, and for that, we need to devise our own system of "program counter" or "instruction pointer" variable that will save at which instruction we are and a way to number each instruction (so that we simulate the addressing scheme). Let us look at this:

int method1()
{
    static int pc = 1; // get's initialized once
    
    if (pc == 1)
        printf("method1: instruction 1\n");
    if (pc == 2)
        printf("method1: instruction 2\n");
    if (pc == 3)
        printf("method1: instruction 3\n");
    if (pc == 4)
        printf("method1: instruction 4\n");
    if (pc == 5)
    {
        pc = 1;
        return -1; // mark end of execution
    }   
    pc++;
    
    return 0;
}

If you look at that code, then calling method1() like that:

while (method1() != -1) { }

is something very intuitive! And as you will notice, using this simple method, we managed to run an instruction each time we call the method. But I bet you feel it is not very practical for many reasons:

  1. we need to number each instruction by hard-coding its sequential number
  2. we can't really reset the execution of a method from the beginning
  3. any non static variable will lose its state every time the function is called
  4. etc...

We are not going to address all the problems; instead, we will solve problem (1) using C++ macros to automate the instruction addressing and make the code above a little elegant.

Now, instead of hard coding an instruction number, we are going to introduce a local variable that will count the instructions for us, so the code becomes something like that:

int method1()
{
    static int pc = 1; // get's initialized once
    int pci = 1; // local variable that counts the instructions
    
    if (pc == pci++) // pci is ONE
        printf("method1: instruction 1\n");
    if (pc == pci++) // pci++ is TWO
        printf("method1: instruction 2\n");
    if (pc == pci++) //
        printf("method1: instruction 3\n");
    if (pc == pci++)
        printf("method1: instruction 4\n");
        
    if (pc == pci++)
    {
        pc = 1;
        return -1; // mark end of execution
    }   
    pc++;
    
    return 0;
}

That way, no more hard coding anything, and the problem is solved for us; all we need to do is create some macros:

// We use it in the start of the method so that it declares
// the needed address tracking and instruction pointer tracking
#define ATOMIC_METHOD_BEGIN(name) \
  static int _counter = 1; \
  int _pci = 1;

// We use it to enclose the smallest instruction that we can interrupt
#define ATOMIC_STATEMENT(x) \
  if (_pci++ == _counter) \
  { \
    x; \
  }
  
// We use this macro to denote the end of the function.
// It returns -1 so that we know it is over with execution  
#define ATOMIC_METHOD_END \
  _counter++; \
  ATOMIC_STATEMENT(_counter = 1; return -1)

If we implement those macros, the code becomes cleaner:

int method1()
{
  ATOMIC_METHOD_BEGIN(method1);

  ATOMIC_STATEMENT(printf("m1:step1\n"));
  ATOMIC_STATEMENT(printf("m1:step2\n"));
  ATOMIC_STATEMENT(printf("m1:step3\n"));
  ATOMIC_STATEMENT(printf("m1:step4\n"));

  ATOMIC_METHOD_END;
  return 0;
}

Cool, huh?! Now let us create another method and then run both methods in an intermixed manner to simulate multithreading:

int method2()
{
  ATOMIC_METHOD_BEGIN(method2);

  ATOMIC_STATEMENT(printf("m2:step1\n"));
  ATOMIC_STATEMENT(printf("m2:step2\n"));
  ATOMIC_STATEMENT(printf("m2:step3\n"));
  ATOMIC_STATEMENT(printf("m2:step4\n"));

  ATOMIC_METHOD_END;
  return 0;
}

And now, let us write a small and simple thread execution dispatcher; this algorithm will call one instruction in each method until all the methods return -1:

int simple_thread_controller(int count, ...)
{
  typedef int (*atomic_method_t)(void);
  va_list args;
  int i;

  atomic_method_t *methods = new atomic_method_t[count];
  int *done = new int[count];

  for (i=0;i<count;i++)
    done[i] = 0;

  va_start(args, count);

  for (i=0;i<count;i++)
    methods[i] = va_arg(args, atomic_method_t);

  va_end(args);

  do 
  {
    int exec_something = 0;
    for (i=0;i<count;i++)
    {
      // skip done statement
      if (done[i])
        continue;

      if (methods[i]() == -1)
        done[i] = 1;
      else
        exec_something = 1;
    }
    if (!exec_something)
      break;
  } while(true);

  delete [] methods;
  delete [] done;

    return 0;
}

And we test it as:

int main()
{
  simple_thread_controller(2, method1, method2);
  return 0;
}

Interruptible for-loops?

These cool tricks are far from being a practical solution to imitating threads, but they really show how we can implement pseudo-threads using high-level facilities only. Now, what about control and repetition structures? The answer is that you can enclose anything inside an ATOMIC_STATEMENT macro, for example:

ATOMIC_STATEMENT(for (int i=0;i<10;i++) { printf("Hello world!\n") });

However, I was not satisfied with only creating atomic statements, I wanted to create interruptible for loops that can save their states and resume them each time the method is entered or exited.

Let us define the basic structure needed to define a for-loop:

  1. counter variable
  2. loop start/end values
  3. counter incrementing process

So if we are able to save those state variables, and in addition to our previous interruptible method calls, we can achieve atomic for-loops in a breath.

Here's one proposed solution:

int test_method3()
{
  //ATOMIC_METHOD_BEGIN(method3);
  static int _counter = 1; 
  int _pci = 1;

  // DECLARE FOR
  static int _i, _i_counter_start;
  int _i_start = 1, _i_end = 4;

  // BEGIN FOR
  if (_pci++ == _counter)
  {
    _i_counter_start = _counter;
    _i = _i_start;
  }

  ATOMIC_STATEMENT(printf("i=%d\n", _i));

  // FOR END
  if (_pci++ == _counter)
  {
    if (_i < _i_end)
    {
      _counter = _i_counter_start;
      _i++;
    }
  }

  //ATOMIC_METHOD_END 
  if (_pci++ == _counter) 
  { 
    _counter = 1; 
    return -1; 
  }  
  
  _counter++;

  return 0;
}

Notice how we needed first to declare the for-loop state variables, then a way to mark the loop's start, and finally a way to tell whether we should loop again and to what starting point. Now, collapsing this code into elegant macros, we can get:

#define ATOMIC_DECLARE_FOR(v, s, e) \
  static int v, _##v##_counter_start ; \
  int _##v##_start = s, _##v##_end = e;

#define ATOMIC_FOR_BEGIN(v) if (_pci++ == _counter) \
  { \
    _##v##_counter_start = _counter; \
    v = _##v##_start; \
  } \

#define ATOMIC_FOR_END(v) \
  ATOMIC_STATEMENT( if (v < _##v##_end) { _counter = _##v##_counter_start; v++; } )

#define ATOMIC_FOR_START(v, s, e) \
    ATOMIC_DECLARE_FOR(v, s, e); ATOMIC_FOR_BEGIN(v);

And the code becomes:

int method3()
{
  ATOMIC_METHOD_BEGIN(method3);

  ATOMIC_FOR_START(i, 1, 10);
  {
      ATOMIC_STATEMENT(printf("i=%d\n", i));
  }
  ATOMIC_FOR_END(i);

  ATOMIC_METHOD_END;

  return 0;
}

You wouldn't be surprised to know that this same code allows nesting for-loops, would you?

int method3()
{
  ATOMIC_METHOD_BEGIN(method3);

  ATOMIC_FOR_START(i, 1, 10);
  {
    ATOMIC_FOR_START(j, 1, i);
    {
      ATOMIC_STATEMENT(printf("*"));
    }
    ATOMIC_FOR_END(j);
    ATOMIC_STATEMENT(printf("\n"));
  }
  ATOMIC_FOR_END(i);

  ATOMIC_METHOD_END;

  return 0;
}

Can you guess what the output would be?

If we run all of the three methods through simple_thread_controller(), the following output will be produced:

m1:step1
m2:step1
m1:step2
m2:step2
m1:step3
m2:step3
*m1:step4
m2:step4

**
***
****
*****
******
*******
********
*********
**********

And if you run these same methods using Windows thread APIs, you will somehow get a similar output.

Conclusion

This article is written for educational purposes and to explore a certain idea. Hope it challenged your mind and you learned something out of it. It was highly inspired by the COMPEL scripting tool. For avid readers, I leave writing more macros and more advanced thread controllers as an exercise Smile | :)

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Elias Bachaalany
Web Developer
United States United States
Elias (aka lallousx86, @0xeb) has always been interested in the making of things and their inner workings.
 
His computer interests include system programming, reverse engineering, writing libraries, tutorials and articles.
 
In his free time, and apart from researching, his favorite reading topics include: dreams, metaphysics, philosophy, psychology and any other human/mystical science.
 
Former employee of Hex-Rays (the creators of IDA Pro), was responsible about many debugger plugins, IDAPython project ownership and what not.
 
Elias currently works at Microsoft as a software security engineer.
 
More articles and blog posts can be found here:
 
- http://lallousx86.wordpress.com/
- http://0xeb.wordpress.com/
- http://www.hexblog.com/?author=3

Comments and Discussions

 
GeneralRating this article Pinmemberlallous1-Jul-07 21:35 
GeneralRe: Rating this article PinmemberMyronLi4-Sep-07 17:36 
GeneralRe: Rating this article Pinmemberlallous5-Sep-07 1:31 
GeneralRe: Rating this article - it's not useful PinmemberRenniePet29-Jan-08 10:42 
GeneralRe: Rating this article - it's not useful Pinmemberlallous29-Jan-08 20:43 
Hi,
 
I had no idea that I should clarify that this was never meant to substitute threading facilities provided by the OS, I thought that readers are decent enough to deduct the real purpose of this article.
 
On the other hand, you said "Not useful". It depends on what angle you're viewing it.
 
If you want to write your own simple script interpreter or VM for instance, this article can inspire you.
If you ever asked yourself how to do something like threads, how to save contexts, interrupt a function and resume it, then this article is for you.
 
If you are teaching beginners about threads and want to show them in simple terms how can "parallelism" be achieved, then this article for you.
 

On the other hand, if you want to stop using CreateThread() API, then naturally, this is not for you Wink | ;)
 
Now tell me, is it useful again? Wink | ;)
GeneralRe: Rating this article - it's not useful - you did ask for feedback PinmemberRenniePet29-Jan-08 23:03 
GeneralRe: Rating this article - it's not useful - you did ask for feedback Pinmemberlallous30-Jan-08 0:47 
GeneralRe: Rating this article - it's not useful - you did ask for feedback PinmemberBill Hallahan23-Nov-09 6:33 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Mobile
Web04 | 2.8.140721.1 | Last Updated 28 Jun 2007
Article Copyright 2007 by Elias Bachaalany
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid