Click here to Skip to main content
12,950,268 members (67,687 online)
Click here to Skip to main content
Add your own
alternative version


2 bookmarked
Posted 14 Jan 2014

Test and Set, Spin locks, Mutex (Futex)

, 14 Jan 2014 CPOL
Rate this:
Please Sign up or sign in to vote.
Atomic set and test in critical sections


Using critical section and mutex related articles have been done to death, here we try a unique approach to understand the test and set x86-32 bit instruction for implementing critical sections (Spin lock part only). Critical sections (CS) are used in all modern programming paradigms where synchronization is required, though most modern day OS provide CS functionality, its internal implementation is abstracted.


You will have to know about x86 assembly and a few concepts on critical sections (, please keep in mind that this code is tested for user mode only but can be adapted to work in kernel mode with little or no modification.

Before the test and set instruction (atomic op-code BTS), we had an algorithmic approach, the more famous (and easier to understand) was Peterson's algorithm ('s_algorithm), but as with all modern based CPUs, pre-empting based on interrupt handling and instruction reordering (for weak memory models) can cause the set and test functionality to run in non-atomic and incorrect order (


Critical sections provide mutual exclusion in a multi threaded environment. Correctly implemented CS must support three criteria: mutual exclusion, progress, and bounded waiting.

  1. Mutual exclusion: 2 or more threads using the same CS object can never execute the same CS at the same  time, this must be enforced.
  2. Progress: If the CS object is not owned by any thread, then any one thread can take ownership, (not as easy as it sounds)
  3. Bounded waiting: Must enter only when CS object is up for grabs

For the above 3 points to be met, we must have an atomic Test and Set (TS) instruction.

Using the Code

The code is kept as small and reader friendly as possible, no error checks have been put in place, the reader is advised to use this code as a stepping stone to building more elaborate mutex/critical section APIs.

At the core of the critical section is an atomic operation set and test using x86 "bts" instruction.

void testAndSet(volatile UINT *m) {
 _asm { 
   mov eax,[m]
   lock bts [eax],0;  //set and test instruction
   JC Notgotmutex
   jmp ret1;
   jmp spin

 //printf("got mutex %u\n",::GetCurrentThreadId());

The "bts" instruction is used with a "lock" prefix to ensure correct operation in a multi core environment.

This instruction is responsible for copying a specified bit value to the carry flag and then set that bit value (atomic operation), we check the value of the operation via the carry flag, if the mutex was previously acquired, we spin and retest. Details of this instruction are available @

Memory fence is not required in this case since set and test takes care of everything.

The provided code attempts to synchronize 100 threads, you can use this code snippet to build more complex critical sections with reference counting (to ensure locks and unlocks are in pairs), also you can ensure that the thread does not get locked on a mutex object it currently owns.

To release the CS object (an unsigned int), we must reset the value to zero. supports recursion by using the object to keep track of the thread ID. 

I have also attached code to create a complete functioning Mutex using Futex and spin locks

Points of Interest

I hope this clears out everything related to set and test operation.


  • 14th January, 2014: Initial post


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Asif Bahrainwala
Instructor / Trainer
India India
I have been working with computers since my eight grade, programming the ZX Spectrum. I have always had an interest in assembly language and computer theory (and is still the reason for taking tons of online courses), actively code using C/C++ on Windows (using VS) and Linux (using QT).

I also provide training on data structures, algorithms, parallel patterns library , Graphics (DX11), GPGPUs (DX11-CS,AMP) and programming for performance on x86.
Feel free to call me at 0091-9823018914 (UTC +5:30)

(All views expressed here do not reflect the views of my employer).

You may also be interested in...


Comments and Discussions

QuestionCan't we do JC spin directly? Pin
Paulo Zemek14-Jan-14 8:55
professionalPaulo Zemek14-Jan-14 8:55 
AnswerRe: Can't we do JC spin directly? Pin
Asif Bahrainwala14-Jan-14 9:07
memberAsif Bahrainwala14-Jan-14 9:07 
AnswerRe: Can't we do JC spin directly? Pin
Asif Bahrainwala19-Jan-14 22:37
memberAsif Bahrainwala19-Jan-14 22:37 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

Permalink | Advertise | Privacy | Terms of Use | Mobile
Web02 | 2.8.170525.1 | Last Updated 14 Jan 2014
Article Copyright 2014 by Asif Bahrainwala
Everything else Copyright © CodeProject, 1999-2017
Layout: fixed | fluid