3 - Assembly Representation of Race Condition and Mutual Exclusion

Practice Program

#include <stdio.h>
#include <pthread.h>
 
/**
 * Here, the loop is running 10_000_000 times and increments the counter. This happens twice. It is natural to
 * except the value of counter would be 20_000_000 but that is not the case. The output of this program is
 * Undeterministic. Every time you run the program, it might yield a different output.
 *
 * Here, `counter` is the shared variable across two threads which are incrementing it parallely.
 * This leads to issues and can be understood when looked at its assembly instructions.
 *
 * Variable `counter` is static and global. It would be stored in the .bss section. Let's assume the address of
 * `counter` variable is 0x8049a1c. Now, let's look at the assembly instructions for `counter++` at line 16. This
 * is where things go haywire.
 *
 * I1: MOV eax, 0x8049a1c
 * I2: ADD eax, 0x1
 * I3: MOV 0x8049a1c, eax
 *
 * Assume the value of counter (0x8049a1c) is 90. Now, suppose thread T1 has executed upto instruction I2 and scheduler
 * preempts the thread saving the state of the `eax` register in the Thread Control Block (TCB, same as PCB but for
 * threads). In TCB, the value of eax is 91 and counter is 90. Memory location 0x8049a1c (counter) has not yet started
 * pointing to the updated value of 91.
 *
 * After preemption of thread T1, thread T2 starts executing and reads the value of counter as 90 and completes all the
 * three instructions (I1, I2, and I3). Now, the value of eax and counter, under the context of thread T2, is 91 and 91
 * respectively. Thread T2 completes it's work and exits.
 *
 * Now, scheduler decides to continue the Thread T1's work. Thread T1's context gets restored through its Thread
 * Control Block. Here, the value of register `eax` and variable counter is 91 and 90 respectively and the Program
 * Counter is pointing to the instruction I3 since I2 was executed. Here, I3 saves the value of register eax in
 * the counter (0x8049a1c) which is 91.
 *
 * Voila! You can see that here we saved the value 91 twice in the counter variable. This is a classic example of *(Race
 * Condition**.
 *
 * This happened due to the lack of atomicity of instructions. If all three instructions could be executed as one
 * single instruction, this problem would get resolved. This is where we make use of locks.
 *
 * `counter++` is the Critical Section of this program.  A critical section is a piece of code that accesses a shared
 * variable or resource  and must not be concurrently executed by more than one thread. What we really want in this
 * code is **Mutual Exclusion**.
 */
 
static int counter = 0;
 
void* increment() {
  for (int i = 0; i < 1e7; i++) {
    counter++;
  }
 
  return NULL;
}
 
int main() {
  pthread_t thread1;
  pthread_t thread2;
 
  int status = pthread_create(&thread1, NULL, increment, NULL);
 
  if (status) {
    printf("Failed to create the thread");
  }
 
  status = pthread_create(&thread2, NULL, increment, NULL);
  if (status) {
    printf("Failed to create the thread");
  }
 
  status = pthread_join(thread1, NULL);
  if (status) {
    printf("Failed to join the thread");
  }
 
  status = pthread_join(thread2, NULL);
  if (status) {
    printf("Failed to join the thread");
  }
 
  printf("Count is: %d\n", counter);
}

1. Program Overview

This program creates two threads that increment a shared variable counter.

Each thread performs:

10,000,000 increments

Since two threads execute the same function, the expected result would be:

20,000,000

However, the program produces different outputs on different runs.

Example outputs might be:

Count is: 13847291
Count is: 19753122
Count is: 16488234

This behavior is non-deterministic. The cause is a race condition.

2. Shared Variables Between Threads

The variable

static int counter = 0;

is:

  • Global
  • Shared across all threads
  • Stored in the .bss section of the program’s memory.

Because both threads access and modify this variable simultaneously, they may interfere with each other's operations.

3. The counter++ Operation Is Not Atomic

At the source level, counter++ looks like a single instruction. However, at the machine level it expands into multiple instructions.

Example assembly representation:

I1: MOV eax, 0x8049a1c
I2: ADD eax, 1
I3: MOV 0x8049a1c, eax

Steps performed:

  1. Load the value of counter into register eax
  2. Increment the register
  3. Store the value back to memory

Because these steps are separate instructions, thread scheduling can interrupt them.

4. How the Race Condition Happens

Assume:

counter = 90

Thread T1 starts execution

I1: eax = counter   → eax = 90
I2: eax = eax + 1   → eax = 91

Before executing I3, the scheduler preempts the thread.

Current state:

T1: eax = 91
counter = 90

Thread T2 executes

Thread T2 reads the same value.

I1: eax = counter   → eax = 90
I2: eax = eax + 1   → eax = 91
I3: counter = eax   → counter = 91

Thread T2 finishes.

Thread T1 resumes

The scheduler restores the thread's state from its Thread Control Block (TCB).

eax = 91
counter = 90
PC → I3

Now instruction I3 executes:

counter = eax → 91

Final result

Instead of two increments:

90 → 91 → 92

we get:

90 → 91

The increment was lost.

5. Race Condition

This scenario is known as a Race Condition.

A race condition occurs when:

  • Multiple threads access shared data
  • At least one thread modifies the data
  • Execution order affects the final result

Because thread scheduling is unpredictable, the program produces non-deterministic results.

6. Critical Section

The problematic part of the code is:

counter++;

This is called a Critical Section.

A critical section is a piece of code that:

  • Accesses shared resources
  • Must not be executed by multiple threads simultaneously

7. Mutual Exclusion

To fix the problem, we must enforce Mutual Exclusion.

Mutual exclusion ensures that:

Only one thread can execute the critical section at a time

This is typically implemented using:

  • Mutex locks
  • Spinlocks
  • Atomic operations
  • Semaphores

8. Why the Output Is Non-Deterministic

Thread scheduling depends on:

  • CPU scheduling decisions
  • Interrupts
  • OS timing
  • Hardware concurrency

Therefore the interleaving of instructions changes every run. This leads to different final values of counter.