Race Conditions & Semaphores : Handling 500+ Concurrent Users

1. The Core Problem: How Does a Race Condition Occur?

Imagine you have exactly 50 discount coupons left in your database. When a user clicks “Apply,” the backend server typically executes code in three steps:

  1. Read: Check the database for remaining coupons. (e.g., Coupons = 50)

  2. Check: Are the remaining coupons > 0? If yes, proceed.

  3. Update: Subtract 1 from the total and grant the discount. (e.g., Coupons = 49)

High Traffic Scenario (Without Synchronization):

Suddenly, 500 users click “Apply” at the exact same millisecond.

  • Your server spins up separate threads (or processes) for all these requests simultaneously.

  • Thread 1 reads the database: Coupons = 50. It moves to step 2.

  • In that exact same nanosecond, Thread 2, Thread 3, and 70 other threads also read the database. They all read Coupons = 50!

  • All of them pass the check (50 > 0) and proceed to update the database (50 - 1 = 49).

  • The Result: 70 people get the discount, but the database count might show 49, or worse, it might decrement into negative numbers (e.g., -20). This is a Race Condition, and it results in a financial loss for the company.


2. The Solution: How Does a Counting Semaphore Work?

The Operating System provides a concept called a Semaphore to solve this. At a basic level, a Semaphore is a special integer variable managed by the OS. The key feature of a Semaphore is that operations performed on it are “Atomic” (they cannot be interrupted halfway through).

At a basic level, a Semaphore is a special integer variable managed by the OS. The key feature of a Semaphore is that operations performed on it are "Atomic" (they cannot be interrupted halfway through).

For this scenario, we use a Counting Semaphore.

  • Initialization: We initialize the Semaphore (let’s call it $S$) to 50. ($S = 50$).

  • The Wait / P() Operation: Before any request can process the discount, it must pass through the Semaphore.

  • The rule of the Semaphore is:

    • If $S > 0$, I will immediately decrement it by 1 and allow the process to continue.

    • If $S == 0$, I will block (suspend) this process and return an error (“Coupons exhausted!”).

High Traffic Scenario (With Semaphore):

When 500 users click simultaneously:

  • The OS forces them into a queue.

  • The Semaphore processes them sequentially (one after another, in microseconds).

  • The first 50 requests will take the Semaphore from $S = 50$ down to $S = 0$.

  • When the 51st user’s request arrives, the Semaphore sees $S == 0$. It blocks that request, preventing it from executing the discount code.

  • The Result: Exactly 50 people get the discount—not a single person more.

Our portfolio Makeuser


3. OS Kernel Level: Wait Queues and PCBs (Process Control Blocks)

At an advanced/architectural level, a Semaphore is not just an integer. Inside the OS Kernel, it is a Data Structure (Struct) containing two things:

At an advanced/architectural level, a Semaphore is not just an integer. Inside the OS Kernel, it is a Data Structure (Struct) containing two things

  1. An integer value (the count).

  2. A pointer to a Wait Queue (a Linked List of blocked processes).

When $S == 0$ and the 51st user requests a coupon, the OS doesn’t just make it “wait” in a loop. It does the following:

  • The OS takes the PCB (Process Control Block) of that specific thread. (The PCB stores the current state, memory limits, and CPU registers of a process).

  • The OS moves this PCB out of the CPU’s “Ready Queue” and puts it into the Semaphore’s Wait Queue.

  • The OS then executes a block() system call. The thread goes to sleep and consumes absolutely zero CPU cycles.

  • (In a Mutex scenario, when another process releases a lock via a signal or V() operation, the OS uses a wakeup() system call to bring a waiting process back to the CPU. This swap is called a Context Switch).


4. Hardware Level: CPU Atomicity (The Root Cause)

A major question arises: If 2 threads try to subtract 1 from the Semaphore at the exact same time, why doesn’t the Semaphore itself suffer from a race condition?

Because the Semaphore is protected by the Hardware (CPU) itself. OS developers rely on special CPU instructions (built into architectures like x86 or ARM):

  • Test-and-Set Lock (TSL): A CPU instruction that performs a “Read and Write” in one single, uninterruptible clock cycle.

  • Compare-And-Swap (CAS): Used in modern processors (where your Node.js or PHP servers run). It tells the CPU: “Only change this memory value to $Y$ if it is currently exactly $X$; otherwise, fail.”

Ultimately, software-level locks exist because the silicon hardware guarantees atomic operations.


5. Advanced OS Problems: Priority Inversion

When using locks (like Mutexes or Semaphores initialized to 1), a dangerous situation called Priority Inversion can occur.

When using locks (like Mutexes or Semaphores initialized to 1), a dangerous situation called Priority Inversion can occur

  • The Problem: Suppose you have High ($H$), Medium ($M$), and Low ($L$) priority processes.

  • $L$ acquires the lock. $H$ needs the lock, so it gets blocked and goes into the Wait Queue.

  • Suddenly, $M$ becomes ready to run. Because $M$ has a higher priority than $L$, the OS scheduler preempts (pauses) $L$ and gives the CPU to $M$.

  • Now, $L$ cannot release the lock because it doesn’t have the CPU. And $H$ (the most critical process) is stuck waiting forever.

  • The Solution: Operating Systems use the Priority Inheritance Protocol. The OS temporarily boosts $L$‘s priority to match $H$‘s, ensuring $L$ finishes its task quickly, releases the lock, and allows $H$ to run.


6. Modern System Architecture: Distributed Semaphores

As a Full-Stack developer, you must know that OS-level Semaphores only protect you if your backend is running on a single server.

As a Full-Stack developer, you must know that OS-level Semaphores only protect you if your backend is running on a single server

If your marketplace scales and you are using Load Balancing across 3 different servers (e.g., on AWS or Vercel), an OS-level lock on Server A cannot stop a thread running on Server B!

The Advanced Solution:

In modern web architecture, we move the synchronization out of the OS and into the database or caching layer using Distributed Locks:

  1. Redis (Redlock Algorithm): We use an in-memory datastore like Redis as the central Semaphore. All 3 servers must ask Redis for the lock before touching the database. Because Redis is single-threaded, it naturally processes requests atomically.

  2. Optimistic Concurrency Control (OCC) in DB: We add a version column to our database row. The SQL query becomes:

    UPDATE coupons SET count = count - 1, version = version + 1 WHERE id = 1 AND version = 5;

    If two servers try to update simultaneously, the first one changes the version to 6. The second server’s query will safely fail because version = 5 no longer exists, preventing the race condition without requiring heavy wait queues.

Leave a Comment