How to Implement a Thread Pool in C with Pthreads: Complete Guide

Ramon Rodriguez
May 7, 2026

Why You Need a Thread Pool in C with Pthreads

Creating and destroying a thread for every small task is expensive. A thread pool avoids that cost by pre-spawning a fixed number of worker threads that sit idle until work arrives, then pick up tasks from a shared queue. The pattern takes thread creation and teardown out of the per-task path, which matters in high-throughput applications like web servers, image processors, and database engines.

In this guide, we are not building a toy example. We are building a production-quality thread pool in C using the POSIX threads (pthreads) API. Every design decision is explained, every common pitfall is addressed, and the final result is something you can drop into a real project.

By the end, you will have a fully working thread pool implementation that handles:

Worker thread lifecycle management
A thread-safe task queue with mutex and condition variable synchronization
Graceful shutdown without data races or memory leaks
Spurious wakeup protection

Architecture Overview: How a Thread Pool Works

Before writing any code, let us understand the components and how they interact.

Component	Purpose
Thread Pool Struct	Holds worker threads, the task queue, synchronization primitives, and state flags
Task Queue	A linked list (or ring buffer) of function pointers and their arguments
Mutex	Protects the task queue from concurrent access
Condition Variable	Allows worker threads to sleep until a task is available
Worker Threads	Loop forever: lock mutex, wait for task, dequeue, unlock, execute
Shutdown Flag	Signals workers to finish remaining tasks and exit cleanly

The flow is simple:

The main thread creates the pool and spawns N worker threads.
Each worker thread enters a loop, waiting on the condition variable.
When a task is submitted, it is added to the queue and one worker is signaled.
The woken worker dequeues the task, unlocks the mutex, and executes the function.
On shutdown, the flag is set, all workers are broadcast-signaled, and the main thread joins them.

Step 1: Define the Data Structures

We need two structs: one for individual tasks and one for the pool itself.

Task Structure

#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>

typedef struct task {
    void (*function)(void *arg);
    void *arg;
    struct task *next;
} task_t;

Each task is a node in a singly linked list. It stores a function pointer and a void pointer to its argument. This keeps the interface generic: any function matching the signature void func(void *) can be submitted.
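Because a task receives a single void pointer, a function that needs several inputs is usually handed a pointer to a small struct. The sketch below illustrates that idea under stated assumptions: add_args_t, add_task, and make_add_args are hypothetical names invented for this example and are not part of the pool.

```c
#include <stdlib.h>

/* Hypothetical argument bundle for a task that needs two inputs
 * and one output slot. Not part of the pool itself. */
typedef struct {
    int a;
    int b;
    int *result;   /* where the task stores its answer */
} add_args_t;

/* Matches the void (*)(void *) signature that task_t expects. */
static void add_task(void *arg) {
    add_args_t *args = (add_args_t *)arg;
    *args->result = args->a + args->b;
    free(args);    /* the task owns its heap-allocated arguments */
}

/* Packs the arguments on the heap; returns NULL if malloc fails. */
static add_args_t *make_add_args(int a, int b, int *result) {
    add_args_t *args = malloc(sizeof *args);
    if (args == NULL) {
        return NULL;
    }
    args->a = a;
    args->b = b;
    args->result = result;
    return args;
}
```

Calling add_task(make_add_args(2, 3, &out)) directly stores 5 in out. When the same pair is handed to a pool instead, the variable behind result has to stay alive until the task has run, and a NULL return from make_add_args should be checked before submitting.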

Thread Pool Structure

typedef struct threadpool {
    task_t *queue_head;
    task_t *queue_tail;
    int queue_size;
    
    pthread_t *threads;
    int thread_count;
    
    pthread_mutex_t queue_mutex;
    pthread_cond_t queue_cond;
    
    bool shutdown;
} threadpool_t;

Design decision: We use both queue_head and queue_tail pointers to enable O(1) enqueue at the tail and O(1) dequeue from the head. Many simple examples only use a head pointer, which forces O(n) traversal for every enqueue. That matters under load.

Step 2: Implement the Worker Thread Function

This is the core loop that every worker thread runs. Getting this right is critical.

static void *worker_function(void *arg) {
    threadpool_t *pool = (threadpool_t *)arg;
    
    while (true) {
        pthread_mutex_lock(&pool->queue_mutex);
        
        /* Wait for a task or shutdown signal.
         * IMPORTANT: use a while loop, NOT an if statement.
         * This protects against spurious wakeups. */
        while (pool->queue_size == 0 && !pool->shutdown) {
            pthread_cond_wait(&pool->queue_cond, &pool->queue_mutex);
        }
        
        /* If shutdown is requested and no tasks remain, leave the loop */
        if (pool->shutdown && pool->queue_size == 0) {
            pthread_mutex_unlock(&pool->queue_mutex);
            break;
        }
        
        /* Dequeue a task */
        task_t *task = pool->queue_head;
        pool->queue_head = task->next;
        if (pool->queue_head == NULL) {
            pool->queue_tail = NULL;
        }
        pool->queue_size--;
        
        pthread_mutex_unlock(&pool->queue_mutex);
        
        /* Execute the task outside the lock */
        task->function(task->arg);
        free(task);
    }
    
    return NULL;
}

Why the While Loop Matters: Spurious Wakeups

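A spurious wakeup is when pthread_cond_wait returns even though no one called pthread_cond_signal or pthread_cond_broadcast. The POSIX specification explicitly allows this. A worker can also be woken legitimately and still find the queue empty, because another worker that just finished its task grabbed the mutex and took the new task first. If you use an if instead of while, the thread skips the re-check, reads a NULL queue_head, and dereferences it in task->next. This is one of the most common bugs in pthreads code.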

Why We Unlock Before Executing

Notice that we unlock the mutex before calling task->function(task->arg). If we held the lock during execution, only one worker could run at a time, defeating the entire purpose of a thread pool. The critical section should be as small as possible: lock, dequeue, unlock, then execute.

Step 3: Create the Thread Pool

threadpool_t *threadpool_create(int num_threads) {
    if (num_threads <= 0) {
        fprintf(stderr, "Thread count must be positive\n");
        return NULL;
    }
    
    threadpool_t *pool = (threadpool_t *)malloc(sizeof(threadpool_t));
    if (pool == NULL) {
        perror("Failed to allocate thread pool");
        return NULL;
    }
    
    pool->thread_count = num_threads;
    pool->queue_head = NULL;
    pool->queue_tail = NULL;
    pool->queue_size = 0;
    pool->shutdown = false;
    
    pool->threads = (pthread_t *)malloc(sizeof(pthread_t) * num_threads);
    if (pool->threads == NULL) {
        perror("Failed to allocate threads array");
        free(pool);
        return NULL;
    }
    
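    /* pthread functions return the error code directly and do not set
     * errno, so perror() would print an unrelated message here. */
    int rc = pthread_mutex_init(&pool->queue_mutex, NULL);
    if (rc != 0) {
        fprintf(stderr, "Mutex init failed (error %d)\n", rc);
        free(pool->threads);
        free(pool);
        return NULL;
    }
    
    rc = pthread_cond_init(&pool->queue_cond, NULL);
    if (rc != 0) {
        fprintf(stderr, "Condition variable init failed (error %d)\n", rc);
        pthread_mutex_destroy(&pool->queue_mutex);
        free(pool->threads);
        free(pool);
        return NULL;
    }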
    
    for (int i = 0; i < num_threads; i++) {
        if (pthread_create(&pool->threads[i], NULL, worker_function, pool) != 0) {
            fprintf(stderr, "Failed to create thread %d\n", i);
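            /* Clean up already created threads. The flag is set under the
             * mutex so a worker cannot miss the broadcast between checking
             * the flag and going to sleep. */
            pthread_mutex_lock(&pool->queue_mutex);
            pool->shutdown = true;
            pthread_cond_broadcast(&pool->queue_cond);
            pthread_mutex_unlock(&pool->queue_mutex);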
            for (int j = 0; j < i; j++) {
                pthread_join(pool->threads[j], NULL);
            }
            pthread_mutex_destroy(&pool->queue_mutex);
            pthread_cond_destroy(&pool->queue_cond);
            free(pool->threads);
            free(pool);
            return NULL;
        }
    }
    
    return pool;
}

Key point: Notice the thorough error handling. If any thread fails to create, we set the shutdown flag, broadcast to wake any already-created workers, join them, and free all resources. In production code, you cannot ignore pthread_create return values.

Step 4: Submit Tasks to the Pool

int threadpool_submit(threadpool_t *pool, void (*function)(void *), void *arg) {
    if (pool == NULL || function == NULL) {
        return -1;
    }
    
    task_t *new_task = (task_t *)malloc(sizeof(task_t));
    if (new_task == NULL) {
        perror("Failed to allocate task");
        return -1;
    }
    
    new_task->function = function;
    new_task->arg = arg;
    new_task->next = NULL;
    
    pthread_mutex_lock(&pool->queue_mutex);
    
    if (pool->shutdown) {
        pthread_mutex_unlock(&pool->queue_mutex);
        free(new_task);
        return -1;
    }
    
    if (pool->queue_tail == NULL) {
        pool->queue_head = new_task;
        pool->queue_tail = new_task;
    } else {
        pool->queue_tail->next = new_task;
        pool->queue_tail = new_task;
    }
    pool->queue_size++;
    
    pthread_cond_signal(&pool->queue_cond);
    pthread_mutex_unlock(&pool->queue_mutex);
    
    return 0;
}

Why pthread_cond_signal and not pthread_cond_broadcast? Since we are adding exactly one task, we only need to wake one worker. Broadcasting would wake all workers, causing a thundering herd where all but one immediately go back to sleep. Use signal for single-task enqueue and broadcast only during shutdown.

Step 5: Implement Graceful Shutdown

This is where many implementations fall apart. A correct shutdown must:

Stop accepting new tasks.
Let workers finish all remaining queued tasks.
Wake up all sleeping workers so they can check the shutdown flag.
Join all worker threads.
Free all memory and destroy synchronization primitives.

int threadpool_destroy(threadpool_t *pool) {
    if (pool == NULL) {
        return -1;
    }
    
    pthread_mutex_lock(&pool->queue_mutex);
    if (pool->shutdown) {
        pthread_mutex_unlock(&pool->queue_mutex);
        return -1;
    }
    pool->shutdown = true;
    pthread_cond_broadcast(&pool->queue_cond);
    pthread_mutex_unlock(&pool->queue_mutex);
    
    /* Join all worker threads */
    for (int i = 0; i < pool->thread_count; i++) {
        pthread_join(pool->threads[i], NULL);
    }
    
    /* Defensive cleanup: workers drain the queue before exiting,
     * so this list is normally already empty */
    task_t *current = pool->queue_head;
    while (current != NULL) {
        task_t *next = current->next;
        free(current);
        current = next;
    }
    
    pthread_mutex_destroy(&pool->queue_mutex);
    pthread_cond_destroy(&pool->queue_cond);
    free(pool->threads);
    free(pool);
    
    return 0;
}

Important: We use pthread_cond_broadcast here, not pthread_cond_signal. During shutdown we need to wake all sleeping workers, not just one. If any worker stays asleep, pthread_join will block forever and your program will hang.

Step 6: Putting It All Together

Here is a complete example that you can compile and run.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Include all the code from steps 1-5 above, then: */

void example_task(void *arg) {
    int id = *(int *)arg;
    /* pthread_t is an opaque type; this cast works on Linux but is not portable */
    printf("Task %d is running on thread %lu\n", id, (unsigned long)pthread_self());
    usleep(100000); /* Simulate work: 100ms */
    free(arg);
}

int main(void) {
    int num_threads = 4;
    int num_tasks = 20;
    
    threadpool_t *pool = threadpool_create(num_threads);
    if (pool == NULL) {
        fprintf(stderr, "Failed to create thread pool\n");
        return EXIT_FAILURE;
    }
    
    printf("Thread pool created with %d workers\n", num_threads);
    
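    for (int i = 0; i < num_tasks; i++) {
        int *task_id = (int *)malloc(sizeof(int));
        if (task_id == NULL) {
            perror("malloc");
            continue;
        }
        *task_id = i;
        /* If submission fails the task never runs, so free the argument here */
        if (threadpool_submit(pool, example_task, task_id) != 0) {
            fprintf(stderr, "Failed to submit task %d\n", i);
            free(task_id);
        }
    }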
    
    printf("All tasks submitted. Shutting down...\n");
    
    /* No sleep is needed: threadpool_destroy lets the workers drain
     * the queue and joins them before it returns. */
    threadpool_destroy(pool);
    printf("Thread pool destroyed. All done.\n");
    
    return EXIT_SUCCESS;
}

Compile and run:

gcc -o threadpool main.c -pthread -Wall -Wextra
./threadpool

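You should see the 20 tasks distributed across 4 worker threads. The order of the output lines varies from run to run because the scheduler decides which worker picks up each task.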

Common Pitfalls and How to Avoid Them

Building a thread pool in C with pthreads looks straightforward, but subtle bugs can lurk for months before surfacing. Here are the most common mistakes.

Pitfall	Symptom	Fix
Using `if` instead of `while` for condition wait	Segfault or NULL dereference on spurious wakeup	Always use `while (condition) { pthread_cond_wait(...); }`
Holding the lock during task execution	Only one thread runs at a time; no parallelism	Unlock mutex before calling `task->function`
Using `signal` instead of `broadcast` at shutdown	Program hangs during `pthread_join`	Use `pthread_cond_broadcast` when shutting down
Forgetting to free task memory	Memory leaks under sustained load	Free the task struct after execution; free remaining tasks on destroy
Not checking `pthread_create` return value	Silent failure; fewer workers than expected	Always check return values and handle partial creation
Task argument lifetime issues	Use-after-free or stale data	Heap-allocate arguments; let the task function free them
Calling `pthread_mutex_destroy` while threads still reference it	Undefined behavior	Always `pthread_join` all threads before destroying the mutex

Production Enhancements

The implementation above is solid and correct, but real-world projects often need a few more features. Here are some enhancements to consider.

1. Bounded Queue with Backpressure

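Add a maximum queue size and a second condition variable. When the queue is full, threadpool_submit blocks until a worker dequeues a task. This prevents unbounded memory growth when producers are faster than consumers. After the wait loop, re-check the shutdown flag before enqueueing, and have threadpool_destroy broadcast on the new condition variable as well, so that submitters blocked on a full queue wake up and return an error instead of hanging.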

/* In threadpool_submit, after locking: */
while (pool->queue_size >= pool->max_queue_size && !pool->shutdown) {
    pthread_cond_wait(&pool->queue_not_full, &pool->queue_mutex);
}

/* In worker_function, after dequeueing: */
pthread_cond_signal(&pool->queue_not_full);
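To see the two condition variables working together outside the pool, here is a minimal standalone sketch of a bounded queue. It is an illustration under assumptions rather than a drop-in change: it stores plain ints in a small ring buffer instead of task_t nodes, it omits shutdown handling, and every bq_ name as well as BQ_CAPACITY is hypothetical.

```c
#include <pthread.h>

#define BQ_CAPACITY 2   /* hypothetical; tiny so the producer blocks often */

typedef struct {
    int items[BQ_CAPACITY];
    int head;                   /* index of the oldest item */
    int count;                  /* number of stored items */
    pthread_mutex_t mutex;
    pthread_cond_t not_empty;   /* signaled after a push */
    pthread_cond_t not_full;    /* signaled after a pop */
} bounded_queue_t;

/* Returns 0 on success, -1 if any primitive could not be created. */
static int bq_init(bounded_queue_t *q) {
    q->head = 0;
    q->count = 0;
    if (pthread_mutex_init(&q->mutex, NULL) != 0) {
        return -1;
    }
    if (pthread_cond_init(&q->not_empty, NULL) != 0) {
        pthread_mutex_destroy(&q->mutex);
        return -1;
    }
    if (pthread_cond_init(&q->not_full, NULL) != 0) {
        pthread_cond_destroy(&q->not_empty);
        pthread_mutex_destroy(&q->mutex);
        return -1;
    }
    return 0;
}

static void bq_destroy(bounded_queue_t *q) {
    pthread_cond_destroy(&q->not_full);
    pthread_cond_destroy(&q->not_empty);
    pthread_mutex_destroy(&q->mutex);
}

/* Blocks while the queue is full: this is the backpressure. */
static void bq_push(bounded_queue_t *q, int value) {
    pthread_mutex_lock(&q->mutex);
    while (q->count == BQ_CAPACITY) {
        pthread_cond_wait(&q->not_full, &q->mutex);
    }
    q->items[(q->head + q->count) % BQ_CAPACITY] = value;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->mutex);
}

/* Blocks while the queue is empty, then returns the oldest item. */
static int bq_pop(bounded_queue_t *q) {
    pthread_mutex_lock(&q->mutex);
    while (q->count == 0) {
        pthread_cond_wait(&q->not_empty, &q->mutex);
    }
    int value = q->items[q->head];
    q->head = (q->head + 1) % BQ_CAPACITY;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->mutex);
    return value;
}

/* Demo producer: pushes 0..4 and blocks whenever two items are pending. */
static void *bq_demo_producer(void *arg) {
    for (int i = 0; i < 5; i++) {
        bq_push((bounded_queue_t *)arg, i);
    }
    return NULL;
}
```

With one producer thread running bq_demo_producer and the main thread calling bq_pop five times, the values come out in order 0 through 4, and the producer never gets more than BQ_CAPACITY items ahead of the consumer.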

2. Wait for All Tasks to Complete

Add a threadpool_wait function that blocks until the queue is empty and all workers are idle. Use an active-task counter and another condition variable.
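One way to sketch the counter-plus-condition-variable idea is as a small standalone tracker. This is an assumption-laden illustration, not the pool's API: tracker_t and all tracker_ functions are hypothetical names, and the tracker carries its own mutex so the example compiles on its own.

```c
#include <pthread.h>

/* Hypothetical completion tracker: counts tasks that have been
 * submitted but have not finished running yet. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t all_done;
    int pending;              /* queued plus currently running tasks */
} tracker_t;

/* Returns 0 on success, -1 if a primitive could not be created. */
static int tracker_init(tracker_t *t) {
    t->pending = 0;
    if (pthread_mutex_init(&t->mutex, NULL) != 0) {
        return -1;
    }
    if (pthread_cond_init(&t->all_done, NULL) != 0) {
        pthread_mutex_destroy(&t->mutex);
        return -1;
    }
    return 0;
}

static void tracker_destroy(tracker_t *t) {
    pthread_cond_destroy(&t->all_done);
    pthread_mutex_destroy(&t->mutex);
}

/* Call when a task is enqueued. */
static void tracker_task_added(tracker_t *t) {
    pthread_mutex_lock(&t->mutex);
    t->pending++;
    pthread_mutex_unlock(&t->mutex);
}

/* Call from the worker after the task function has returned. */
static void tracker_task_finished(tracker_t *t) {
    pthread_mutex_lock(&t->mutex);
    t->pending--;
    if (t->pending == 0) {
        /* Broadcast: more than one thread may be waiting. */
        pthread_cond_broadcast(&t->all_done);
    }
    pthread_mutex_unlock(&t->mutex);
}

/* Blocks until every added task has finished. */
static void tracker_wait(tracker_t *t) {
    pthread_mutex_lock(&t->mutex);
    while (t->pending > 0) {
        pthread_cond_wait(&t->all_done, &t->mutex);
    }
    pthread_mutex_unlock(&t->mutex);
}

/* Demo thread standing in for a worker that completes one task. */
static void *tracker_demo_worker(void *arg) {
    tracker_task_finished((tracker_t *)arg);
    return NULL;
}
```

Inside the pool from this guide, the same counter could live in threadpool_t and reuse queue_mutex: increment it in threadpool_submit, decrement it in worker_function after task->function returns, and let threadpool_wait loop on it exactly as tracker_wait does.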

3. Dynamic Thread Count

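Monitor queue depth. If it exceeds a threshold, spawn additional workers up to a maximum. If workers are idle too long, let excess ones exit. Apache's threaded MPMs take a comparable approach, growing and shrinking their pool between configured spare-thread limits. Nginx, by contrast, runs a fixed number of worker processes.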

4. Task Priorities

Replace the linked list with a priority queue (heap). Higher-priority tasks get dequeued first.
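As a rough sketch of what that replacement could look like, here is a fixed-capacity binary max-heap of tasks. The ptask_t and task_heap_t types and the heap_ functions are hypothetical names for this example; locking is left out because in the pool these calls would run while queue_mutex is held.

```c
#include <stdlib.h>

/* Hypothetical prioritized task: larger priority values run first. */
typedef struct {
    void (*function)(void *arg);
    void *arg;
    int priority;
} ptask_t;

/* Fixed-capacity binary max-heap stored in an array. */
typedef struct {
    ptask_t *items;
    int size;
    int capacity;
} task_heap_t;

/* Returns 0 on success, -1 on bad capacity or allocation failure. */
static int heap_init(task_heap_t *h, int capacity) {
    if (capacity <= 0) {
        return -1;
    }
    h->items = malloc(sizeof *h->items * (size_t)capacity);
    h->size = 0;
    h->capacity = capacity;
    return h->items == NULL ? -1 : 0;
}

static void heap_free(task_heap_t *h) {
    free(h->items);
    h->items = NULL;
    h->size = 0;
}

static void heap_swap(ptask_t *a, ptask_t *b) {
    ptask_t tmp = *a;
    *a = *b;
    *b = tmp;
}

/* O(log n) insert. Returns 0 on success, -1 if the heap is full. */
static int heap_push(task_heap_t *h, ptask_t task) {
    if (h->size == h->capacity) {
        return -1;
    }
    int i = h->size++;
    h->items[i] = task;
    while (i > 0) {                       /* sift the new item up */
        int parent = (i - 1) / 2;
        if (h->items[parent].priority >= h->items[i].priority) {
            break;
        }
        heap_swap(&h->items[parent], &h->items[i]);
        i = parent;
    }
    return 0;
}

/* O(log n) removal of the highest-priority task.
 * Returns 0 and fills *out, or -1 if the heap is empty. */
static int heap_pop(task_heap_t *h, ptask_t *out) {
    if (h->size == 0) {
        return -1;
    }
    *out = h->items[0];
    h->items[0] = h->items[--h->size];
    int i = 0;
    for (;;) {                            /* sift the moved item down */
        int left = 2 * i + 1;
        int right = left + 1;
        int largest = i;
        if (left < h->size && h->items[left].priority > h->items[largest].priority) {
            largest = left;
        }
        if (right < h->size && h->items[right].priority > h->items[largest].priority) {
            largest = right;
        }
        if (largest == i) {
            break;
        }
        heap_swap(&h->items[i], &h->items[largest]);
        i = largest;
    }
    return 0;
}
```

Pushing tasks with priorities 1, 5, and 3 and then popping three times yields 5, 3, 1. One trade-off to keep in mind: a plain heap does not preserve submission order among tasks of equal priority, so a sequence number would be needed as a tie-breaker if FIFO behavior within a priority level matters.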

5. Thread-Local Storage

If workers need per-thread resources (database connections, buffers), use pthread_key_create with a destructor to automatically clean up when the thread exits.
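A minimal sketch of that pattern, assuming each worker wants a lazily allocated scratch buffer, might look like the following. SCRATCH_SIZE and all scratch_ names are hypothetical; the pthread_key_create, pthread_once, pthread_getspecific, and pthread_setspecific calls are the standard POSIX ones.

```c
#include <pthread.h>
#include <stdlib.h>

#define SCRATCH_SIZE 256   /* hypothetical per-thread buffer size */

static pthread_key_t scratch_key;
static pthread_once_t scratch_once = PTHREAD_ONCE_INIT;
static int scratch_key_status = -1;   /* result of pthread_key_create */

/* Destructor: called at thread exit with that thread's non-NULL value. */
static void scratch_free(void *ptr) {
    free(ptr);
}

/* Runs exactly once, no matter how many threads call scratch_get. */
static void scratch_key_init(void) {
    scratch_key_status = pthread_key_create(&scratch_key, scratch_free);
}

/* Returns the calling thread's buffer, allocating it on first use.
 * Returns NULL if the key or the buffer could not be created. */
static char *scratch_get(void) {
    pthread_once(&scratch_once, scratch_key_init);
    if (scratch_key_status != 0) {
        return NULL;
    }
    char *buf = pthread_getspecific(scratch_key);
    if (buf == NULL) {
        buf = calloc(1, SCRATCH_SIZE);
        if (buf != NULL && pthread_setspecific(scratch_key, buf) != 0) {
            free(buf);
            buf = NULL;
        }
    }
    return buf;
}

/* Hypothetical check handed to the demo thread below. */
typedef struct {
    const char *other_buf;   /* buffer address seen by another thread */
    int ok;                  /* set to 1 if this thread got its own buffer */
} scratch_check_t;

/* Demo thread: verifies that its buffer is stable across calls and
 * distinct from the buffer of the thread that created it. */
static void *scratch_demo_thread(void *arg) {
    scratch_check_t *check = (scratch_check_t *)arg;
    char *mine = scratch_get();
    check->ok = mine != NULL && mine == scratch_get() && mine != check->other_buf;
    return NULL;   /* scratch_free runs for this thread after it returns */
}
```

A task running on a pool worker would simply call scratch_get() and use the result. Note that the destructor runs when a thread terminates, so a buffer created on the main thread is only released this way if main ends with pthread_exit rather than a plain return.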

Performance Considerations

Choosing the right number of threads is critical. Here is a rule of thumb:

CPU-bound tasks: Set thread count equal to the number of CPU cores. Extra threads add context switches and cache pressure without adding throughput.
I/O-bound tasks: Use more threads than cores (2x to 10x) because threads spend most of their time waiting.
Mixed workloads: Profile and benchmark. There is no universal formula.
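The rule of thumb above can be turned into a small helper. This is only a sketch: suggested_thread_count, its io_multiplier parameter, and the MAX_SUGGESTED_THREADS cap are hypothetical, and _SC_NPROCESSORS_ONLN is a widely supported extension (Linux, macOS, the BSDs) rather than a name required by POSIX.

```c
#include <unistd.h>

#define MAX_SUGGESTED_THREADS 1024   /* hypothetical safety cap */

/* Hypothetical helper: derives a worker count from the number of
 * online CPUs. Pass 1 for CPU-bound work and a larger multiplier
 * (for example 2 to 10) for I/O-bound work. */
static int suggested_thread_count(int io_multiplier) {
    long cores = sysconf(_SC_NPROCESSORS_ONLN);
    if (cores < 1) {
        cores = 1;            /* sysconf failed or is unsupported */
    }
    if (io_multiplier < 1) {
        io_multiplier = 1;
    }
    long count = cores * io_multiplier;
    if (count > MAX_SUGGESTED_THREADS) {
        count = MAX_SUGGESTED_THREADS;
    }
    return (int)count;
}
```

The result can be passed straight to threadpool_create, for instance threadpool_create(suggested_thread_count(1)) for CPU-bound work. Treat it as a starting point for benchmarking rather than a final answer.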

Also keep in mind:

Minimize the time spent holding the mutex. While one thread holds it, every other thread that needs the queue is blocked.
Consider using a lock-free queue if profiling shows contention on the mutex. However, lock-free data structures are extremely difficult to implement correctly in C. Start with mutexes and only optimize if needed.
Use valgrind --tool=helgrind and valgrind --tool=drd to detect data races during development.

Complete Source Code

The complete implementation is the code from steps 1 through 5; there is no additional listing. For the simplest setup, paste the structs, the worker function, threadpool_create, threadpool_submit, and threadpool_destroy into one source file ahead of main, which is what the compile command in step 6 assumes. To reuse the pool across a project, put the struct definitions and function prototypes in threadpool.h and the function bodies in threadpool.c, or adapt the layout to your build system.

For a ready-made open source version, check out the C-Thread-Pool project on GitHub by Pithikos, which follows a similar architecture.

Frequently Asked Questions

What is a thread pool in C pthreads?

A thread pool is a group of pre-created POSIX threads that wait for tasks to be submitted to a shared queue. Instead of creating a new thread for each task (which is expensive), the pool reuses existing threads. Tasks are enqueued by the producer, and idle worker threads dequeue and execute them. Synchronization is handled with pthread_mutex_t and pthread_cond_t.

How many threads should I use in my thread pool?

For CPU-bound workloads, use a number equal to the CPU core count. For I/O-bound workloads, 2 to 10 times the core count is a reasonable starting range because threads spend most of their time blocked on I/O. Profile your specific application to find the optimal number.

What is a spurious wakeup and why does it matter?

A spurious wakeup occurs when pthread_cond_wait returns even though no signal or broadcast was sent. The POSIX standard explicitly permits this behavior. If your code assumes that waking up means a task is available, it will crash or misbehave. Always re-check the condition in a while loop after waking.

Can I use this thread pool on Windows?

Not directly. The code uses POSIX threads, which are native to Linux, macOS, and other Unix-like systems. On Windows, you can use a pthreads compatibility layer like pthreads-win32, or rewrite the synchronization using Windows native APIs (CreateThread, CRITICAL_SECTION, CONDITION_VARIABLE).

How do I handle errors inside task functions?

The thread pool itself does not handle errors from tasks, and C has no exceptions that could propagate them. You should design your task functions to detect and log their own errors. If you need to communicate results or errors back to the submitter, use a shared results structure protected by its own mutex, or use a callback mechanism.
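As one possible shape for this, the sketch below gives each task its own result slot instead of a single shared structure. That is a simplification chosen for the example: no extra mutex is needed as long as the submitter reads a slot only after the task has finished, for instance after threadpool_destroy has joined the workers. The parse_job_t type and parse_task function are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical per-task job: the submitter owns this struct and reads
 * value and error only after the task has completed. */
typedef struct {
    const char *text;   /* input: decimal number as a string */
    long value;         /* output: parsed number, 0 on failure */
    int error;          /* output: 0 on success, errno-style code otherwise */
} parse_job_t;

/* Task function: records failures in the job instead of reporting
 * them to the pool, which has no channel for return values. */
static void parse_task(void *arg) {
    parse_job_t *job = (parse_job_t *)arg;
    char *end = NULL;

    job->value = 0;
    errno = 0;
    long parsed = strtol(job->text, &end, 10);

    if (end == job->text || *end != '\0') {
        job->error = EINVAL;     /* no digits, or trailing garbage */
    } else if (errno == ERANGE) {
        job->error = ERANGE;     /* number does not fit in a long */
    } else {
        job->value = parsed;
        job->error = 0;
    }
}
```

Running parse_task on a job whose text is "42" leaves value at 42 and error at 0, while "abc" produces EINVAL. If many tasks must append to one shared collection instead, that collection needs its own mutex, as described above.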

Is it safe to submit tasks from multiple threads simultaneously?

Yes. The threadpool_submit function locks the mutex before modifying the queue, so it is safe to call from any number of threads concurrently.

How do I compile C code with pthreads?

On GCC and Clang, add the -pthread flag to both compile and link commands:

gcc -o myprogram myprogram.c -pthread

This flag defines the necessary macros and links the pthreads library automatically.