s***@gmail.com
2008-07-15 01:01:55 UTC
Hi,
I am designing a Thread Manager that runs n Worker threads in batch
mode.
The Manager runs continuously and starts the next n Worker threads
once all threads in the previous batch have returned.
I am using a condition variable and signals to synchronize the whole
project.
My design:-
Manager:-
    int worker_count = 0
    pthread_mutex_lock(worker_mutex)
    pthread_create -> all_workers in batch (worker_count updated to n)
    worker_stopped = 0
    for(;;)
    {
        while(worker_stopped == 0)
            pthread_cond_wait(worker_stop_cond, worker_mutex)
        pthread_join(worker_id, arg)
        worker_count--
        if(worker_count == 0)
            pthread_create -> all_workers in next batch (worker_count updated to n)
        worker_stopped = 0
    }
    pthread_mutex_unlock(worker_mutex)
Worker "i" :-
    Do some work
    pthread_mutex_lock(worker_mutex)
    worker_stopped = pthread_self()
    pthread_cond_signal(worker_stop_cond)
    pthread_mutex_unlock(worker_mutex)
    pthread_exit(arg)
The problem is that I am losing some signals, which leads to a
deadlock (the manager just hangs in the waiting state even after all
workers have returned).
I debugged the program, and it looks like the problem occurs if, say,
worker "B" locks "worker_mutex" after worker "A" has updated
"worker_stopped" and unlocked "worker_mutex", but before the Manager
could lock "worker_mutex" again. B then overwrites "worker_stopped",
so the Manager never sees A's value.
It looks like a race between the Manager and the other stopped
workers to grab the mutex after one worker releases it.
One solution I could think of is to make a stopped worker signal only
when "worker_stopped" is 0, but is there a better solution to the
problem?
I am not sure if my analysis is correct and wanted to get the opinion
of the community.
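To make the question more concrete, here is a minimal sketch of one alternative to the single "worker_stopped" slot: each worker increments a counter under the mutex, and the manager waits until the counter reaches the batch size before joining. A counter cannot be overwritten by a second worker the way a single thread ID can. The names finished_count, work_done and run_batches are hypothetical and not part of my real program, and the worker body is only a placeholder.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t worker_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t worker_stop_cond = PTHREAD_COND_INITIALIZER;
static int finished_count = 0;  /* hypothetical: workers finished in this batch */
static long work_done = 0;      /* hypothetical stand-in for real work */

static void *worker(void *arg)
{
    (void)arg;
    /* "Do some work" would go here. */
    pthread_mutex_lock(&worker_mutex);
    work_done++;
    finished_count++;  /* accumulates, so no completion can be overwritten */
    pthread_cond_signal(&worker_stop_cond);
    pthread_mutex_unlock(&worker_mutex);
    return NULL;
}

/* Runs `batches` batches of `n` workers each.
 * Returns the total number of completed workers, or -1 on error. */
long run_batches(int batches, int n)
{
    pthread_t *ids = malloc((size_t)n * sizeof *ids);
    if (ids == NULL)
        return -1;
    work_done = 0;  /* no workers exist yet, so no lock is needed */

    for (int b = 0; b < batches; b++) {
        pthread_mutex_lock(&worker_mutex);
        finished_count = 0;
        pthread_mutex_unlock(&worker_mutex);

        /* The mutex is NOT held while the threads are created. */
        int created = 0;
        while (created < n &&
               pthread_create(&ids[created], NULL, worker, NULL) == 0)
            created++;

        /* Wait on the predicate, not on the signal itself. */
        pthread_mutex_lock(&worker_mutex);
        while (finished_count < created)
            pthread_cond_wait(&worker_stop_cond, &worker_mutex);
        assert(finished_count == created);
        pthread_mutex_unlock(&worker_mutex);

        for (int i = 0; i < created; i++)
            pthread_join(ids[i], NULL);

        if (created < n) {  /* a pthread_create call failed */
            free(ids);
            return -1;
        }
    }
    free(ids);
    return work_done;  /* safe to read: every worker has been joined */
}
```

Calling run_batches(3, 4) would run three batches of four workers. If the manager only needs to know that the whole batch is done, joining every thread in turn would also work without any condition variable; the counter is kept here only to stay close to my design.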
I also wanted to know whether it is "OK" to lock the mutex before
launching the threads. I am doing it so that no thread exits and
signals before the manager calls pthread_cond_wait. Is there a better
way to implement this synchronization?
Looking forward to some helpful suggestions.
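As far as I understand it, a waiter that checks a predicate under the mutex before calling pthread_cond_wait does not care whether the signal was sent before it started waiting, which would make holding the mutex across pthread_create unnecessary. A small sketch of that assumption follows; start_mutex, start_cond, done, early_worker and wait_after_late_start are hypothetical names, and the 50 ms delay exists only to make it very likely that the worker signals first.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

static pthread_mutex_t start_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t start_cond = PTHREAD_COND_INITIALIZER;
static int done = 0;  /* predicate protected by start_mutex */

static void *early_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&start_mutex);
    done = 1;                          /* the state change is what matters */
    pthread_cond_signal(&start_cond);  /* possibly nobody is waiting yet */
    pthread_mutex_unlock(&start_mutex);
    return NULL;
}

/* Returns 0 once the worker's completion has been observed, -1 on error. */
int wait_after_late_start(void)
{
    pthread_t id;
    struct timespec delay = {0, 50 * 1000 * 1000};  /* 50 ms */

    pthread_mutex_lock(&start_mutex);
    done = 0;
    pthread_mutex_unlock(&start_mutex);

    /* The mutex is not held while the thread is created. */
    if (pthread_create(&id, NULL, early_worker, NULL) != 0)
        return -1;

    /* Arrive late on purpose, so the signal has probably been sent. */
    nanosleep(&delay, NULL);

    pthread_mutex_lock(&start_mutex);
    while (!done)  /* if done is already 1, the wait is skipped entirely */
        pthread_cond_wait(&start_cond, &start_mutex);
    pthread_mutex_unlock(&start_mutex);

    pthread_join(id, NULL);
    return 0;
}
```

If this reasoning is right, wait_after_late_start() returns 0 whichever thread gets the mutex first, because the manager never blocks once done is set.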
Thanks