Discussion:
making use of multiple processors (multicore)
scott
2007-01-10 15:37:09 UTC
I'm looking to develop a cross platform (Windows / OSX / Linux)
project. It will require two threads both of which will be very
processor time intensive. There will be a need for some data sharing
between the two threads. Now, due to the intensive processor usage,
on a multi core system I'd like to take advantage of the extra cores.
I'm familiar with multithreaded programming on single core systems, but
I do not know how that may apply to a multi core system (seeing as I
work in the desktop area, which has only recently started to see multi
core systems). How can I be sure I'm taking advantage of extra cores
on a system that has them?

Do I continue with what I would consider regular threading, which can
share a memory space between threads and data sharing is controlled
with mutexes, etc? I'm not familiar enough with real parallel
computing to know whether or not a mutex works just as well at locking
data shared by threads on different cores as it does on a single core
solution. In fact, I'm not even sure that a single process can start
threads on different cores.

Do I need some other solution, whereby my threads are separate
processes in themselves and therefore may not have a shared memory
space, instead sharing data through some other means?

In either instance I would want the power to decide and force the two
tasks onto separate cores. Whilst I appreciate that much of the time
you may want the OS to determine what core to put a process/thread on
when it's started, I will know beforehand what the processor loading
is. Additionally, the software will be run on a closed, clearly
defined system, so I will not need to worry about what other loads may
be placed upon it by other users.

Any advice and insight would be much appreciated, thanks.
llothar
2007-01-10 19:56:16 UTC
From the level of an application programmer there is no difference
between multi-CPU and multi-core.

And this question shows that you really need to buy a good book on
multithreaded programming. My recommendation is the Pthreads
programming book from Sun.
Gianni Mariani
2007-01-10 20:52:57 UTC
Post by scott
I'm looking to develop a cross platform (Windows / OSX / Linux)
project. It will require two threads
...
Post by scott
Any advice and insight would be much appreciated, thanks.
For C++ there are a number of thread frameworks that are cross platform.
boost, Austria (shameless plug), ACE are a few.

For C, I believe there is a pthreads look-alike for Win32. I don't
remember its name, but I'm sure someone else on this list can fill you in.

Make sure you understand how to write MT code.
a) reliably/robustly avoid deadlock
b) minimize memory contention
c) avoid race conditions

This usually means messaging, or passing "messages" between threads.

Now if you're using threads for the sake of performance, make sure you
use the CPU's cache to your best advantage; this alone can have a 100x
or more performance benefit.

Oh, yeah, read up a lot.
r***@yahoo.com
2007-01-12 01:43:16 UTC
Post by scott
I'm looking to develop a cross platform (Windows / OSX / Linux)
project. It will require two threads both of which will be very
processor time intensive.
...
Any advice and insight would be much appreciated, thanks.
Certainly Windows and Linux will, by default, run threads from a single
process on whichever cores are available in an SMP system. I expect
OSX to do the same.

The basic approach of protecting shared data items with mutexes of some
sort works on both single and multiple processor systems (although
there are other approaches). Still, there's an old saying that
multi-threaded code isn't working until it's been run on a
multiprocessor system. The problem is that there are many threading
effects you can see on a multi-processor system that you don't see on a
single core system running multiple threads. On x86, many of those
have to do with the fact that instruction execution (particularly with
regard to memory updates) is atomic on a single processor but *not* on
a multi-CPU system unless special precautions are taken. For example,
the x86 instruction "inc memloc" when executed in two threads on a
single core system will always add two to memloc. On a dual-processor
system, both incs can run in parallel and can step on each other (e.g.,
both can fetch the old value, then increment it, and then both can
store the "new" value back, which means you've lost one of the
increments).

Even worse, running multiple threads on a single processor often masks
unserialized data accesses because thread switches almost always occur
in "nice spots" (when your thread blocks, which is "safe" almost by
definition - preemptive switches usually being very rare). Drop the
code on a real multiprocessor and suddenly threads actually *are*
running in parallel, and race conditions of all sorts pop up. Consider
the sequence: (acquire mutex, muck around with shared object for 50
instructions, release mutex). On a single CPU system the 50
instructions will almost always execute without interruption even if
you omitted the mutex entirely - only if a preemptive task switch hits
right in that section will there be any chance for two updates to
happen in parallel. On a multi-core system running through there in
parallel is *much* more likely, and if the mutexes are omitted (or
inadequate), you have a vastly greater chance of actually seeing a
failure. With the mutex correctly in place, the code is fine on both
types of system.

Now all that is really not a design problem, but rather latent bugs in
your code, which just happen to be masked by the much kinder single-CPU
environment. So the basic issue will be lots of testing, and some code
review to make sure all the synchronization you need is really in
place.

While you can force a thread to run on a particular CPU, it's usually
(there are exceptions) not a great idea; most people make things worse.
The OS will dispatch a runnable thread on an available CPU, so if you
have two runnable threads and two CPUs they're run on both CPUs
automatically. It does *not* depend on which CPU the thread was
started on. Playing with thread/processor affinity can help with cache
usage (bouncing a thread from one CPU to the other can be expensive
since all the cached data has to migrate to the new CPU). But if
you set limits on what can go where you can easily end up with
situations where you have two threads ready to run but limited to one
of the CPUs, and the other CPU without any eligible work. Save that
sort of thing for later optimization if you determine that it would
actually be useful.
Chris Thomasson
2007-01-16 06:58:22 UTC
Post by scott
Any advice and insight would be much appreciated, thanks.
http://groups.google.com/group/comp.programming.threads/msg/fdc665e616176dce

http://groups.google.com/group/comp.programming.threads/browse_thread/thread/e0c011baf08844c4/3ca11e0c3dcf762c?lnk=gst&q=multi-mutex&rnum=1#3ca11e0c3dcf762c

That's about all you need to do good multithreading. Well, if you need
message passing, at least use a top-notch queue algorithm:

http://appcore.home.comcast.net/
Chris Thomasson
2007-01-16 06:59:45 UTC
Post by Chris Thomasson
Post by scott
Any advice and insight would be much appreciated, thanks.
http://groups.google.com/group/comp.programming.threads/msg/fdc665e616176dce
http://groups.google.com/group/comp.programming.threads/browse_thread/thread/e0c011baf08844c4/3ca11e0c3dcf762c?lnk=gst&q=multi-mutex&rnum=1#3ca11e0c3dcf762c
That's about all you need to do good multithreading. Well, if you need
http://appcore.home.comcast.net/
I would also suggest that you drop pthread condvars in favor of eventcounts:

http://groups.google.com/group/comp.programming.threads/browse_thread/thread/aa8c62ad06dbb380/8e7a0379b55557c0?lnk=gst&q=simple+portable+eventcount&rnum=1&hl=en#8e7a0379b55557c0