OpenMP is a parallelisation method for processors and cores that share the same memory, i.e. in our
case the twelve cores that live on a single node. The most common use is parallelising do loops.
The node runs a single process, which spawns multiple threads when it reaches a parallel region. Rather than
modifying your actual Fortran or C loops, you add directives to say you want them to be executed in parallel.
There is an excellent OpenMP tutorial at the Livermore Computing Center,
and I have put a very simple example inside the PBS example directory /usr/local/examples/qsub_script/.
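As a rough illustration of what "adding a directive" means, here is a minimal sketch in C (the Fortran equivalent puts an !$omp parallel do directive in front of the do loop). The function name scale and its arguments are made up for this example and are not taken from the files in the example directory.

```c
/* Multiply every element of x by a.
   Each iteration touches only x[i], so the iterations are independent
   and OpenMP can hand different ranges of i to different threads. */
void scale(double *x, int n, double a)
{
    int i;  /* loop counter: automatically private to each thread */

    /* The directive is the only OpenMP-specific line; the loop itself is unchanged. */
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        x[i] = a * x[i];
    }
}
```

If the code is compiled without the OpenMP option, the directive is simply ignored and the loop runs serially with the same result.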
Usage
- Use -openmp to compile with the Intel compilers (newer releases of the Intel compilers spell this option -qopenmp).
- Typically you will use one thread per core (i.e. twelve), but you could try twice or half as many.
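The thread count is normally controlled with the standard OMP_NUM_THREADS environment variable. To check what a program will actually get, something like the following sketch can help. The function name available_threads is hypothetical; omp_get_max_threads and the _OPENMP macro are part of standard OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Return the number of threads a parallel region would use by default,
   or 1 if the code was compiled without OpenMP support. */
int available_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

Printing this value at the start of a run is a cheap way to confirm that the job really is using the number of threads you intended.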
Independent loop iterations
In general, your loop iterations must be independent of each other to parallelise, i.e. it mustn't matter in which order
the individual iterations occur. A quick and dirty test you can do before trying OpenMP is to reverse the order of
a do loop, e.g. try replacing:
do j = 1, n
with
do j = n, 1, -1
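If your results change, the iterations aren't independent! (The reverse is not guaranteed: unchanged results are a good sign, but they do not prove independence.)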
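To see the reversal test catch a dependency, here is a small sketch in C. Both function names are invented for this illustration. The loop body reads a[i - 1], which an earlier iteration has already modified, so the order of iterations matters.

```c
/* Running sum: a[i] depends on the already-updated a[i - 1],
   so the iterations are NOT independent. */
void running_sum_forward(double *a, int n)
{
    int i;
    for (i = 1; i < n; i++) {
        a[i] = a[i] + a[i - 1];
    }
}

/* The same loop body with the iteration order reversed.
   Here a[i - 1] has not been updated yet when it is read. */
void running_sum_reversed(double *a, int n)
{
    int i;
    for (i = n - 1; i >= 1; i--) {
        a[i] = a[i] + a[i - 1];
    }
}
```

Starting from the array 1, 2, 3, 4, the forward loop gives 1, 3, 6, 10 while the reversed loop gives 1, 3, 5, 7, so this loop cannot be parallelised as it stands.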
Other tips
- You need to decide which variables are "private" to each thread (i.e. each thread has its own separate copy) and which are shared between threads. For example, in a parallelised do loop the loop counter variable must obviously be private.
- There is a special reduction clause for variables which are simply a sum or product over a loop; see the program /usr/local/examples/qsub_script/pi3-openmp.f for a simple example.
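The two tips above can be sketched together in C. This is my own illustration, not a copy of pi3-openmp.f, and the function name estimate_pi is hypothetical. It estimates pi by the midpoint rule applied to the integral of 4/(1+x^2) from 0 to 1.

```c
/* Estimate pi with n midpoint-rule strips. */
double estimate_pi(int n)
{
    double h = 1.0 / n;  /* strip width: shared, only read inside the loop */
    double sum = 0.0;    /* accumulator: combined across threads by the reduction */
    double x;            /* scratch variable: each thread needs its own copy */
    int i;               /* loop counter: private automatically */

    #pragma omp parallel for private(x) reduction(+:sum)
    for (i = 0; i < n; i++) {
        x = (i + 0.5) * h;            /* midpoint of strip i */
        sum += 4.0 / (1.0 + x * x);
    }
    return h * sum;
}
```

Without private(x), all threads would write to the same x and overwrite each other's values; without reduction(+:sum), their updates to sum would race. With the reduction clause each thread keeps a partial sum and OpenMP adds the partial sums together at the end of the loop.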
Running OpenMP jobs on Zen
See the example job file at /usr/local/examples/qsub_script.
--
JohnRowe - 08 Jan 2008