r47 - 06 Nov 2017 - 10:45:31 - DavidAcreman

Running Jobs

Work is submitted to Zen by putting it into a queuing system which allocates compute nodes to run your job. The qstat command shows what is in the queues (type man qstat to find out more). To delete a job from the queue, use the qdel command. To see how many nodes are free, use the allfree command.

The job file

To submit a job to the queue you will need to set up a job file which describes how the job should be run. Example job files can be found in /usr/local/examples/qsub_script on the login node. To submit the job file to the queue use the command

qsub jobfile

Jobs should be set up so that the output is written to /data (see FileSystems for more information on how to use Zen's disk space). You will need to include a line in the job file which changes into the correct working directory, as the job file will run in your home directory by default. The environment variable PBS_O_WORKDIR is set to the directory from which the job was submitted, so to run a job in the directory from which it was submitted include the command cd ${PBS_O_WORKDIR} in the job file.
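
Putting these pieces together, a minimal job file might look like the sketch below. The job name, resource request and wall time are illustrative assumptions rather than site recommendations (the nodes=1:ppn=12 request is borrowed from the interactive example on this page). The examples in /usr/local/examples/qsub_script remain the better starting point.

```shell
#!/bin/bash
# Illustrative job file; the name and resource values are assumptions.
#PBS -N example_job
#PBS -l nodes=1:ppn=12
#PBS -l walltime=01:00:00

# PBS_O_WORKDIR is set by the queuing system to the submission directory.
# Fall back to the current directory so the script can be tried by hand.
cd "${PBS_O_WORKDIR:-$PWD}"

echo "Job running in $(pwd)"
```

Submitting this file with qsub from a directory under /data would then run the job, and write its output, in that directory.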

If you are running serial jobs see MultipleSerialJobs for the best way to run several serial jobs on a node.

When running a number of similar jobs it may be useful to set them up as a Job array.
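
As a rough sketch of what a job array can look like, and assuming Zen's queuing system is Torque (which this page does not state), the -t option requests a range of array indices and each sub-job reads its own index from PBS_ARRAYID. Other PBS variants use different option and variable names. The index range and the input file naming below are hypothetical.

```shell
#!/bin/bash
# Assumes Torque: -t requests array indices 1 to 10 (hypothetical range).
#PBS -t 1-10
#PBS -l walltime=01:00:00

cd "${PBS_O_WORKDIR:-$PWD}"

# Torque sets PBS_ARRAYID for each sub-job; default to 1 when run by hand.
TASK_ID="${PBS_ARRAYID:-1}"

# Hypothetical naming scheme: one input file per array index.
INPUT_FILE="input_${TASK_ID}.dat"
echo "Sub-job ${TASK_ID} would process ${INPUT_FILE}"
```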

Wall time

Your job file should specify how long your job needs to run for. For example, if you know that your job will complete in under 48 hours, include the line

#PBS -l walltime=48:00:00

If a job reaches the specified wall time it will be terminated, so make sure you request enough time when you submit the job. However, jobs which request shorter wall times are more likely to be scheduled to run, so make sure your wall time is realistic (i.e. don't request a month if you need a day). If you do not specify a wall time in the job file then the job will be given a default wall time of one day. The maximum wall time is 168 hours. The wall time is the time measured by a clock on the wall, rather than any other measure of time (e.g. CPU time).

Queues

The job will be routed to an appropriate queue depending on the wall time specified, as shown in the table below.

Queue name   Maximum wall time   Number of nodes available to the queue
long         168 hours           128
medium       72 hours            160
short        24 hours            162

The number of nodes available varies according to how long the job needs to run for because some nodes are reserved for running shorter jobs.
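
The routing rule can be sketched as a small shell function. This is only an illustration of the table above, not the queuing system's actual configuration. The function name queue_for_hours is hypothetical, and it assumes that 168 hours is the current maximum wall time and that a job requesting exactly a queue's limit still fits in that queue.

```shell
#!/bin/bash
# Hypothetical helper: print the queue a job would be routed to, given a
# wall time in whole hours. Limits are taken from the table on this page,
# with 168 hours assumed to be the current maximum.
queue_for_hours() {
    local hours="$1"
    if [ "$hours" -le 24 ]; then
        echo "short"
    elif [ "$hours" -le 72 ]; then
        echo "medium"
    elif [ "$hours" -le 168 ]; then
        echo "long"
    else
        echo "over the maximum wall time" >&2
        return 1
    fi
}

queue_for_hours 48   # prints "medium"
```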

Receiving emails

You can set up the job file so that an email is sent when the job starts, aborts or ends. Use the -m option to control when emails are sent and use -M to set your email address.
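
For example, the following job file lines would request an email when the job aborts (a), begins (b) and ends (e). The address is a placeholder to be replaced with your own.

```shell
# Send mail on abort (a), begin (b) and end (e).
#PBS -m abe
# Placeholder address; replace with your own.
#PBS -M your.name@example.com
```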

Interactive sessions

An interactive session on a compute node can be set up using qsub -I, for example to run a 10-hour interactive session on one node:

qsub -I -l nodes=1:ppn=12 -l walltime=10:00:00

MPI

If your job uses MPI you will need to load the appropriate MPI module to ensure that the necessary commands (e.g. mpirun, mpif90) are available. The most recent versions of Intel MPI and the Intel compilers can be made available by running module load ics2013sp1. See CompilingAndLinking for more information on loading modules. For more information on Intel MPI see the Getting Started Guide (/opt/intel/impi/4.1.1.036/doc/Getting_Started.pdf) and the Reference Manual (/opt/intel/impi/4.1.1.036/doc/Reference_Manual.pdf).

If you need to specify how many MPI processes should run on a node then you can use the -perhost argument with the mpiexec command, e.g. to run using 4 MPI processes per node:

mpiexec -perhost 4 -genv I_MPI_DEVICE rdma:OpenIB-cma -np $NUMPROCS $MYBIN
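
The command above leaves NUMPROCS and MYBIN to be defined in the job file. One possible way to set NUMPROCS is sketched below, under the assumption that the node file may list a node more than once (for instance once per requested core): count the distinct nodes and multiply by the per-node process count. The function name count_procs is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: MPI process count = distinct nodes x processes per node.
count_procs() {
    local nodefile="$1" perhost="$2"
    local nnodes
    # sort -u collapses repeated host names to one line per node.
    nnodes=$(sort -u "$nodefile" | wc -l)
    echo $(( nnodes * perhost ))
}

# Inside a job: NUMPROCS=$(count_procs "$PBS_NODEFILE" 4)
```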

The queuing system allocates nodes to run your job and the MPI library uses this information so that it knows which nodes to use. A list of nodes allocated to your job can be found in the file listed in the $PBS_NODEFILE environment variable (e.g. adding the line cat $PBS_NODEFILE to your job script lists the nodes which will be used). The order in which MPI processes are assigned to nodes can be specified using the -machinefile option to the mpiexec command. For example

mpiexec -machinefile mpd_hosts.txt ...

will start MPI processes on nodes in the order specified in the file mpd_hosts.txt.
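
One way to produce such a file, sketched here as an assumption rather than a site-recommended recipe, is to derive it from $PBS_NODEFILE, listing each allocated node once in the order it first appears. The function name unique_hosts is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print each host from a node file once, in order of
# first appearance (awk prints a line only the first time it is seen).
unique_hosts() {
    awk '!seen[$0]++' "$1"
}

# Inside a job: unique_hosts "$PBS_NODEFILE" > mpd_hosts.txt
```

The resulting file can then be reordered by hand or by script before it is passed to mpiexec, if a different placement is wanted.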

Scheduling

Zen uses the Maui scheduler to determine which jobs to run, when to run them and which nodes to use. Maui uses advance reservations to schedule the highest priority jobs for a future time and can backfill with shorter, lower node count jobs where possible (e.g. while waiting for 8 nodes to become free in order to run a high priority job it may be possible to run a short, single node job). Running the allfree command on the login node shows the current number of available nodes and the backfill window.

-- DavidAcreman - 11 Jun 2008
