If you have a problem with a job on Zen, first check the job's output for errors, warnings or signs of unusual behaviour. If it looks like a system problem rather than a problem with your code, please note the job ID and include it when you report the problem.

Why is my job still queued?

The command:

tracejob 1234

gives a history of what the queueing system has tried to do with your job, where 1234 is the job ID.

Is my job actually running?

We have written a simple script to show the load average of the nodes running your job. Just type:

nodestat 1234

or:

nodestat r1i2n3

if you wish to examine a particular node. The load averages shown are the average number of processes in the run queue over the last 1, 5 and 15 minutes.

You may also log in to a node with ssh, but remember that you will be logged out if somebody else's job starts on that node.
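As a rough guide, a node kept busy by a compute-bound job usually shows a load average close to the number of cores the job is using, while a load near zero suggests the job is idle or stalled. The sketch below illustrates that comparison. The helper name check_load, the 0.5 and 1.5 thresholds and the 8-core node are all assumptions made for illustration; they are not part of nodestat.

```shell
# Hypothetical helper (not part of nodestat): classify a load average
# against the number of cores on the node.
# Usage: check_load <load_average> <cores>
check_load() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        if (load < 0.5 * cores)      print "underused"   # threshold is an assumption
        else if (load > 1.5 * cores) print "overloaded"  # threshold is an assumption
        else                         print "busy"
    }'
}

# Example: an assumed 8-core node with a 1-minute load average of 7.9
check_load 7.9 8
```

A result of "underused" for a job that should be using the whole node is a hint to look at the job's output before reporting a problem.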

-- JohnRowe - 28 Jan 2008

How do I kill a job?

To stop a job, whether it is still waiting in the queue or already running, just type:

qdel 1234

where 1234 is your job number, found in the leftmost column of your job's row in the output of qstat.
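If you have several jobs, it can help to list just their job numbers before deciding which to delete. The following is a minimal sketch that pulls the first column out of qstat-style output. The helper name job_ids and the sample listing (including the host suffix, user name and queue name) are made up for illustration, and the real qstat layout on Zen may differ.

```shell
# Hypothetical helper: print the job IDs (first column) from qstat-style
# output read on standard input, skipping header and separator lines.
job_ids() {
    awk '$1 ~ /^[0-9]+/ { print $1 }'
}

# Illustrative sample only; real qstat output on Zen may be laid out differently.
sample='Job id            Name     User     Time Use S Queue
----------------- -------- -------- -------- - -----
1234.zen          myjob    jbloggs  00:10:00 R workq
1235.zen          myjob2   jbloggs         0 Q workq'

printf '%s\n' "$sample" | job_ids
# On the cluster itself one might instead run:  qstat -u "$USER" | job_ids
```

Check the list by eye before passing any of the numbers to qdel, since a deleted job cannot be recovered.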

-- ThomasHaworth - 15 Aug 2010

History: r4 - 15 Aug 2010 - 09:35:58 - ThomasHaworth
 