Torque provides a management system for job processing. It is
based on the Portable Batch System (PBS), which itself replaced the
earlier Network Queueing System (NQS). You will come across the PBS
acronym from time to time in documentation such as this.
Jobs are submitted to job queues, and are then scheduled to run
immediately or at some later time, depending on how busy the overall
system is. Users submit the jobs, but Torque decides when a job starts,
chooses which worker node the job runs on, makes sure the job doesn't
overstay its requested time, and manages the return of output files to
the job submitter.
You, the user, provide a script consisting of commands to be processed. The script might contain just a single command, such as the name of another script (perhaps with some options or parameters) or the name of a pre-compiled binary file to be run. Or it might contain a mixture of control statements and commands. The script is submitted using the qsub command.
Here's a simple example of job submission:
qsub myjob
where myjob is a file you prepared earlier with a simple text-mode editor, containing the following line:
echo Hello World
That's worth giving a try as a first job. But here's a perhaps more likely content of your file myjob:
#PBS -l cput=7200,walltime=7200
#PBS -j oe
cd $PBS_O_WORKDIR
gcc -o mybin mysource.c
./mybin
echo See you later
The first two lines here set job options, which could alternatively have been supplied as options on the qsub command. They have to come before any command in the script, because qsub stops looking for options at the first command it meets. The first line sets a resource limit (with the -l option, lowercase L) of 7200 seconds of processor time and of wall (elapsed) time. The second line requests that command errors are merged with the standard output in a single file. The third line changes the current directory in the job from the home directory to the one that was current when the job was submitted. The next line compiles a C source with the gcc command, to produce a binary file called mybin. The line after that runs that binary file, and the last line prints a closing message.
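The alternative mentioned above, supplying the options on the qsub command instead of inside the script, might look like the sketch below. It assumes a Bourne-style shell; the file name myjob_noopts is made up for illustration, and the check for qsub is only there so that nothing fails on a machine without Torque.

```shell
#!/bin/sh
# Write a job script that contains only commands, with no #PBS lines.
# The quoted 'EOF' stops the shell from expanding $PBS_O_WORKDIR now;
# it is expanded later, inside the running job.
cat > myjob_noopts <<'EOF'
cd $PBS_O_WORKDIR
gcc -o mybin mysource.c
./mybin
EOF

# Submit only if qsub is actually installed on this machine.
if command -v qsub >/dev/null 2>&1; then
    # Same resource limits and merged output as the #PBS lines would give.
    qsub -l cput=7200,walltime=7200 -j oe myjob_noopts
else
    echo "qsub not found; myjob_noopts was written but not submitted"
fi
```

Keeping the options in the script has the advantage that they travel with the job; putting them on the command line is handy for a one-off change.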
When the qsub command is typed in, it replies with a jobid containing
the number of the job. As soon as the job has finished running, you will
find a new file in the directory, or two files if you didn't request
merged errors, with name(s) by default based on the jobid. You can look
at that output using the cat, more or less command (for more or less,
use the space-bar to scroll through the file, and q to quit).
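As a rough sketch of how the default output name relates to the jobid: Torque normally names the output file after the job name (by default the script name), followed by .o and the job number. The jobid 12345.headnode used here is a made-up value for illustration, not real output, and the naming on your system may have been configured differently.

```shell
#!/bin/sh
# Hypothetical jobid, in the form qsub prints it: <number>.<server name>.
jobid="12345.headnode"

# Strip everything from the first dot onwards to get the job number.
jobnum=${jobid%%.*}

# Default name for standard output (and errors too, when merged with -j oe).
outfile="myjob.o${jobnum}"
echo "Expecting output in: $outfile"

# Page through the file once the job has finished and the file exists.
if [ -f "$outfile" ]; then
    less "$outfile"
fi
```

If errors were not merged, the second file would by default carry .e in place of .o.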
For more qsub options, see the manual pages for qsub, by entering the command man qsub.
To check the status of your jobs, use the qstat command. It can be used with various options. A qstat command on its own summarises jobs, one line per job. With the -a option (qstat -a), a similar summary is shown in an alternative output, giving some info on requested resources. With the -f option (qstat -f), it shows extensive details for those jobs, or one particular job if you add its jobid.
Jobs can be in a Queued status, or Running. For other states and
options, consult the manual pages for qstat, by entering the command man qstat.
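The qstat variants described above can be sketched as follows. The jobid is again a made-up placeholder, the helper function describe_state is hypothetical and covers only the two states mentioned here (qstat shows them as the single letters Q and R), and the guard simply skips the queries where qstat is not installed.

```shell
#!/bin/sh
# Translate the one-letter state shown by qstat into a word.
# Only the two states discussed above are handled; see man qstat for the rest.
describe_state() {
    case "$1" in
        Q) echo "Queued" ;;
        R) echo "Running" ;;
        *) echo "Other (see man qstat)" ;;
    esac
}

describe_state Q
describe_state R

# Hypothetical jobid for illustration; use the one qsub printed for your job.
jobid="12345.headnode"

if command -v qstat >/dev/null 2>&1; then
    qstat               # one-line summary per job
    qstat -a            # alternative summary with requested resources
    qstat -f "$jobid"   # full details for one job
else
    echo "qstat not found; skipping status queries"
fi
```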