Parallel tasks without MPI
Sometimes, some tasks can be usefully performed in parallel without the
need to use MPI, and for these the pbsdsh
command is useful. Here is an example of an 8-processor job using pbsdsh:

#PBS -l nodes=4:ppn=2
#PBS -l walltime=5:00:00,cput=20:00:00
#PBS -j oe
.... initial processing ....
pbsdsh -v $PBS_O_WORKDIR/myscript
.... final processing ....

Since the same "myscript" is run on each of the processor cores of a
job, that script needs to be clever enough to decide what its role is.
Of course, if the task is identical on every processor core, then
that's simple. But in the case where each processor core should be
doing a different task, you can make use of an environment
variable called $PBS_VNODENUM. This variable takes a value from 0 to
c-1, where c is the number of processor cores allocated to the job, and
is set by the Torque system when it invokes the pbsdsh'd script on each
core. So if you have pre-prepared several lower-level scripts named
mysub.0 to mysub.7, your file "myscript" might contain:

$PBS_O_WORKDIR/mysub.$PBS_VNODENUM

or, if you have pre-prepared a program myprog and a set of different
data-files, mydata.0 to mydata.7, for the tasks, then

myprog < mydata.$PBS_VNODENUM

Let me know of other, innovative methods of using pbsdsh.
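As a further illustration, "myscript" can branch on $PBS_VNODENUM
directly; here is a minimal sketch (the three commands are hypothetical
stand-ins for real work):

#!/bin/bash
# Decide this instance's role from its core number (0 to 7 in the job above)
case "$PBS_VNODENUM" in
  0) ./prepare_input ;;          # hypothetical task for core 0
  7) ./collect_results ;;        # hypothetical task for core 7
  *) ./worker $PBS_VNODENUM ;;   # hypothetical task for the other cores
esac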
Note that there is also the variable $PBS_NODENUM, which takes a unique
number, from 0 upwards, for each different node, so 0 to 3 in the above
example; but observe that in the above context this is not as useful as
$PBS_VNODENUM. There is also the variable $PBS_TASKNUM, which
is incremented before each task on each core is started.
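A quick way to see how these three variables relate on your cluster is
a one-line script, here called "shownums" purely for illustration, run
as pbsdsh $PBS_O_WORKDIR/shownums:

#!/bin/bash
# Each pbsdsh-spawned instance reports its own numbering
echo "host=$(hostname) node=$PBS_NODENUM vnode=$PBS_VNODENUM task=$PBS_TASKNUM"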
Initial environment of a script invoked by pbsdsh
A script invoked by pbsdsh starts in a very basic environment: the
user's $HOME directory is defined and is the current directory, the
LANG variable is set to C, and the PATH is set to the basic /usr/local/bin:/usr/bin:/bin as
defined in a system-wide file pbs_environment. Nothing that would
normally be set up by a system shell profile or user shell profile is
defined, unlike the environment for the main job
script. To be positive about this, you could say that this is
very efficient, particularly if you use pbsdsh repeatedly in your
main job script, as it eliminates unnecessary overheads!
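If you want to check exactly what such a script starts with, you can
have each instance print its environment, for example (assuming env is
installed in /usr/bin on the compute nodes):

pbsdsh /usr/bin/env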
The first thing such a script is likely to need to do, therefore, is to
change directory to $PBS_O_WORKDIR, and to set the PATH to $PBS_O_PATH.
Be careful, because this approach assumes that when you submit the job,
the environment in which you submit it is the one you want when it is
running. Alternatively, it might be sensible for the script to source a file containing all the
environment definitions that your job script requires.
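A minimal prologue along these lines might be (a sketch; the file name
job_env.sh is purely illustrative):

#!/bin/bash
cd "$PBS_O_WORKDIR"         # back to the directory the job was submitted from
export PATH="$PBS_O_PATH"   # restore the PATH from submission time
# ... or instead source a pre-prepared file of definitions:
# . "$PBS_O_WORKDIR/job_env.sh"
.... the real work for this core ....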
Yet another choice is for the pbsdsh command in your main job script to
invoke your script via a shell,
such as sh or bash, with or without the "-l" login-shell
option, so that each instance gets an initialised environment:
pbsdsh bash -l -c '$PBS_O_WORKDIR/myscript'
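Note the trade-off here: with "-l", every instance re-reads your login
start-up files (such as /etc/profile and ~/.bash_profile), which gives a
fully initialised environment but re-introduces the per-instance
overheads mentioned above.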
In detail, the initial environment of a command invoked by pbsdsh has a
fixed set of variables defined. The list of variable names is
the same as for a main job script (see the Torque details page),
except that PBS_NODEFILE is not defined on secondary nodes.
Questions of efficiency when running multi-core jobs
When considering running different processes on different nodes/cores
as part of a multi-core job, be aware that some processes may finish
well before others. Therefore the cores that those processes were using
will be idle until all the
pbsdsh-invoked processes have finished. Your job effectively reserves
all the cores you requested for the total duration of the job, busy or
not. Some inefficiencies are
inevitable in this sort of parallel environment, if the parts running
in parallel are not identical. But this can make the cluster as a whole
inefficient. Your user and group fair-shares are based on core
wall-time occupancy, not on actual processing, so idle cores are still
charged for in fair-share terms, and will count against you and your
group for future jobs. So do not devise jobs to work in a parallel way
when there is little benefit in doing so: if the parts can perfectly
adequately run as multiple single-core jobs, submit them that way instead.
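Where the parallel parts are fully independent, one such alternative
(assuming your Torque installation supports job arrays) is to submit
them with qsub -t, so that each part occupies a core only for as long
as it actually runs. Reusing the myprog/mydata example from above:

qsub -t 0-7 myarrayjob

where "myarrayjob" (a hypothetical name) contains:

#PBS -l nodes=1:ppn=1
cd $PBS_O_WORKDIR
myprog < mydata.$PBS_ARRAYID

$PBS_ARRAYID is the Torque name for the array index; other PBS variants
use a different variable name.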