BlueBEAR Cluster PP Guide

Filename bluebear. Last update 20081104. Part of Guide to the Local System.

BlueBEAR

The BlueBEAR cluster has been available to users since October/November 2007. For details, see IT services documentation.

The system uses Torque and MOAB as the job manager and scheduler respectively. It uses the IBM GPFS system to provide the shared user filesystem. GPFS stands for General Parallel File System.

The system version

As of October 2008, the operating system on BlueBEAR is Scientific Linux release 5 (SL5.2), but it also has a Scientific Linux 4 (SL4.7) tree, which can be used to run SL4 programs. Both systems are in principle 64-bit (x86_64 architecture), but both are intended to have 32-bit as well as 64-bit libraries for almost all packages.

Using SL4 in a login session

For a login session in SL4, after logging in type /bin4/bash. To drop back to the SL5 environment, type exit.
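
For example, a typical interactive sequence looks like this (purely illustrative):

         /bin4/bash        # start an SL4 sub-shell within the SL5 login session
         ...               # run SL4 commands and programs here
         exit              # return to the original SL5 shell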

To run just one particular script in SL4, you can do this without going into an SL4 login session. Let's say the first line of the script is #!/bin/sh. You can either enter:

         /bin4/sh myscript    any args
or put #!/bin4/sh as the first line of that script, and then run it simply by:
         myscript    any args
The following are available at the time of writing: /bin4/sh, /bin4/bash, /bin4/ksh, /bin4/zsh, /bin4/csh, /bin4/tcsh.

If you choose to make this /bin4/ modification to a script, note that the /bin4/ binaries also exist inside the SL4 environment, where they are simply soft-links to the /bin/ binaries and so behave consistently. There is therefore no need to keep a second copy of the script with the conventional invocation as its first line.

Scripts which are invoked by other scripts already running in an SL4 environment run in that same SL4 environment. That applies both in an interactive session and in a job, so there is no need to invoke them in a special way or to go to the trouble of modifying them.

Using SL4 in a submitted job

You can submit a job to run in SL4 from within either an SL4 or an SL5 session. The method is exactly the same in both cases: jobs do NOT inherit the operating system of the environment from which they are submitted (that is, the environment doing the qsub).

To run a job totally within SL4, submit it with the qsub option -S /bin4/bash. That is, either use that option on the qsub command line, or put this in the submitted job script:

         #PBS -S /bin4/bash
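
For example, a complete SL4 job script might look like the following sketch, where the walltime value and the program name myprogram are only placeholders, and any other resource requests your job needs are added as usual:
         #PBS -S /bin4/bash
         #PBS -l walltime=1:00:00
         # everything from here on runs in the SL4 environment
         cd $PBS_O_WORKDIR
         ./myprogram
The equivalent on the qsub command line is the option -S /bin4/bash followed by the name of your job script.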

Alternatively, you can arrange for only the script (and not the initialisation of the job) to run under SL4, using one of the techniques described above for login sessions: use a /bin4/ binary either to invoke the script, or as the first line of the script, possibly the job-script itself.

Knowing what system you're on

So that you always know which release you are on, you can add the following to your $HOME/.bashrc; but remember that in a job this will only give useful information if you use the -S method to choose SL4:
         release=$(lsb_release -r -s)
         echo You are running on release $release >&2
         PS1=$release-$PS1

(The redirection on the echo is important, as always inside a .bashrc, if you want scp and sftp to continue to work).
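
You can make a similar check from inside a job script itself, whichever way the job was submitted. For instance (just a sketch), the following line reports the release of the tree the script is actually running in, together with the kernel version, which is always the SL5 kernel as explained below:

         echo "Release $(lsb_release -r -s), kernel $(uname -r)" >&2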

Differences from the BlueBEAR documentation

Notice that the methods described here are slightly different to the method in the BlueBEAR documentation, which uses the binary /bin/bashsl4. That binary has some deficiencies:
  • it doesn't allow through the PBS variables like PBS_JOBID and PBS_O_WORKDIR,
  • it doesn't accept arguments,
  • it can't be used in a script #!interpreter line,
  • it doesn't preserve the current directory,
  • it doesn't preserve any changed umask.

So by preference, use the /bin4/ binaries. For any problems with those binaries, see me, rather than eHelpdesk.

How does it work?

The SL4 system on the login and worker nodes is a fairly complete installation, just as on a native SL4 system.

Additional filesystems which you would normally expect to see in a native system (like the filesystem that contains your $HOME directory, and the /tmp filesystem) are specially mounted inside the SL4 image too.

When you enter one of the special SL4 commands like /bin4/sh, the command changes the filesystem root to the root of the SL4 installation, changes the current directory to the corresponding directory in the SL4 image, and then invokes the command of the same name in that image. So /bin4/bash invokes bash in the changed-root system. Because your $HOME and other files are mounted in the SL4 image as well, you can continue to see them.

(This is a bit different to the normal use of the chroot facility, which is usually employed to give less access to system facilities, rather than similar access to a different system.)
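
Conceptually, each /bin4/ wrapper behaves something like the sketch below. This is only an illustration: the mount point /sl4 is hypothetical, and the real wrappers must be privileged programs, since ordinary users cannot call chroot directly.

         #!/bin/sh
         # conceptual sketch only, not the real implementation
         SL4ROOT=/sl4      # hypothetical mount point of the SL4 image
         # change root into the SL4 tree, re-enter the same working
         # directory there, and run the SL4 bash with the original arguments
         exec chroot "$SL4ROOT" /bin/bash -c 'cd "$0" && exec /bin/bash "$@"' "$PWD" "$@"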

Your programs will therefore use the scripts, run-time libraries and other system files of the SL4 installation, rather than of the native installation: they are operating in a virtual application environment. Those libraries may call a system kernel interface to provide certain facilities, and in this case that kernel is the SL5 kernel, not an SL4 kernel. However, the assumption is that the kernel facilities are backward-compatible between SL5 and SL4.

As an analogy, consider a fully-resolved (static) program. If a program is compiled on one system with all library references resolved, so that it needs no run-time libraries, and you then copy it to a second, different system, you would expect it to work flawlessly, provided the second system is later than but backwards-compatible with the first. It would be a very brave or foolhardy vendor or kernel developer who altered the kernel interfaces in a way that caused such a static program to fail. In our case we are not fully resolving the library references; instead we are providing all the run-time libraries of the first system too. We rely equally on the backward-compatibility of the kernel interfaces, and so for the same reason we expect this method to work flawlessly too.

Why implement it that way?

Why implement SL4 as an image inside SL5, when the ClusterVision OS allows for different workers having different images?

Well, having worker nodes which run SL4 natively was certainly an option, and is still an option if for any reason the current technique fails to live up to expectations.

One thing in favour of the chosen method is that we don't need to consider the numbers game, of how many SL4 nodes and SL5 nodes. A worker node can run a mixture of SL4 and SL5 processes, rather than being dedicated to one or the other. Another benefit is running with a single kernel version, which the vendor believed would improve GPFS stability.

Another scenario could be to run the different systems virtually, using Xen or another virtualisation technique. However, that would involve GPFS running on different kernel versions, with two instances per worker node, and the cluster vendor was not in favour of that approach, having already declared a preference for having one kernel version for GPFS throughout the cluster.

Future of SL4 on BlueBEAR

It was officially confirmed at the Bear User Forum of October 2008 that SL4 would continue to be available on BlueBEAR as long as it was required.

Of course, having the two systems readily available makes it easy to test a future migration to SL5, which will happen in any case. SL5 is a mature system, based on Red Hat Enterprise Linux (RHEL) 5, which was first available in March 2007.

There is a range of dates for the likely release of RHEL6, the earliest being March 2009.

L.S.Lowe.