Resource for running ALICE MC production and analysis on BlueBEAR

Getting Started

Register on BlueBEAR via the web interface page which can be reached here.

Log in using ssh with your usual University (ADF) username, e.g. abc123 for students or smitha for staff.

Login is only available from machines with bham.ac.uk addresses, e.g. eprexa.

Use the password notified to you. It can be changed with the standard passwd command.

Send a Helpdesk request, type 'Request', Subject area 'BEAR', sub-area 'Other', asking to be added to the "p-alice" group. This will allow you access to the shared software (and eventually we hope storage) area.

When you first log in you will have a home area /bb/phys/abc123 and, hopefully, a directory in the not-backed-up space /nbu/phys/abc123. If you do not have the latter, send another Helpdesk request as above but with sub-area 'Storage', asking for your not-backed-up space to be created because you anticipate exceeding the usual backed-up quota (currently 50 GB). As the name implies, this area is not backed up, so never store code, macros, scripts etc. there, only data which you could re-create by re-running.

Finally, change the permissions on your home directory so that it is group readable, using the usual chmod command, e.g. ' chmod g+r ~ ' and ' chmod g+x ~ '. This will enable us to help each other effectively, e.g. by copying examples, so do it now and there will be no delays in the future.

Some General Hints

These are mainly extracted from the documentation.

Don't run jobs on the login node(s). They will be de-prioritised and eventually killed by the system. Use the login nodes only for light tasks such as submitting jobs, editing code etc. Other interactive tasks can be accomplished by submitting an interactive batch job with ' qsub -I ' (the argument is an upper case i and not a lower case 'ell'). This gives you a shell on a worker node.

You can check your disk usage against your quota with the bbquota command.

[Please add your own hints …]

ALICE Software setup

The ALICE software area is /apps/hep/alice

The currently installed ALICE releases are listed below. If you add another release then please document it here.

ROOT version    GEANT version   AliRoot                 Date         By
root-v5-24-00   geant-v1-11     AliRoot-v4-17-Release   2009-09-04   Plamen

You can use a release by sourcing the script that sets up its environment, e.g.

source /apps/hep/alice/SetEnv/SetEnvAliRoot-v4-17-Release
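If several releases are installed, a small wrapper can make the choice explicit and catch a mistyped release name. The sketch below is only an illustration: the function name use_aliroot and the variable ALICE_SETENV_DIR are hypothetical, and it assumes that every environment script is named SetEnv followed by the AliRoot release name (as the one above is) and that it sets ALICE_ROOT.

```shell
# Directory holding the SetEnv scripts (path taken from this page; override if needed).
ALICE_SETENV_DIR="${ALICE_SETENV_DIR:-/apps/hep/alice/SetEnv}"

# Hypothetical helper: source the environment script for the named release.
# Assumes scripts are called SetEnv<release>, e.g. SetEnvAliRoot-v4-17-Release.
use_aliroot() {
    script="$ALICE_SETENV_DIR/SetEnv$1"
    if [ ! -r "$script" ]; then
        echo "use_aliroot: cannot read $script" >&2
        return 1
    fi
    . "$script" || return 1
    # Assumption: a working environment script defines ALICE_ROOT.
    if [ -z "${ALICE_ROOT:-}" ]; then
        echo "use_aliroot: ALICE_ROOT is still unset after sourcing $script" >&2
        return 1
    fi
    echo "Using AliRoot from $ALICE_ROOT"
}
```

With this defined, ' use_aliroot AliRoot-v4-17-Release ' should have the same effect as the source command above.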

Currently we are compiling using SL4 and therefore the SetEnv script starts with #!/bin4/bash so that we run using SL4 too.

There is an issue which currently prevents compilation 'out of the box'. The line:

cout << "aliroot " << ALIROOT_SVN_REVISION << " " << ALIROOT_SVN_BRANCH << endl;

must be commented out of the file $ALICE_ROOT/ALIROOT/aliroot.cxx

The explanation is that ALIROOT_SVN_REVISION and ALIROOT_SVN_BRANCH should be substituted into the code during the make process, but they become empty strings if svn is not available, and the code then cannot compile (because it contains << << with nothing in between). Solutions would be for the AliRoot makefile to be fixed and/or for svn to be made available throughout BlueBEAR (and in SL4).
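Until one of those fixes is in place, the workaround of commenting out the line can be scripted. This is a minimal sketch and not part of the AliRoot build: the function name comment_out_svn_line is hypothetical, and the pattern assumes the line starts with cout (after any indentation) and mentions ALIROOT_SVN_REVISION, as in the line quoted above.

```shell
# Hypothetical helper: comment out the version-printing line in the given file.
# A copy of the untouched file is kept next to it as <file>.orig.
comment_out_svn_line() {
    file="$1"
    # Keep the first backup only, so re-running always starts from the original.
    [ -e "$file.orig" ] || cp "$file" "$file.orig" || return 1
    # Put "// " in front of any cout line that mentions ALIROOT_SVN_REVISION,
    # preserving its indentation; all other lines pass through unchanged.
    sed 's|^\([[:space:]]*\)\(cout.*ALIROOT_SVN_REVISION.*\)$|\1// \2|' \
        "$file.orig" > "$file"
}
```

On BlueBEAR this would be called as ' comment_out_svn_line "$ALICE_ROOT/ALIROOT/aliroot.cxx" ' before compiling, given write access to the file.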

Running MC production

In initial testing very good throughput was achieved: 50,000 events were produced (generated, simulated and reconstructed) in less than 16 hours. This is because a large number of nodes can be employed in parallel. Longer runs are more efficient, as more nodes usually become available over time.

To do this two things are needed. First, a script which can be given to qsub (the batch submission command). This script needs to: source the AliRoot environment; copy the necessary .C and .sh files from $ALICE_ROOT/test/fpprod to a working directory; run the .sh file; and finally copy the necessary .root and .log files back to some data area. Second, a small script that runs a loop submitting N jobs, where N is the number of events needed divided by the number of events per job. Examples of each are /bb/phy/barnbyl/prodtest/runfptest.sh and /bb/phy/barnbyl/prodtest/submit.pl
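The submission loop can be sketched in a few lines of shell. This is not the content of submit.pl (which is a Perl script); it only illustrates the arithmetic and the loop. The function names and the SUBMIT_CMD variable are hypothetical, and the job count is rounded up so that a partial last job is still submitted.

```shell
# Number of jobs needed: total events divided by events per job, rounded up.
jobs_needed() {
    total_events="$1"
    events_per_job="$2"
    echo $(( (total_events + events_per_job - 1) / events_per_job ))
}

# Submit the same job script n_jobs times.
# SUBMIT_CMD (hypothetical) defaults to qsub; set it to "echo qsub" for a
# dry run that prints the commands instead of submitting anything.
submit_jobs() {
    n_jobs="$1"
    job_script="$2"
    i=1
    while [ "$i" -le "$n_jobs" ]; do
        ${SUBMIT_CMD:-qsub} "$job_script" || return 1
        i=$(( i + 1 ))
    done
}
```

For example, with an illustrative 100 events per job, ' jobs_needed 50000 100 ' gives 500, and ' SUBMIT_CMD="echo qsub" submit_jobs 500 runfptest.sh ' prints the 500 qsub commands without running them.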

This scheme has a couple of advantages:

  • Each node is only opening files on its local disk for the duration of the jobs (~90 minutes) and copying them to the filestore at the end (a few seconds).
  • Any files not required such as *.Digits.root, *.Hits.root and raw.root can be deleted before the copy which saves a factor of 4 in disk space.

This is only a basic scheme and other possibilities can be added: use modified macros copied from your home area; use a specially configured AliRoot with e.g. PYTHIA 8; copy the data at the end of the job directly to a different machine, bypassing the BlueBEAR filestore.
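The steps of the job script itself can be sketched as a shell function. This is an illustration of the scheme described above, not a copy of runfptest.sh: the function name and its three arguments are hypothetical, and it assumes the environment script sets ALICE_ROOT and that the copied .sh file can be run with bash.

```shell
# Sketch of the work done by one production job. Arguments (hypothetical):
#   $1  environment script, e.g. /apps/hep/alice/SetEnv/SetEnvAliRoot-v4-17-Release
#   $2  working directory on the worker node's local disk (absolute path)
#   $3  output directory in the filestore (absolute path)
# The body runs in a subshell, so the caller's directory and environment
# are left untouched.
run_production_job() (
    . "$1" || exit 1
    work_dir="$2"
    out_dir="$3"
    mkdir -p "$work_dir" "$out_dir" || exit 1
    cd "$work_dir" || exit 1

    # Copy the production macros and the driver script into the working directory.
    cp "$ALICE_ROOT"/test/fpprod/*.C "$ALICE_ROOT"/test/fpprod/*.sh . || exit 1

    # Run the copied .sh file (written as a loop in case there is more than one).
    for script in ./*.sh; do
        bash "$script" || exit 1
    done

    # Delete the large intermediate files before anything is copied back.
    rm -f ./*.Digits.root ./*.Hits.root ./raw.root

    # Copy the remaining .root and .log files to the output directory.
    for f in ./*.root ./*.log; do
        if [ -e "$f" ]; then
            cp "$f" "$out_dir"/ || exit 1
        fi
    done
)
```

A script handed to qsub would define this function and then call it once. Each job needs its own working and output directories; assuming a PBS/Torque-style scheduler, the job identifier in $PBS_JOBID is one way to make the directory names unique.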

-- LeeBarnby - 19 Oct 2009

Topic revision: r4 - 20 Oct 2009 - 15:56:55 - LeeBarnby
 