Athena Access to the SE

You can list files on the SE using either rfdir or dpns-ls:

rfdir /dpm/ph.bham.ac.uk/home/atlas/atlasmcdisk/mc08/
dpns-ls -l /dpm/ph.bham.ac.uk/home/atlas/atlasmcdisk/mc08/

Athena ultimately requires a list of files, so perhaps a script that takes a DQ2 dataset name, finds its location on the SE and then outputs filelist.py would suffice?

Setup (User)

RFIO relies on the library libshift.so, but Castor and DPM require two different versions, with the Castor version being the default. To fix this, run the following in the run directory:

ln -s /home/lcgui/SL4/prod/lcg/lib/libdpm.so libshift.so.2.1
export LD_LIBRARY_PATH=`pwd`:$LD_LIBRARY_PATH

This allows (very slow) access to the SE using RFIO. Important: the LD_LIBRARY_PATH variable must be re-exported after every compile. Users will also require a valid grid certificate.
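Since the variable has to be re-exported repeatedly, a small helper can keep the run directory at the front of LD_LIBRARY_PATH without piling up duplicate entries. This is only a sketch: the function name prepend_ld_path is hypothetical and not part of the original setup.

```shell
# Hypothetical helper (not part of the original setup): put a directory at
# the front of LD_LIBRARY_PATH, dropping any earlier copies of it first.
prepend_ld_path() {
    local dir="$1"
    local entry
    local rest=""
    local old_ifs="$IFS"
    IFS=':'
    set -f   # disable globbing while the path is split on ':'
    for entry in $LD_LIBRARY_PATH; do
        # Skip empty entries and any existing copy of the directory.
        if [ -n "$entry" ] && [ "$entry" != "$dir" ]; then
            rest="${rest:+$rest:}$entry"
        fi
    done
    set +f
    IFS="$old_ifs"
    export LD_LIBRARY_PATH="$dir${rest:+:$rest}"
}

# Usage, from the run directory after each compile:
#   prepend_ld_path "$(pwd)"
```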

Finding Files

The sancho.sh script may be used to locate files on the SE via their DQ2 label. It can be used to output a filelist in two formats - with and without Athena decorations.

Example 1:

source sancho.sh mc08.105502.AcerMC_tchan.recon.AOD.e352_s462_r541           #Prints a list of files, one per line (suitable for ganga)

Example 2:

source sancho.sh -a mc08.105502.AcerMC_tchan.recon.AOD.e352_s462_r541       #Prints a list of files with Athena decorations ("ServiceMgr...." etc)
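The internals of sancho.sh are not reproduced here. As a rough illustration of the plain one-file-per-line format only, the sketch below turns bare file names into full SE paths. The function name make_filelist is hypothetical, and the DQ2 lookup that sancho.sh performs is not attempted.

```shell
# Hypothetical helper, not the actual sancho.sh logic: read bare file names
# on stdin (one per line) and print full SE paths, one per line.
make_filelist() {
    local se_dir="${1%/}"   # strip a trailing slash from the directory
    local name
    while IFS= read -r name; do
        [ -n "$name" ] || continue   # skip blank lines
        printf '%s/%s\n' "$se_dir" "$name"
    done
}

# Usage, where $dir is the directory holding the dataset on the SE:
#   dpns-ls "$dir" | make_filelist "$dir"
```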

Testing

Volume of Data Transferred

The RFIO buffer was set to very low values and 250 ttbar events were read. When reading an AOD, athena appears to read 3 header blocks before starting to process the data. The table below shows the amount of data read in each block for various buffer sizes.

RFIO Buffer Size (bytes)   Block 1   Block 2   Block 3
64                           99600    254650     99600
1024                        108070    266640    108070
4096                        130624    302068    130624

It remains to be seen whether the size of these headers is dataset dependent, although it probably is.

Athena goes on to request more data when event processing starts, the volume being dependent on the number (and presumably type and size) of the StoreGate collections requested. For a skeleton algorithm (no StoreGate access), the 64 byte buffer requests approximately 20 kB per event. Each additional StoreGate collection requested appears to add (very roughly!) 16 kB per event to the total data requested. It's easy to imagine each event requiring at least 100 kB of data.
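The rough figures above can be turned into a back-of-the-envelope estimate. The sketch below simply assumes a linear model (about 20 kB baseline plus about 16 kB per StoreGate collection), which is an extrapolation from the measurements rather than anything athena guarantees. The function name is hypothetical.

```shell
# Rough per-event estimate in kB, assuming a linear model:
# about 20 kB baseline plus about 16 kB per StoreGate collection.
estimate_kb_per_event() {
    local n_collections="$1"
    echo $(( 20 + 16 * $n_collections ))
}

# Usage:
#   estimate_kb_per_event 5    # prints 100
```

Under this model, five collections are already enough to reach 100 kB per event.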

It remains to be seen whether athena requests data event by event or whether it fills a buffer, processes all the complete events and then requests that the buffer be refilled. I'd probably go with the latter.

Variable RFIO Buffer Size

The time taken to iterate over 1000 ttbar events from the sample mc08.105200.T1_McAtNlo_Jimmy.recon.AOD.e357_s462_r541/ for various RFIO buffer sizes is shown below. The test was completed on a desktop machine. The plot shows that whilst the absolute time depends on the number of persistent StoreGate collections retrieved, the overall trend is for smaller buffers to be better unless you can afford to read the entire file.

rfio.png

From a single user's point of view, 4 kB buffers are the best solution. Will this put too much of a drain on the SE though? Presumably this means that once the client has consumed 4 kB, it will immediately request more data. I wonder how this scales as the number of clients increases?

Cluster                     Location
CJC Desktop                 /home/lcgui/local/shift.conf
Twin Farm                   /home/lcgui/local/shift-farm.conf
Other Desktops and eprexa   /home/lcgui/local/shift-desktop.conf
epgse3 WNs                  /egee/soft/dteam/shift.conf
epgce4 WNs                  /egee/soft/local/shift.conf

Job Submission

In terms of user job submission, reading data from the SE is still a little tricky. Firstly, the environment needs to find the correct libshift.so. This can be achieved by sourcing lcguisetup after running the athena setup scripts.

Secondly, the grid proxy certificate must be made available on the worker node. The easiest way to do this is to run voms-proxy-init --voms atlas as normal. This will create a proxy in the /tmp directory. The name and location of the proxy are held in the environment variable $X509_USER_PROXY. The proxy should be copied to the run directory (or similar) and $X509_USER_PROXY updated to point at the copy.
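The copy-and-update step could be wrapped up as below. This is a sketch only: the function name stage_proxy is hypothetical, and it assumes $X509_USER_PROXY already points at the proxy file, as described above.

```shell
# Hypothetical helper: copy the current proxy into a directory and point
# X509_USER_PROXY at the copy. Assumes X509_USER_PROXY is already set.
stage_proxy() {
    local dest_dir="$1"
    if [ -z "$X509_USER_PROXY" ] || [ ! -f "$X509_USER_PROXY" ]; then
        echo "stage_proxy: no proxy file found via X509_USER_PROXY" >&2
        return 1
    fi
    local dest="$dest_dir/$(basename "$X509_USER_PROXY")"
    cp -p "$X509_USER_PROXY" "$dest" || return 1
    chmod 600 "$dest"   # keep the copy readable by the owner only
    export X509_USER_PROXY="$dest"
}

# Usage, after voms-proxy-init --voms atlas:
#   stage_proxy "$(pwd)"
```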

The job may now be submitted. Unfortunately, it is not possible to submit the job using Ganga. When a ganga job runs on a worker node, the athena setup scripts are rerun. Their changes to the $LD_LIBRARY_PATH variable are prepended, meaning that athena once again finds the wrong version of libshift.so. To do: find out from Liverpool what they did to fix the problem.

-- ChristopherCurtis - 29 May 2009

  • sancho.sh: sancho.sh - tool for finding files on the SE and printing filelists
Topic revision: r10 - 02 Jul 2009 - 10:47:00 - ChristopherCurtis
 