GridFabricManagement (07 Feb 2011, /C=UK/O=eScience/OU=Birmingham/L=ParticlePhysics/CN=christopher curtis)
---+ Fabric Management

A summary of the Birmingham Grid Fabric Management.

%TOC%

---++ Introduction

Most grid sites use some form of fabric management software to install, configure and monitor their nodes and services. The Birmingham nodes are managed with cfengine. Other sites use Quattor or Puppet.

---++ Services not included

The UI installations and BlueBEAR Worker Nodes are not managed by cfengine. They are tarball installations and must be installed manually. Instructions for managing these installations are available here:

   * [[https://www.ep.ph.bham.ac.uk/twiki/bin/view/Computing/LocalGridCookbook#BlueBEAR_WN][BlueBEAR WN]]
   * [[https://www.ep.ph.bham.ac.uk/twiki/bin/view/Computing/LocalGridCookbook#UI][Local UI]]

---++ Structure

The cfengine head node is =epgmo1.ph.bham.ac.uk=. This node keeps a master record of all the relevant configuration files, which are then copied out to every node under cfengine control. epgmo1 also maintains a repository of binaries and scripts that are copied out to the relevant nodes.

The Birmingham configuration maps physical machine names onto roles within cfengine (e.g. =epgse1.ph.bham.ac.uk= is currently mapped onto the =dpm_head_node= role). This simplifies redeploying a machine. For example, if the Site BDII services are moved onto a new machine, only one line in the cfengine configuration files should need to be edited.

---+++ Node Initialisation

When a node is installed via kickstart, cfengine should also be installed automatically. The kickstart script attempts to download the files =cfservd.conf= and =update.conf= from the web server running on =epgmo1.ph.bham.ac.uk/= using wget. These files are stored locally on epgmo1 in =/var/www/html/config=, and configure the local installation of cfengine to recognise epgmo1 as the master node.
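
The download step described above could look roughly like the following kickstart =%post= fragment. This is a minimal sketch, not the site's actual script: the =/config= URL path (inferred from the =/var/www/html/config= directory), the client destination =/var/cfengine/inputs=, and the function name =fetch_cfengine_bootstrap= are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch of a kickstart %post step that fetches the cfengine bootstrap files.
# Assumptions (not taken from the site's real kickstart script):
#   - the web server exposes /var/www/html/config as http://<master>/config
#   - the client keeps its cfengine inputs in /var/cfengine/inputs
MASTER_URL="${MASTER_URL:-http://epgmo1.ph.bham.ac.uk/config}"
INPUTS_DIR="${INPUTS_DIR:-/var/cfengine/inputs}"

# fetch_cfengine_bootstrap BASE_URL DEST_DIR  (hypothetical helper)
# Downloads cfservd.conf and update.conf; stops at the first failure.
fetch_cfengine_bootstrap() {
    base_url=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    for conf in cfservd.conf update.conf; do
        if ! wget -q -O "$dest_dir/$conf" "$base_url/$conf"; then
            echo "failed to fetch $conf from $base_url" >&2
            return 1
        fi
    done
}

# In a real %post section the function would be called like this:
# fetch_cfengine_bootstrap "$MASTER_URL" "$INPUTS_DIR"
```

Keeping the URL and destination in overridable variables means the fragment can be tried against a local web server and a scratch directory before it is used in a real install.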
It should then be possible to configure the new node from epgmo1 using the =cfrun= command. More information about kickstarting and configuring nodes can be found in the LocalGridCookbook.

---+++ Configuration files

The main configuration file is =cfagent.conf=, found in =epgmo1:/var/cfengine/inputs=. This file currently defines the main =actionsequence= and the mapping of physical machine names to cfengine roles.

The first file to be imported is =epgmo1:/var/cfengine/inputs/imports/classes.conf=. This file defines a set of cfengine classes, which are used to steer the other configuration files. For example, it defines the =glexec_wn= class, consisting of all nodes that are marked as worker nodes. It also contains a list of commands, such as =restart_maui=, that can be used to execute particular actions on specific nodes.

The second file to be imported is =epgmo1:/var/cfengine/inputs/imports/global.conf=. This file defines actions that are applied to all nodes (such as copying iptables rules out to nodes, or editing =/etc/hosts.allow= to enable ssh connections from the local system).

Finally, there are a number of role-specific configuration files that are used to install and configure specific gLite services. For example, configuration details specific to the APEL node are defined in =epgmo1:/var/cfengine/inputs/imports/apel.conf=. There is more information about these specific roles below.

---+++ Modules

There are a number of module files (written mainly in bash) located in =epgmo1:/var/cfengine/inputs/modules=. These scripts generally execute actions that are too cumbersome to complete in cfengine itself. For example, the firewall is managed from the =iptables= module, and system backups are managed from the =backup= module.

---+++ File Repository

Some nodes require extra binaries and large files that are not always available over the net or are not easily created from within cfengine.
In these cases, cfengine copies the relevant files onto the relevant nodes from a repository available in =epgmo1:/var/cfengine/inputs/repo=. There is a directory in the repo folder for each role. Some roles have subdirectories, e.g. =alice_vobox/= contains subdirectories for the =bb_alice_vobox= and the =twin_alice_vobox=. There is also a =repo/general= directory for files that need to be copied onto all nodes (ganglia config files, for example).

---++ Known Limitations and Problems

-- Main.ChristopherCurtis - 08 Oct 2009
Topic revision: r3 - 07 Feb 2011 - /C=UK/O=eScience/OU=Birmingham/L=ParticlePhysics/CN=christopher curtis