A Lightweight Linux-image Engine (ALLiE)
ALLiE is a kit of software that helps set up a cluster of PCs or
laptops running Linux, and configure and administer those systems with
common system images and configurations. It allows systems to be
installed and updated with zero down-time. It was written to
accommodate the likelihood of several such systems on one PC or laptop,
each separately and centrally maintainable, though that's not essential!
ALLiE is lightweight in that it requires no constantly-running daemons
of its own, and requires no metafiles to describe the files that are to
be downloaded and kept in synchronisation across the cluster.
Furthermore, it's efficient in that file transfer uses rsync, so
overheads are minimal. It's also lightweight in being simple, with a
short learning curve.
Maybe ALLiE shouldn't be said to be an engine as it has no daemon, but
I've done this by analogy to other more elaborate and powerful
methods like CFEngine and Puppet. And it makes for a nice acronym.
Overview
ALLiE provides the methodology and scripts for installing images on PCs
in a cluster.
This includes the case where a PC or laptop has multiple images
simultaneously, and where those images can be used via the
long-standing method of multi-boot, or (as I do) via a chrooting
invocation like MELEE, or some VM method.
Here an image just means the complete set of files which comprise an
operating system and its set of applications. Whether that image is the
original vanilla tree of files comprising an unmodified system, or is
some customised version, is up to the local administrator.
A first image on a brand-new PC or server can be installed in the
traditional way, with or without kickstart, or can be fetched directly
from the central server holding the image. Further images on that PC
will be fetchable automatically.
Thereafter, for each system image, day-to-day configuration is
performed as an rsync overlay.
By this I mean that there is a centrally-maintained sparse tree of
files which represent the changes to the system from the original
image, and that sparse tree is periodically rsynced by the client
PCs/laptops to overlay any similarly-named files already present. This
is very efficient because it means updates are performed on that small
set of files, rather than the full set of files of the original image
(unlike some other methods using rsync).
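The overlay idea can be sketched locally without rsync. The following is a minimal illustration in Python, not ALLiE's own code: the function name overlay and the two directory arguments are hypothetical, and a plain file copy stands in for the rsync transfer. It shows that only the files present in the sparse tree are examined, and that everything else in the image is left alone.

```python
import filecmp
import os
import shutil
import tempfile


def overlay(sparse_root, target_root):
    """Copy each file in sparse_root over the same-named path in target_root.

    Files in target_root with no counterpart in sparse_root are untouched.
    Returns the sorted relative paths that were actually copied.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(sparse_root):
        rel_dir = os.path.relpath(dirpath, sparse_root)
        for name in filenames:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            src = os.path.join(sparse_root, rel)
            dst = os.path.join(target_root, rel)
            # Skip files that are already identical. rsync makes a similar
            # decision (by size and modification time by default); this
            # sketch compares contents instead.
            if os.path.exists(dst) and filecmp.cmp(src, dst, shallow=False):
                continue
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)
            copied.append(rel)
    return sorted(copied)
```

Running overlay a second time with an unchanged sparse tree copies nothing, which is the property that makes the periodic update cheap.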
Another feature is the use of file tags to designate file versions in
the overlay tree which are destined to become active only on particular
clients. This saves creating a separate overlay tree for such special
cases. For example, the file /etc/hosts.allow might need to be
different on one or two clients while all the other configured files
might be the same. This is optional: you might prefer to have a
separate overlay tree per client type, and/or several overlaid trees
per image.
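The tag syntax is not described here, so the following Python sketch invents one purely for illustration: a tagged file is assumed to be named NAME@@CLASS (for example /etc/hosts.allow@@gateway), and a client is assumed to know the set of classes it belongs to. Both the "@@" separator and the function select_tagged are hypothetical. The sketch shows how a list of fetched files might be reduced to the tagged versions that apply to one client.

```python
TAG_SEPARATOR = "@@"  # hypothetical; ALLiE's real tag syntax is not given here


def select_tagged(fetched_paths, client_classes):
    """Map each target path to the tagged file that should become active.

    fetched_paths: iterable of paths just fetched into the overlay.
    client_classes: set of class names this client belongs to (assumed).
    """
    active = {}
    for path in fetched_paths:
        base, sep, tag = path.rpartition(TAG_SEPARATOR)
        if not sep:
            continue  # untagged file: nothing further to do for it
        if tag in client_classes:
            active[base] = path  # this tagged version applies to this client
    return active
```

For a client in the class gateway, a fetched list containing /etc/hosts.allow@@gateway and /etc/motd@@lab would yield only the mapping for /etc/hosts.allow; a client in neither class would get an empty result and keep the common files.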
An important added-on feature is the use of conf scripts. These scripts are
part of the file trees rsynced from the central server, most likely the
overlay tree, and sit in a particular directory of the client,
/root/conf/. They are run periodically, perhaps every hour, as well as
at boot-time. However, once a particular script runs and exits with a
zero return code, it is considered satisfied and will not be re-run in
the normal course of events. It therefore acts like an assertion that
the client system has performed some action. Uses of such conf
files are many, but a reasonable example is to install packages which
are additional, perhaps conditionally additional, to the original
image. When the package has been installed, the script can exit with a
zero return code, and the script won't need to be invoked again. If a
client is switched off for a while and then switched on, the
methodology ensures that it automatically updates to the latest state,
without intervention.
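The run-until-satisfied behaviour of conf scripts can be sketched as follows. This is an illustration in Python under stated assumptions, not the actual run-conf script: it assumes that a satisfied script is remembered by an empty marker file of the same name in a separate directory, and the function run_conf and its run parameter are hypothetical. Scripts that are not executable are skipped.

```python
import os
import subprocess


def run_conf(conf_dir, done_dir, run=None):
    """Run each pending number-sequenced script; return the names that ran.

    A script that exits with 0 gets a marker in done_dir (an assumed
    mechanism) and is skipped on later calls.
    """
    if run is None:
        run = lambda path: subprocess.call([path])
    os.makedirs(done_dir, exist_ok=True)
    ran = []
    for name in sorted(os.listdir(conf_dir)):
        path = os.path.join(conf_dir, name)
        marker = os.path.join(done_dir, name)
        if not name[:2].isdigit():
            continue  # only number-sequenced scripts such as 00overlay
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue  # not marked executable, so not active
        if os.path.exists(marker):
            continue  # already satisfied on an earlier run
        ran.append(name)
        if run(path) == 0:
            open(marker, "w").close()  # record that the script is satisfied
    return ran
```

Called every hour and at boot, a function like this re-invokes only the scripts that have not yet succeeded, so a client that was switched off simply catches up the next time it runs.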
For a case where a client has multiple images, all active at the same
time, the overlay step and the conf
scripts can run independently within their separate images and so keep
each of those images separately up to date.
Example
At the time of writing I have around 70 PCs and interactive servers
which require the latest packages on a Fedora 18 system. They also have
available an older Fedora 15 system, so that users can move between the
two at their own pace, and SL4, SL5 and SL6 systems (and an Ubuntu
system in one or two cases) for software compatibility with
experimental collaborators. Each PC has those systems installed, in
their own partitions, and each system separately keeps itself up to
date by reference to central server images. When a new system comes
along, it can be added on the central server and then fetched onto each
client remotely, without the user even knowing that this is taking
place. Users can then start making use of that system, or even reboot
to make that system their base system, at a time convenient to
themselves, without any down-time.
Important scripts
The following scripts can be downloaded as part of the overlay area:
/root/conf/run-conf: this script invokes number-sequenced scripts in
the same directory.
/root/conf/00overlay: this script invokes /root/util/rsync-image to
overlay the current image with its overlay file tree. As you can
imagine, this script can therefore bring in further conf scripts as
well as other files. It keeps a list of files that have been fetched
on this occasion (could be empty).
/root/conf/01overact: this script scans the list of files that rsync
has just fetched, and acts on any tag classes that match this server.
/root/util/rsync-image: this general script invokes rsync as a client
to fetch files from a central server. It can be invoked either to load
the initial image of the system, or to fetch an overlay area: the
sparse tree of files which logically overlay that initial image. Since
rsync in -a mode contains all the logic to ensure that it fetches only
files that have been added or changed since the last fetch, this is
fast and efficient.
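As a rough illustration of what an rsync client invocation for an overlay might look like, the Python sketch below only builds the command line and does not run it. The server name, module name and subtree are hypothetical, as is the choice of an rsync:// URL (an ssh-style host:path source would work equally well), and nothing here is taken from the real rsync-image script. The --out-format=%n option makes rsync print the name of each file it transfers, which is one way a list of fetched files could be kept.

```python
def rsync_command(server, module, subtree, dest="/"):
    """Build an rsync client command that pulls one tree onto dest."""
    # The trailing slash on the source means "the contents of this
    # directory", so the tree is laid over dest rather than nested in it.
    source = "rsync://{}/{}/{}/".format(server, module, subtree.strip("/"))
    return [
        "rsync",
        "-a",               # archive mode: recurse, keep perms, times, owners, links
        "--out-format=%n",  # print the name of each transferred file
        source,
        dest,
    ]
```

For example, rsync_command("images.example.org", "allie", "f18/overlay") describes fetching a hypothetical Fedora 18 overlay tree onto the root of the current image; the same function with a different subtree could describe loading an initial image.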
Testing scripts
A /root/conf script can easily be introduced in test mode by putting
it into the overlay image but not marking it as executable.
It can then be run selectively on a chosen test PC to see if it
works satisfactorily.
For example, on the test PC you could run /root/conf/00overlay,
followed by /root/conf/01overact (if necessary),
which is sufficient to fetch the updated files
that you can then test.
Alarms
Some /root/conf scripts might do non-trivial processing and
so ideally are not run often.
An example is a script which checks to see if packages in this image
are up-to-date.
To avoid overhead, such scripts can touch a particular file in /root/done/,
time-stamped for a week hence (for example).
When the script sees that its alarm time has passed, it runs in full.
Otherwise it exits quickly with a non-zero (unsatisfied) return code.
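The alarm mechanism can be sketched in Python as below. This is an illustration only: the function names alarm_passed and set_alarm are hypothetical, and treating a missing alarm file as "already passed" is an assumption. The alarm is simply a file whose modification time is set in the future.

```python
import os
import time

WEEK = 7 * 24 * 60 * 60  # one week, in seconds


def alarm_passed(alarm_file, now=None):
    """True if the alarm file is missing or its timestamp is not in the future."""
    now = time.time() if now is None else now
    try:
        return os.path.getmtime(alarm_file) <= now
    except FileNotFoundError:
        return True  # assumed: no alarm set yet means the script should run


def set_alarm(alarm_file, delay=WEEK, now=None):
    """Create the alarm file if needed and stamp it delay seconds ahead."""
    now = time.time() if now is None else now
    os.makedirs(os.path.dirname(alarm_file) or ".", exist_ok=True)
    with open(alarm_file, "a"):
        pass  # equivalent of touch: create the file without changing content
    os.utime(alarm_file, (now + delay, now + delay))
```

A script using this would first call alarm_passed; if it returns False the script exits at once with a non-zero code, and otherwise it does its full work and then calls set_alarm for the next occasion.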