June 2016: This page appears to be substantially out-of-date. TLA/Bluestone, the tla/ directory structure, Grid computing, and links to spy-hill.net seem to no longer be a part of the ELabs LIGO e-Lab. LIGOtools is still used. -JG
The LIGO Analysis Tool is now called Bluestone. It is named after the Bluestone gold mine in Alaska. In contrast to hardrock mines, the Bluestone mine is a placer ("plass-er") mine. Placer mining sorts through alluvial deposits, often using hydraulic methods, to find a few flakes or nuggets of gold embedded among all the otherwise uninteresting sand and earth. Think of panning for gold, but scaled up to industrial size. By analogy, LIGO environmental data may be thought of as the alluvial deposits, in which we expect to find little bits of otherwise lost gold. And the power of "grid" computing will allow us to scale up the process to industrial size.
The name Bluestone can also be interpreted as an indirect reference to the Earth itself, which from space looks like a "little blue marble". While LIGO is primarily aimed at collecting data from sources out in space, away from Earth, the Bluestone tool was initially developed to look at data from sensors which monitor the Earth itself.
The preliminary name of the LIGO analysis tool was "TLA", which was sometimes fleshed out in military form as "Tool, LIGO Analysis" just for fun.
Actually, the term "TLA" was just an amusing place saver, based on the observation that so much of the software developed for LIGO is identified by Three Letter Acronyms (TLA's). The shorthand abbreviation of TLA to represent the LIGO analysis tool will likely remain, since TLA is also the airport identifier for the gravel airstrip near Teller Station, Alaska, which is near the Bluestone mine.
Bluestone code is maintained in CVS and can be obtained from
cvs -d :pserver:firstname.lastname@example.org:/usr/local/cvsroot/i2u2 co ligo/tla/
Released versions are also checked in to the I2U2 SVN. A listing of which version of Bluestone is installed on a given machine may be found on the Bluestone Versions page. Instructions for creating a new release branch are on the page Bluestone Update.
Changes are also often summarized on the Logbook/Discussion site, either in the Bulletin Board for public information, or in The Boiler Room for things intended only for developers.
The LIGO analysis tool consists of several different components brought together with a web interface to make it easy for students to run an analysis from a browser. The components include:
- ROOT - from CERN (http://root.cern.ch)
- Frame Libraries - consisting of FrameCPP (C++ code) from LIGO's LDAS, and Fr (C code) from Virgo. LIGO data are recorded in IGWD frame files, and these routines provide methods for accessing and manipulating data in Frame format.
- Dynamic libraries for DMT container classes and functions - from LIGO (part of GDS)
- DMT macros for ROOT - from LIGO (part of GDS)
- Web interface scripts (TLA) written in PHP - from Eric (via his CVS - see below)
Instructions for building and installing ROOT, GDS, and Bluestone from source code will appear in a separate page, BuildingBluestone.
The rest of this section is background on different ways we might build and distribute, or distribute and build, the software.
There are two possible ways we might get the LIGO software working: either install pre-made versions with the LIGO package management system, or build from source. Eric investigated both, and found that building from source was the quickest way to get something to work. Here are the main considerations:
- LIGOtools: LIGO has a package management tool called LIGOtools which can easily install specific packages on a Linux computer. It is similar in spirit to rpm or dpkg. The packages needed to run an analysis might be installed easily with LIGOtools, both on the "local" server and on remote Grid nodes. LIGOtools has its own version of ROOT (v3.02.07a, really rather old) while the LIGO control rooms and Eric's working versions of the Analysis Tool use ROOT v4.04 (also old by now). Since the LIGOtools approach is most likely to transfer easily to Grid nodes, Eric first investigated getting the Analysis Tool working at Argonne using that approach. However, he found that using just LIGOtools does not seem to work, because the versions of the software currently available via LIGOtools are too old. Update: As of Dec 2008 there seems to have been a lot of work on LIGO software during the shutdown before S6, and so there may well be newer versions of LIGOtools packages for our purposes.
- GDS: Build from source: While the LIGO software has historically been developed from multiple sources, it is now bundled together into one large system called the Global Diagnostic System (GDS). This includes the FrameCPP library, the Data Monitoring Tools (DMT), and the Diagnostic and Testing Tools (DTT). Eric has built GDS on his test server and has notes on how to do so at http://www.spy-hill.net/~myers/help/ligo/GDS-build.html. He's been working through all the dependencies for built tools on www13.
Once GDS builds on www13 there are likely two different ways we might try to deploy the software on a remote Grid node:
- Build and install the whole GDS once on the remote node. (What if an unnecessary part fails to build?)
- Build GDS on a similar machine and package up just those parts that we need to run the Analysis Tool. (And which parts are those?) This package could then be distributed to the Grid nodes either via LIGOtools or some other mechanism.
Looking over the working versions at Hanford and Spy Hill, it seems that they are based on a mixture of LIGOtools for the DMT macros and GDS for the dynamic libraries. Eric has worked through separating the two cleanly so that the tool can be run just by using GDS, without depending on LIGOtools at all.
- New LIGOtools packages: Eric has recently found a way to combine the two paths above. He is packaging up new (or newer) versions of the DMT macros and GDS libraries into tarballs which are valid LIGOtools packages, which can then be easily installed on any remote machine (Argonne server machine or remote Grid node).
In case anybody is interested, LIGO analysis software development is coordinated via DASWG, the Data Analysis Software Working Group. Their web page is at http://www.lsc-group.phys.uwm.edu/daswg/
You will find there links for DMT, LDR, and LIGOtools.
Here are the directory structures needed for the Analysis Tool and LIGO data. The locations of some of these directories changed between v0.35 and v0.40, so be cautious.
The reasons for having an html subdirectory are that it allows us to segregate code by major function, and it makes coexistence with the BOINC forum and MediaWiki glossary code easier to manage, because each major component is in a separate subdirectory. These will be detailed elsewhere.
tla/root - ROOT and shell scripts to run an analysis. Doesn't have to be served by httpd.
tla/bin - supporting scripts to set up data or build RDS, or to monitor data flow. These are a mix of PHP, Perl and Bourne shell scripts. They do not need to be served by the web server. They have nothing to do with the web interface.
tla/html/tla contains the PHP pages which drive the Analysis Tool. The web server needs to be configured to serve the Analysis Tool from this directory, and the web server needs to be configured to use PHP for server side scripting. Under this we have some minor subdirectories:
tla/html/tla/auth - supports HTTP Basic Authentication to the site (for "guest" logins)
tla/html/tla/img - image files
tla/html/include - supporting functions to interface with BOINC web software
(Right now the Bluestone code is more or less all in one directory, but the files themselves are (or should be) separated into two major types. One set of files deals with presentation: output of the web pages. The other set of files provides supporting functions, but does not actually construct individual pages. In the future we might move the second set of files to a separate directory (which is not served by the web server). This is how the BOINC code is segregated, but there was no need for it when Bluestone was first started, and we'd only want to do this once the code grows to where this would be important for improving support and further development.)
tla/html/tla/slot - a working directory for analyses. Individual unique subdirectories are created for each web session. They need to be cleaned up after a few days of last use (I have a cron job which does this). The directories under the slot directory have to be servable by the web server, since each contains a user's interactive session. The main html/tla/slot directory has to be writeable by the webserver (group 'apache' on Red Hat/Fedora, group 'www-data' on Debian/Ubuntu) so that the web server can create the slot subdirectories. Those will then be owned by the web server, and you won't be able to see inside them unless your user account is in the same group, or you are root.
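The cleanup of stale slot directories could be sketched as a small shell function suitable for calling from cron. This is an illustration only, not the actual cron job mentioned above: the function name clean_slots, the three-day default, and the use of each directory's modification time as a stand-in for "last use" are all assumptions. It also assumes a find that supports -mindepth and -maxdepth (GNU or BSD).

```shell
#!/bin/sh
# Hypothetical helper: remove slot subdirectories whose modification time is
# more than MAX_DAYS days old. Directory mtime is only a rough proxy for
# "last use", and find rounds ages down to whole days.
clean_slots() {
    slot_dir=$1
    max_days=${2:-3}          # assumed default for "a few days"
    [ -d "$slot_dir" ] || return 1
    # Look only at the immediate children of the slot directory.
    find "$slot_dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$max_days" \
        -exec rm -rf {} +
}
```

The function would need to run as a user allowed to delete directories owned by the web server (the web server user itself, or root).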
TLA_ROOT_DIR is where our own scripts and macros live (this is tla/root). This is not ROOTSYS, nor even a subdirectory of it. When used with ROOT, it should come before ROOTSYS in any macro paths (not that ROOT seems to use paths to load macros). We can put our own versions of scripts and macros here and they will be found first in the path and thus override the default versions from ROOTSYS or from LIGOTOOLS. (Earlier versions of the Analysis Tool code called this simply
LIGOTOOLS is where we install packages from LIGOtools. Under this, $LIGOTOOLS/packages/dmtroot/active/macros/ is where the dmtroot macros are kept. (Eric is trying to change this to use the macros from a GDS build rather than from LIGOtools.)
ROOTSYS is where the ROOT executable lives, along with ROOT macros from CERN. This is the "system" version of ROOT, unmodified from the CERN distribution. It could be the very old version distributed with LIGOtools, or more likely a separate install of ROOT v4.04. Initial attempts to use ROOT v5 were unsuccessful.
DMT_ROOT_MACROS is the directory containing the ROOT macros loaded by your analysis script. It currently points to $LIGOTOOLS/packages/dmtroot/active/macros/ but in the future it can point somewhere else without needing LIGOtools. Early versions of the analysis tool called this simply
DMT_ROOT_LIBS is where the dynamic ROOT libraries are kept. This is currently something like /opt/lscsoft/gds/lib, but could also be a list of paths to rootlib subdirectories under $LIGOTOOLS.
DATA_DIR is the top level directory for our Reduced Data Set (see LIGO Storage Requirements). On tekoa this is simply the /data partition. At Argonne it is /disks1/myers/data. The LIGO data are in a subdirectory of this called ligo, which also allows for non-LIGO data in the future. For example, NOAA buoy data would be kept in $DATA_DIR/noaa/historical/stdmet/ to preserve the lower level directory structure set up by NOAA.
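As an illustration of this layout, the following sketch creates the ligo and noaa branches under a given top-level directory. The function name make_data_layout is hypothetical, and the function only creates empty directories; it does not fetch or arrange any data.

```shell
#!/bin/sh
# Hypothetical helper: create the top-level data layout described above.
make_data_layout() {
    data_dir=$1
    # LIGO data get their own subdirectory so other sources can sit alongside.
    mkdir -p "$data_dir/ligo" || return 1
    # NOAA buoy data keep the lower-level structure set up by NOAA.
    mkdir -p "$data_dir/noaa/historical/stdmet" || return 1
}
```

It would be called with the value of DATA_DIR for the machine in question, for example make_data_layout "$DATA_DIR".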
LD_LIBRARY_PATH is the search path ROOT uses to load dynamic (shared object) libraries. It uses this as a load path even if you specify a full path in your ROOT script (this is likely for security reasons, but it's confusing if you don't know about it). So we simply insert
LD_LIBRARY_PATH. If, however, we get the shared libraries from a LIGOtools installation then the libraries from separate packages are in separate subdirectories under
$LIGOTOOLS, and each of these needs to be included in
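A minimal sketch of how some of these variables might be set in a shell startup file follows. It assumes that the directory (or colon-separated list of directories) named by DMT_ROOT_LIBS is what gets added to LD_LIBRARY_PATH. The helper name prepend_path and the /path/to/... locations are placeholders, not values from any real installation.

```shell
#!/bin/sh
# Hypothetical helper: prepend a directory (or list) to a colon-separated
# path, avoiding an empty element when the existing path is empty. The
# dynamic loader treats an empty element as the current directory.
prepend_path() {
    new=$1
    old=$2
    if [ -n "$old" ]; then
        printf '%s:%s\n' "$new" "$old"
    else
        printf '%s\n' "$new"
    fi
}

# Placeholder locations; substitute the real ones for a given server.
TLA_ROOT_DIR=${TLA_ROOT_DIR:-/path/to/tla/root}
ROOTSYS=${ROOTSYS:-/path/to/root}
DMT_ROOT_LIBS=${DMT_ROOT_LIBS:-/opt/lscsoft/gds/lib}

# DMT_ROOT_LIBS may itself be a colon-separated list of directories.
LD_LIBRARY_PATH=$(prepend_path "$DMT_ROOT_LIBS" "${LD_LIBRARY_PATH:-}")
export TLA_ROOT_DIR ROOTSYS DMT_ROOT_LIBS LD_LIBRARY_PATH
```

With a LIGOtools installation, DMT_ROOT_LIBS would instead hold one entry per package library subdirectory, joined with colons, before being passed to prepend_path.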
Code in CVS
- The code for Bluestone is available via CVS from Spy-Hill:
cvs -d :pserver:email@example.com:/usr/local/cvsroot/i2u2 checkout ligo/tla
- The HEAD version may or may not be unstable. It is the "development" release (or pre-release). We also have a "test" release, which is stable but may have bugs, and a "production" release, which has been tested and is the most stable. We always want at least one instance of the stable "production" version running somewhere for teachers to use with no reservations.
- The Bluestone code base is now also being checked into the I2U2 SVN repository so that it can be more easily deployed and managed as a part of the collection of I2U2 tools. In general, Eric works on Bluestone and checks in changes to his CVS, then tags and releases a new version when it is ready, then checks that new release into the SVN. As a result, the version in SVN should be stable and usable for I2U2. Unless you are working on Bluestone development, that should be enough. If you must have the latest development version, then use the HEAD from CVS.
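The difference between checking out HEAD and checking out a tagged release can be sketched with cvs's standard -r option. The function below only prints the command rather than running it. Its name, and any tag passed to it, are hypothetical, since the tagging scheme is not described here.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the cvs command for a given CVS root
# and optional release tag. No tag means the HEAD (development) version.
bluestone_checkout_cmd() {
    cvsroot=$1
    tag=${2:-}
    if [ -n "$tag" ]; then
        printf 'cvs -d %s checkout -r %s ligo/tla\n' "$cvsroot" "$tag"
    else
        printf 'cvs -d %s checkout ligo/tla\n' "$cvsroot"
    fi
}
```

The first argument would be the CVS root given above. Piping the printed line to sh would perform the checkout.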