The e-Labs website and e-Labs are served from three SuperMicro servers located at Notre Dame's Center for Research Computing (CRC). The servers were purchased through Fermilab in July 2014 and stored there until Q4 2015, when we moved IT operations from Argonne to Notre Dame and the servers were placed into production.
As labeled by the CRC, the physical ("bare metal") servers are i2u2-vmhost01, i2u2-vmhost02, and i2u2-store01. i2u2-vmhost01 and i2u2-vmhost02 are ostensibly identical general-purpose servers. i2u2-store01 is the primary data storage server, with 28 TB.
The CRC has installed Red Hat Enterprise Linux on these servers, from which their resources are apportioned into several virtual machines (VMs) described below. All interaction that e-Lab developers have with the servers is in terms of the VMs, so you typically don't need to know the physical machine names unless something goes wrong.
Something goes wrong:
In December 2016 two of the drives failed on i2u2-vmhost02, along with its power supply unit. CRC engineers warned us that the failure of an additional drive would wipe out the VMs stored on that server. These VMs were moved to i2u2-vmhost01 (with reduced RAM) for safety until i2u2-vmhost02 could be repaired. The servers were still under a 3-year parts warranty from SuperMicro.
The CRC handled the warranty submission to SuperMicro, which shipped replacement parts. The server was repaired by the end of January 2017, and the affected VMs were returned to normal service over the following week (i2u2-data, being critical for the website function, had to wait until the next weekend to be moved).
The CRC engineers recommend against purchasing SuperMicro equipment in the future, since its products tend to be less robust than "Tier 1" equipment.
All e-Lab IT functions are performed on virtual machines (VMs) created on the two i2u2-vmhost bare-metal servers listed above. The CRC is in charge of the virtualization software running on the underlying RHEL OS, so contact them if you need a new VM, a clone of an existing VM, or anything similar.
The VMs are
More details on the VMs
|| Public IP
|| Server for e-Labs site www.i2u2.org
|| For development prior to deployment on i2u2-prod
|| Database server for user data to i2u2-prod / dev
|| Database server for physics data to i2u2-prod / dev
|| Server for quarknet.i2u2.org
|| Server for wiki.i2u2.org (this one) and bugzilla.i2u2.org
|| Temporary server to help fix a problem with LIGO in 2016. Since deleted
|| Jupyter Notebook server
Obtaining access to the VMs
The drives on all physical servers are kept in a RAID array as a first measure against data loss.
The VMs themselves are backed up nightly to tape.
Maintenance and Security
The VMs need to have their packages updated regularly using the system package manager in order to stay secure. After an update, a restart may be required.
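Whether a restart is actually pending can often be checked before scheduling one. The following is a minimal sketch that assumes a Debian/Ubuntu-style VM, where the package system creates the flag file /var/run/reboot-required when a restart is needed. This page does not state which distribution the VMs run, so treat that path as an assumption (RHEL-family systems use other tools). The function names reboot_pending and report_reboot are hypothetical.

```shell
#!/bin/sh
# reboot_pending [FLAG_FILE]
# Succeeds (exit 0) if the flag file exists, i.e. a restart is pending.
# Assumption: Debian/Ubuntu convention of /var/run/reboot-required.
reboot_pending() {
    flag_file="${1:-/var/run/reboot-required}"
    [ -e "$flag_file" ]
}

# report_reboot [FLAG_FILE]
# Prints a one-line status based on the flag file.
report_reboot() {
    if reboot_pending "$1"; then
        echo "restart required"
    else
        echo "no restart pending"
    fi
}

# On a VM:  report_reboot
```

Calling report_reboot with no argument checks the default path; passing a path is mainly useful for trying the function out.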
Naturally, you want to avoid restarting public VMs while users are logged in. In either i2u2-prod (www.i2u2.org) or i2u2-dev, you can log in as the administrator and select "Session Tracking" to see who's currently logged in. It's typical to have many users logged into the Cosmic Ray e-Lab on i2u2-prod, for example.
Once SSH'd into the VM itself, you can also check the Tomcat logs to see who's doing what on the website. Terminal commands that list login sessions are also useful to see who else is logged into the VM directly and what they're doing (this should only be other developers or sysadmins).
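The check for direct logins can be scripted before a restart. This is a minimal sketch, assuming the standard who utility is available on the VM; the function name count_login_users is hypothetical.

```shell
#!/bin/sh
# count_login_users
# Reads `who`-style output on stdin and prints the number of distinct
# user names (first column), ignoring blank lines.
count_login_users() {
    awk 'NF { print $1 }' | sort -u | wc -l | tr -d ' '
}

# On a VM:  who | count_login_users
```

A result greater than 1 means someone other than you has a shell open on the VM, so it is worth checking with them before restarting.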
Restarts to i2u2-prod and i2u2-db are best done at night (and preferably over the weekend) to avoid disrupting users.
We maintain Let's Encrypt SSL certificates for www.i2u2.org on i2u2-prod and for bugzilla.quarknet.org and wiki.quarknet.org on i2u2-wiki. The CRC maintains SSL certs for crc.nd.edu domains (e.g. i2u2-dev.crc.nd.edu) on servers where we don't serve anything under our own domains.
ELabs SSL Certificates
Confused by references to servers you've never heard of, like www18, www13, or data4? These were the names of servers we used when the e-Labs site was served from Argonne. Learn More
Unknown MySQL server host 'i2u2-db.crc.nd.edu'
This can happen when an e-Lab or CIMA attempts to pull data from i2u2-db, and it's generally a DNS resolution error (that is, the calling VM can't turn "i2u2-db.crc.nd.edu" into an IP address). First do the obvious cursory checks:
- SSH into i2u2-db to make sure it's up and connected to the network
- From the calling VMs (likely i2u2-prod and/or i2u2-dev), run $ ping -c 5 i2u2-db.crc.nd.edu
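Because the error is about name resolution rather than reachability, it can help to test the lookup on its own, separately from ping. The following is a minimal sketch, assuming getent is available on the calling VM. getent hosts consults the name-service sources configured on the machine (hosts file, DNS). The function names resolves and check_db_host are hypothetical.

```shell
#!/bin/sh
# resolves HOST
# Succeeds if the system resolver can turn HOST into an address.
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# check_db_host [HOST]
# Prints a one-line verdict; HOST defaults to the database VM's name.
check_db_host() {
    host="${1:-i2u2-db.crc.nd.edu}"
    if resolves "$host"; then
        echo "OK: $host resolves"
    else
        echo "FAIL: $host does not resolve"
    fi
}

# On the calling VM:  check_db_host
```

If the lookup fails while i2u2-db itself is up, the problem is on the resolver side of the calling VM.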
Assuming the above checks are nominal, these solutions have worked:
- Restart Apache. If you can't restart Apache - for example, if there are active e-Lab users:
- Use the direct IP. In the relevant code, try replacing "i2u2-db.crc.nd.edu" with the IP address of i2u2-db as given on the VM info page. This is mostly useful with CIMA where you can make direct changes to the code without deployment.
- Look at /etc/resolv.conf on the calling VM. On one occasion, this file became wrong, possibly after being overwritten by resolvconf. Ask the CRC engineers to confirm that the current file is correct; as of March 2017 the correct configuration is given here.
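When asking the CRC engineers to confirm the resolver configuration, it is handy to quote exactly which name servers the calling VM is currently using. The following is a minimal sketch that only reads the file and does not judge whether the entries are correct; the function name list_nameservers is hypothetical.

```shell
#!/bin/sh
# list_nameservers [FILE]
# Prints the address from each "nameserver" line of a resolv.conf-style
# file. Commented-out lines are skipped because their first field is not
# exactly "nameserver".
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# On the calling VM:  list_nameservers
```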
If none of these work, you can try restarting MySQL using either of
$ sudo service mysql restart
$ sudo /etc/init.d/mysql restart
If that fails, reboot the calling VMs if possible.
-- Main.JoelG - 2016-12-07