The Dice Project


Overview

The Infrastructure Unit operates the following general services across three sites: the Informatics Forum (including Bayes), Appleton Tower and JCMB. Each site is set up so that it can operate as autonomously as possible, while at the same time providing redundant services to the other sites.

We use virtual machines where it is reasonable to do so. Where we do have physical machines this is either for performance reasons or to avoid dependencies during fault and cold-start situations.

Network infrastructure and services

Each site's Ethernet switch configuration has been tailored to that site's particular circumstances. At the time of writing (June 2020) we have around 170 network switches in the Forum, around 80 switches in Appleton Tower, 14 switches in Wilkie Building, and 4 switches in JCMB.

The Forum also has a pair of Fibre Channel (FC) fabrics. FC is now generally being deprioritised.

The Forum and Appleton Tower each have four network infrastructure machines, as follows:

At JCMB we have only three machines: one combining the first three roles above, a second (VM) acting as external nameserver, and a third acting primarily as console server but set up so as to be able to take on the network roles if required.

Consoles

Each site has one "console server" machine, acting as a central point for all of the site's IPMI, KVM and Lantronix-serial consoles. In addition, the console servers act as console servers for each other, and in a few cases we have an off-site console server set up for critical machines.

Monitoring

A nagios monitoring service is integrated with our machine configuration system. We have two nagios machines:

The primary is a real machine, to minimise dependencies, and the secondary a VM.

Authentication services

We use Kerberos for authentication. There is one master KDC, in the Forum, and one slave KDC at each site. The iFriend master KDC is in the Forum, with a slave in Appleton Tower and an additional slave at KB (this one is non-operational and is there purely to provide a backup away from the central area).

Although we don't make a lot of use of kx509 at the moment, we still run a kx509 service. There are currently two KCAs, both on VMs.

Most web authentication now uses cosign. We currently have cosign servers (physical machines) in the Forum and Appleton Tower. These also co-host the iFriend KDCs. These services are not suitable for co-locating with the main KDCs for security reasons.

Hosts and services requiring a locally-signed X.509 certificate obtain this using the sixkts service. As this is not a high-availability requirement, we currently have one sixkts server on a VM.

Directory services

The OpenLDAP master is currently in the Forum, and there are site slaves in the Forum, Appleton Tower and JCMB. There are also several lightweight slave servers running on VMs. Together these slaves handle all traditional client/server LDAP provision, which clients reach via sssd.

Account management (prometheus)

The prometheus system runs on a VM in the Forum, with a read-only replica on a VM in AT. A read-write version would not be an easy service to replicate, but immediate availability is not a requirement, and we could move the service to another machine should the main one be unavailable. That machine could be the existing replica, a development server, the JCMB OpenLDAP server (which is kept up to date nightly with prometheus data) or indeed another VM, as decided at the time.

Infrastructure Unit Kit Lists

Linux servers

Real machines, sorted by date and then hostname:
Name Type Location Role S/N P/O & date Warranty Replace UPS (if non-building) Comments
darwin R320 AT server room extDNS, extNTP CB3YC5J inf1748 2012-06-18 5Y With a suitable roll-down AT comms Formerly norrington
linnaeus R320 Forum server room extDNS, extNTP BB3YC5J inf1748 2012-06-18 5Y With a suitable roll-down   Formerly elder
rattle R320 Forum server room testloghost G1KL9X1 inf2540 2013-03-28 5Y     Replace with a VM at some point
knussen R320 Forum server room Retained as a hot spare 9CKKT02 inf3719 2014-03-18 5Y With a suitable roll-down 3kVA JS0511022795 Hot spare Forum / Bayes netInf
WARRANTY EXPIRED ABOVE HERE
curtis R220 JCMB server room JCMB KDC C4B3952 inf5359 2015-04-29 5Y 2020-21    
heaton R220 Forum server room KDC master 35B3952 inf5359 2015-04-29 5Y 2020-21    
moffat R220 AT server room AT KDC H2B3952 inf5359 2015-04-29 5Y 2020-21    
runnicles R320 AT server room AT extRt 9HJ8952 inf5382 2015-05-07 5Y 2020-21 AT comms  
campbell R320 AT server room LDAP site-slave D4J8952 INF5380 2015-05-07 5Y 2020-21    
nelson R320 Forum server room LDAP site-slave J2J8952 INF5380 2015-05-07 5Y 2020-21    
polly R320 Forum server room LDAP master 34J8952 INF5380 2015-05-07 5Y 2020-21    
Warranty extended by 1 year above here (RT#101544)
copernicus R330 AT server room loghost 4RTZGD2 inf6533 2016-06-13 5Y 2020-21 AT comms  
dutoit R330 Forum server room Forum netInf 4SQYGD2 inf6532 2016-06-13 5Y 2020-21 3kVA XL QS0348111013  
klaxon R330 Forum server room Nagios master DS6RGD2 inf6531 2016-06-13 5Y 2020-21    
klein R330 JCMB server room LDAP site-slave/prometheus DR DS5SGD2 inf6531 2016-06-13 5Y 2020-21    
maytals R330 Forum server room Forum KDC 5RM6DK2 inf7210 2017-06-02 5Y 2021-22??    
pappano R330 AT server room AT netServ, site DNS, OpenVPN, DHCP (static) 5RZ4DK2 inf7207 2017-06-02 5Y 2021-22?? AT comms  
pinnock R330 JCMB server room KB extRt, netInf, DHCP 5SJ6DK2 inf7208 2017-06-02 5Y 2021-22??    
sinopoli R330 JCMB server room KB consoles 5SC5DK2 inf7209 2017-06-12 5Y 2021-22??    
charmoz R330 Forum server room Forum consoles 3FQ6DP2 INF7975 2018-04-04 5Y 2022-23??   Replaces blatiere
hazlewood R330 Forum server room Forum netServ, Bayes netInf, site DNS, OpenVPN, DHCP (static) 2WBFDP2 INF7976 2018-04-04 5Y 2022-23?? 3kVA JS0510014805 Replaces rattle
hubley R330 AT server room AT cosign; ifriend KDC slave 3FQ8DP2 INF7973 2018-04-04 5Y 2022-23??   Replaces mcintyre
kaplan R330 Forum server room Forum cosign; ifriend web; ifriend KDC master 3FQ7DP2 INF7973 2018-04-04 5Y 2022-23??   Replaces hanlon
Machines below are outwith the current three-year kit planning period (F/Y 2020-23)
courtes R340 AT server room AT consoles 8CTJ2W2 INF9030 2019-03-12 5Y 2024-25??   Replaces grepon
oramo R340 AT server room AT netInf 2DTJ2W2 INF9041 2019-03-12 5Y 2024-25?? AT comms Replaces gatti
rilling R340 Forum server room Forum extRt 2GXJ2W2 INF9031 2019-03-12 5Y 2024-25?? 3kVA JS0511022795 Replaces knussen
harrison Meinberg M200 Forum 5A closet NTP S1 060211041290 INF9346 2019-06-21 3Y ?? TBD Replaces crystal and Acutime
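
The "WARRANTY EXPIRED ABOVE HERE" and "Warranty extended" markers in the table can be cross-checked from the purchase date and warranty length. The following is a minimal sketch only. It assumes that cover ends on the anniversary of the P/O date, that the one-year extension (RT#101544) applies to the 2015 purchases, and that 2020-06-30 (the date on this page's revision line) is the reference date. The function name `warranty_end` and the sample list are illustrative and are not part of any DICE tooling.

```python
from datetime import date


def warranty_end(purchased: date, years: int, extension_years: int = 0) -> date:
    """Return the anniversary date on which warranty cover is assumed to end."""
    target_year = purchased.year + years + extension_years
    try:
        return purchased.replace(year=target_year)
    except ValueError:
        # Purchased on 29 February and the target year is not a leap year.
        return purchased.replace(year=target_year, day=28)


# (hostname, P/O date, warranty years, extension years), copied from the table.
# The extension value for heaton reflects an assumed reading of RT#101544.
SAMPLE_MACHINES = [
    ("knussen", date(2014, 3, 18), 5, 0),
    ("heaton", date(2015, 4, 29), 5, 1),
    ("klaxon", date(2016, 6, 13), 5, 0),
    ("harrison", date(2019, 6, 21), 3, 0),
]

AS_OF = date(2020, 6, 30)  # assumed reference date (page revision date)

for name, purchased, years, extension in SAMPLE_MACHINES:
    end = warranty_end(purchased, years, extension)
    status = "expired" if end < AS_OF else "in warranty"
    print(f"{name:10s} {end.isoformat()} {status}")
```

Under these assumptions only knussen falls on the expired side of the reference date, which matches its position above the "WARRANTY EXPIRED" marker.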

Virtual machines, sorted loosely by functionality and name:
Name Location Role Comments
buchanan IF kca, misc  
dammers KB sixkts, kca  
dingaling AT Nagios secondary  
envat KB env.at.net Off-site; added for completeness
envif AT env.if.net Off-site; replaces function on dutoit
envkb AT env.kb.net Off-site; replaces function on pinnock
descartes KB Test RADIUS server  
euclid AT Test RADIUS server  
pythagoras IF Test RADIUS server  
handbag IF Wallet server  
huxley IF Test nameserver/timeserver  
wallace KB extDNS  
redding IF Prometheus master  
mitchell AT Prometheus read-only replica  
damflask IF LDAP lightweight slave  
hutter KB LDAP lightweight slave  
redmires IF LDAP lightweight slave  
schneider AT LDAP lightweight slave  

Machines in the Forum and AT server rooms are covered by the inbuilt UPSes, and are shown with a blank in the "UPS" column unless they have some additional provision. Machines in the JCMB server room may be powered by one of the "rack" UPSes; these too are shown with a blank in the "UPS" column unless they have some additional provision.
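
Non-blank entries in the "UPS" column identify a unit either by nickname ("AT comms") or by rating plus serial number, which can be matched against the UPS table on this page. As an illustration only, the lookup might be sketched as below. The serial-to-name pairs are copied from the UPS table, while the function name `resolve_ups` and the wording of the returned strings are assumptions made for this example.

```python
# Serial number -> UPS name, copied from the UPS table on this page.
UPS_BY_SERIAL = {
    "QS0348111013": "core.ups.f.net",
    "JS0511022795": "netInf.ups.f.net",
    "JS0510014805": "netServ.ups.f.net",
}

# Nickname used in the kit list -> UPS name.
UPS_BY_NICKNAME = {"AT comms": "atb0.ups.at.net"}


def resolve_ups(cell: str) -> str:
    """Map the text of a kit-list "UPS" cell to a UPS name."""
    cell = cell.strip()
    if not cell:
        # Blank means no additional provision beyond the building or rack UPS.
        return "building or rack UPS"
    if cell in UPS_BY_NICKNAME:
        return UPS_BY_NICKNAME[cell]
    # Entries such as "3kVA XL QS0348111013" end with the serial number.
    serial = cell.split()[-1]
    return UPS_BY_SERIAL.get(serial, f"unknown ({cell})")


for host, cell in [("dutoit", "3kVA XL QS0348111013"),
                   ("rilling", "3kVA JS0511022795"),
                   ("oramo", "AT comms"),
                   ("heaton", "")]:
    print(host, "->", resolve_ups(cell))
```

An entry that matches neither a nickname nor a known serial (for example "TBD") is reported as unknown rather than guessed.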

UPSes

(Sorted by location and role, more or less...)
Name Type Role Location S/N P/O & date Rating Battery replaced Comments
atb0.ups.at.net ("AT comms") Smart-UPS 3000 RM AT core network comms cabinet JS0510018447 a631778 2005-07-11 3kVA 2018-04 Formerly AT5
at6.ups.at.net Smart-UPS 3000 RM AT6 AT6 comms room JS0511022967 a631778 2005-07-11 3kVA 2014-10 Provides PoE for phones
at7.ups.at.net Smart-UPS 3000 RM AT7 AT7 comms room JS0714011678 ikb0141 2007-06-19 3kVA   Provides PoE for phones
at9.ups.at.net Smart-UPS 750 RM AT9 comms room Comms rack AS0444223639 a627391 2004-12-02 750VA ?? Essential services supply
core.ups.f.net SMART-UPS 3000 RM XL Forum core0 comms racks QS0348111013 a614789 2004-02-04 3kVA 2017-09  
netInf.ups.f.net Smart-UPS 3000 RM Forum core1 comms racks JS0511022795 a631778 2005-07-11 3kVA 2017-03  
netServ.ups.f.net Smart-UPS 3000 RM Forum core2 comms racks JS0510014805 a631753 2005-06-15 3kVA 2020-01  
(USB) Smart-UPS 1500 Test & development AT-7.06a YS0315121217 a609907 2003-08-14 1500VA 2013-03  
rack0.ups.kb.net Smart-UPS SRT 3000 JCMB server room Rack 0 [S]AS1852290560 SQ1575069 2019-08-20 3kVA    
rack1.ups.kb.net Smart-UPS 3000 RM JCMB server room Rack 1 JS0511022966 a631778 2005-07-11 3kVA 2016-07  
rack2.ups.kb.net Smart-UPS 3000 RM JCMB server room Rack 2 JS0510018437 a631778 2005-07-11 3kVA 2017-02  
rack3.ups.kb.net Smart-UPS 3000 RM JCMB server room Rack 3 JS0617023554 a637388 2006-06-27 3kVA 2014-10 Formerly AT4
(USB) Smart-UPS 1000 GPS clock and server Forum 5A closet AS0447230740 a631793 2005-07-11 1kVA 2018-04  

There's a separate table listing which batteries are required by our various UPS models.
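
The "Battery replaced" column lends itself to a quick age check. The sketch below is a hedged example rather than an existing tool: it assumes the column holds either a "YYYY-MM" value, "??" or a blank, takes 2020-06-30 (this page's revision date) as the reference date, and uses a handful of rows copied from the table above. The name `months_since` is hypothetical.

```python
from datetime import date
from typing import Optional


def months_since(replaced: str, as_of: date) -> Optional[int]:
    """Whole months from a 'YYYY-MM' replacement date to as_of, or None if unknown."""
    replaced = replaced.strip()
    if not replaced or replaced == "??":
        return None
    year, month = (int(part) for part in replaced.split("-"))
    return (as_of.year - year) * 12 + (as_of.month - month)


# UPS name -> "Battery replaced" value, copied from the table above.
SAMPLE_BATTERIES = {
    "atb0.ups.at.net": "2018-04",
    "at6.ups.at.net": "2014-10",
    "at7.ups.at.net": "",
    "at9.ups.at.net": "??",
    "netServ.ups.f.net": "2020-01",
    "rack3.ups.kb.net": "2014-10",
}

AS_OF = date(2020, 6, 30)  # assumed reference date

for name, replaced in SAMPLE_BATTERIES.items():
    age = months_since(replaced, AS_OF)
    label = "no replacement recorded" if age is None else f"{age} months"
    print(f"{name:20s} {label}")
```

Entries with no recorded replacement are reported separately rather than treated as new, since a blank does not say when the battery was last changed.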


IUkit.html,v 1.508 2020/06/30 08:54:50 gdmr Exp


Please contact us with any comments or corrections.
Unless explicitly stated otherwise, all material is copyright The University of Edinburgh