The Dice Project

Operational Meeting: Infrastructure Unit Report
18th November 2020

  1. Server room access arrangements

    We have a wiki page which collects together the various documents relating to access to and working in the server rooms post-lockdown. Note that if you do need to go to the server rooms for any reason you MUST check the bookings wiki page to ensure that you don't conflict with anyone else, and particularly with any prior bookings; and you MUST record your visit on that page.

    There is a new version of IS's AT method statement, which reinstates the removal of kit. Please read it before you next go to the AT server room. In short: you have to notify them before you go to the AT server room; if you're removing something you have to tell them what (it's probably enough just to say something along the lines of "old servers from the Informatics racks to make space for new ones"); and you have to take everything you remove from the racks out of the server room.

  2. Power bar loads

    A reminder that the power-bar trap logs that we send nightly to cos need to be read and acted on by the relevant machines' managers. In particular, "overload" reports need to be taken seriously, as they indicate that a breaker trip is very near.

    Note: the AT circuits are all already 32A, we have no spares with which to connect up additional bars, and getting any additional power provision added is not likely to be straightforward, or indeed affordable. The power budget for the AT racks is therefore effectively fixed as-is.

  3. EdLAN

    We reinstated 16.172.in-addr.arpa and 168.192.in-addr.arpa on DICE machines last Thursday morning. We then started to see reports from some machines that the serial numbers for these zones were going backwards. Oddly, however, the datestamps on the log entries were quite a few days in the past. On further investigation these appeared to be desktop machines in the process of being reinstalled, so we are assuming that some feature of that process causes a batch of old log entries to be sent to the loghost. (Example: jagua.) We haven't investigated further, but would note that previous confusing examples of these log entries may have had the same cause.
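
    For reference, DNS zone serial numbers are conventionally compared using RFC 1982 serial number arithmetic rather than plain integer comparison. The following is a minimal sketch of that comparison, not a description of what the DICE nameservers or their logging actually do. The function names and the serial values in the usage note are hypothetical.

```python
# Sketch of RFC 1982 serial number arithmetic for 32-bit DNS SOA serials.
# Function names are illustrative and are not taken from any DICE tooling.

SERIAL_BITS = 32
MOD = 1 << SERIAL_BITS          # serials live in the range 0 .. 2**32 - 1
HALF = 1 << (SERIAL_BITS - 1)   # comparisons are only defined within half the range


def serial_lt(s1: int, s2: int) -> bool:
    """Return True if s1 is 'less than' s2 under RFC 1982 arithmetic."""
    if not (0 <= s1 < MOD and 0 <= s2 < MOD):
        raise ValueError("serials must be unsigned 32-bit integers")
    return (s1 < s2 and s2 - s1 < HALF) or (s1 > s2 and s1 - s2 > HALF)


def went_backwards(ours: int, received: int) -> bool:
    """Return True if the received serial is older than the one we hold."""
    return serial_lt(received, ours)
```

    With hypothetical date-style serials, went_backwards(2020111201, 2020111101) is true, while a wrap from 4294967295 to 0 does not count as going backwards.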

    The Fortinet certificates being presented occasionally are apparently due to a firmware bug. IS are trying to find mitigations. The IS alert is still open.

    We have an outline description document which is intended to summarise the way the new EdLAN is expected to look, various "thinking" documents linked from this index page, and a project to investigate our interaction with the new DDI. Comments and questions are always invited.

  4. OpenVPN

    As there seems to be an increasing tendency for IS to put "internal" servers on net-10 addresses, we have added some new OpenVPN configuration files to make these easier to use. They're named ...EdLAN+10..., and you can find them in the usual place linked from the OpenVPN computing.help pages. The iOS and Windows versions appear to work as expected, but please install and test them on the various platforms. We'll update the documentation in a few days, and probably write a blog article.
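
    When testing, it may help to confirm whether a given server address is actually a net-10 address. The sketch below assumes that "net-10" means the RFC 1918 block 10.0.0.0/8. The function name and the example addresses are hypothetical, and nothing here reflects the contents of the generated OpenVPN configurations.

```python
# Sketch: classify an address as net-10, assuming net-10 == 10.0.0.0/8.
import ipaddress

NET_10 = ipaddress.ip_network("10.0.0.0/8")


def is_net10(address: str) -> bool:
    """Return True if the address falls inside 10.0.0.0/8."""
    return ipaddress.ip_address(address) in NET_10
```

    For example, is_net10("10.1.2.3") is true and is_net10("192.168.1.1") is false.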

    (The configuration-generation process has been tweaked slightly again to make it easier to generate the Makefiles that generate the configurations.)

  5. Server room cooling

    The work on the District chilled water supply on Saturday 7th November highlighted the fact that the total air-conditioning in the small self-managed server room IF-B.01 is now only just sufficient to cool the equipment in that room - provided that nothing else goes wrong.

    We are talking to Estates about options to improve the cooling but, for now, we need to keep an eye on the use of that room, and we shouldn't install any additional equipment in it. (In the past, we have deliberately removed power-hungry equipment from the room in an attempt to ameliorate the same problem.)

    For the record: the current average total power consumption of the computing equipment in IF-B.01 is about 15 kW; and - we think - the cooling capacity in the room is rated at a total of 12 kW (being 9 kW for the Daikin and 3 kW for the AHU).

  6. Retiral of our Yubikey testbed

    Project 313, the 'Pilot service for Yubikey two-factor authentication' (see the associated home page, blog site and final report) produced, among other things, several testbed servers and services, as follows:

    1. A canonical Cosign test site: https://canary.inf.ed.ac.uk
    2. A test ssh service which implements two-factor authentication via Yubikeys: quail.inf.ed.ac.uk
    3. A test Cosign server which supports two-factor authentication via Yubikeys: albatross.inf.ed.ac.uk a.k.a. https://webloginotptest.inf.ed.ac.uk

    At the time of the project, these were all produced under SL6 - but they were all later ported to SL7, and have been kept operational ever since in case they were needed for reference. Now, in the spirit of deleting VM guests which are no longer strictly required, we propose to completely remove all of these servers and services. Of course, the associated LCFG profiles will be retained (in archive form), but some hand configuration and knowledge will be lost.

    If anybody would prefer that any of this material were to be retained, please say so now.

  7. snort

    ... has been removed from all of the network servers and the loghost, though with the configurations still in place to allow it to be added again if we wanted to. Upgrading to snort3 is likely to take a bit of work, should we want to do that.

  8. Users with teaching-staff role

    The teaching-staff role was added in 2019 for Teaching Support Providers (TSPs). At the time we didn't want these accounts to be used (we think because of the prevalence of the 'submit' role), so we gave the teaching-staff role the 'prometheus/password/noportal' entitlement, which stops the accounts being enabled. This has caused issues, with users of these accounts being unable to set their passwords and hence unable to use these (legitimate) accounts. We believe this restriction is no longer required, so we will remove the entitlement from the role unless anyone indicates otherwise.

  9. Expiring certificates

    A reminder that we have some certificates which will expire at the end of the year (or, more accurately, the intermediate certificate authority used to sign them is due to be revoked). See Ops inf-unit report 2020-10-07. Of the certificates listed, rt4.inf.ed.ac.uk and issrt.inf.ed.ac.uk still need to be dealt with. The others have either been converted to Let's Encrypt, or are seemingly no longer active (dl.xrdp.inf.ed.ac.uk). Our strong preference is for the remaining sites to be converted to using Let's Encrypt certificates, although we can still obtain JANET certificates manually.
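
    As a rough aid for checking the remaining sites, the sketch below reads a server's leaf certificate and computes the days left before its notAfter date, using only the Python standard library. It is an illustration under assumptions, not an existing DICE tool. The function names are hypothetical, as is the date in the usage note. It also looks only at the leaf certificate's expiry, so it would not by itself detect the revocation of an intermediate certificate authority.

```python
# Sketch: days remaining before a certificate's notAfter date.
import socket
import ssl
from datetime import datetime


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days from 'now' to a notAfter string such as 'Dec 31 23:59:59 2020 GMT'.

    'now' should be timezone-aware (e.g. UTC); a naive datetime would be
    interpreted as local time by datetime.timestamp().
    """
    expiry_seconds = ssl.cert_time_to_seconds(not_after)
    return (expiry_seconds - now.timestamp()) / 86400


def fetch_not_after(host: str, port: int = 443) -> str:
    """Connect to host:port over TLS and return the leaf certificate's notAfter."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

    For a hypothetical notAfter of "Dec 31 23:59:59 2020 GMT", evaluated at midnight UTC on 18th November 2020, days_until_expiry returns just under 44.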

inf-unit-report.html,v 1.41 2020/11/17 16:34:34 gdmr Exp

Please contact us with any comments or corrections.
Unless explicitly stated otherwise, all material is copyright The University of Edinburgh