We were too quick in assuming that the AT basement power issues were fixed. Paul Hutton writes: "Apologies for the mass mailing. During the recent work on the backup electrical power system at the Appleton Tower datacentre, a further problem was identified with the electrical switching components. ..."
There'll be a total shutdown of the AT server room to get this fixed. It now looks as though this will be some time in the New Year.
Ian and Chris have been looking at the Dell iDRAC 8. The results are here.
REMINDER: if you install an SL7 machine on any of the "server" wires (S32 and S33 in the Forum, AT1 in Appleton Tower, S at JCMB) then you will also get an IPv6 address and IPv6 global routes. If you add that address to the DNS you will then get firewall holes for it. However, in contrast to the client wires, forward SLAAC-style DNS entries are not being automatically generated for machines on these wires. This has implications in both directions:
Perhaps the most important thing to beware of is assuming that any IPv4 access controls you have in place also apply to IPv6. They might. Then again, they might be totally independent and default to "open".
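One quick way to see both sides of this — whether a machine's forward DNS actually carries an AAAA record (and so will get IPv6 firewall holes), and whether there is an IPv6 address you need to write access controls for — is an address-family-aware lookup. A minimal sketch in Python; the hostname is whatever machine you're auditing:

```python
import socket

def dns_addresses(hostname):
    """Return (IPv4 set, IPv6 set) of addresses that a forward
    DNS lookup gives for hostname."""
    v4, v6 = set(), set()
    for family, _, _, _, sockaddr in socket.getaddrinfo(hostname, None):
        if family == socket.AF_INET:
            v4.add(sockaddr[0])
        elif family == socket.AF_INET6:
            v6.add(sockaddr[0])
    return v4, v6
```

If the IPv6 set comes back empty but the machine has a global IPv6 address, nothing will have been registered in the DNS for it; if it comes back non-empty, that address is reachable and its access controls need auditing separately from the IPv4 ones.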
The new version of the problematic 54xx firmware which appeared on HPE's site should apparently fix our issues, though the release-note entry is cryptic at best. It was installed on core1 on Wednesday 7th, and all seems to have been running properly since. TCP PSH preservation has also been turned off on all of the other core switches, as well as on the Forum server room edge switches. (It will default to off in the next new firmware versions we install on them.)
Even so, we'll hold off doing any other core switch reboots until after the holidays, though the firmware has been uploaded to them all ready to go just in case.
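For reference, on ProVision/ArubaOS-Switch firmware the PSH-preservation behaviour is controlled from the configuration context; assuming the command follows the feature's name (we believe it does on recent firmware, but check your release's manual), disabling it looks something like:

```
switch(config)# no tcp-push-preserve
switch(config)# write memory
```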
From the release-note entry: "0000217339 - TCP - The HPE Provision switches prioritize received TCP packets with the PSH flag set by moving the packets to the head of the inbound port's processing queue. But due to increased levels of such packets in today's networks, the prioritized processing could potentially lead to head-of-line (HoL) blocking and subsequent dropping of inbound data packets. ..." That appears to include BPDUs, and once the spanning tree gets disrupted packets will start to be flooded, and the whole thing will just collapse from there. What they did in the K.16.02 firmware to break it isn't clear. Or perhaps we were teetering on the edge and something just tipped us over.
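For anyone inspecting captures for the packets in question: the PSH bit is one of the TCP flag bits in byte 13 of the TCP header (PSH is 0x08, per RFC 793). A minimal sketch of picking the flags out of a raw header:

```python
def tcp_flags(header: bytes) -> dict:
    """Decode the common flag bits from byte 13 of a raw TCP header."""
    flags = header[13]
    return {
        "FIN": bool(flags & 0x01),
        "SYN": bool(flags & 0x02),
        "RST": bool(flags & 0x04),
        "PSH": bool(flags & 0x08),  # the bit the switches were prioritising
        "ACK": bool(flags & 0x10),
        "URG": bool(flags & 0x20),
    }
```

Any segment carrying application data with "push this to the receiver now" semantics — which, as the release note says, is most interactive traffic these days — will have that bit set.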
There was a power cut at KB on Wednesday 16th November. CHP restart apparently happened on 7th December.
The new CSR PoP switches have been brought into service. The necessary components for our own connection are on order, and once they arrive we'll arrange to move our own links across: 1x 10GbaseSR bridged and 2x (or 3x) 1000baseT routed.
Version 2.4 has moved from beta to rc1. There are sufficiently many useful new things in this version that we're testing it on the DEV endpoint (and perhaps the DR endpoint soon). In particular, IPv6 support is now pretty much up to the same level as IPv4, and has worked as advertised in testing.
Unfortunately, setting up the daemon to listen on more than one address doesn't work the way we had hoped. Essentially it can be made to listen either on exactly one IPv4 address, or on exactly one IPv6 address, or on a wildcard address. The last of those would allow us to listen on both IPv4 and IPv6, but it requires kernel support which first appeared in 3.15, so we'll just have to wait and see whether RH have backported it when we finally come to upgrade the endpoints to SL7. (We're currently waiting for support to stabilise.)
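For illustration, the general wildcard dual-stack mechanism looks like the following sketch (this shows ordinary socket behaviour, not the daemon's own configuration, and says nothing about the specific kernel-3.15 feature the daemon needs): an IPv6 socket bound to the wildcard address with IPV6_V6ONLY cleared will also accept IPv4 traffic, delivered as v4-mapped addresses.

```python
import socket

def dual_stack_listener(port):
    """Bind one UDP socket that accepts both IPv4 and IPv6 datagrams,
    by clearing IPV6_V6ONLY on a wildcard-bound IPv6 socket."""
    s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    # 0 => also accept IPv4, which arrives as ::ffff:a.b.c.d addresses
    s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
    s.bind(("::", port))
    return s
```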
Logging to the old loghost tycho was dropped from last week's <stable> release, prompting a drop in traffic. We'll leave it running for a few weeks to catch any residual stray machines, and then turn it off.
Please contact us with any comments or corrections.
Unless explicitly stated otherwise, all material is copyright The University of Edinburgh