Apologies for absence
Alastair, Craig and Toby had sent their apologies.
Minutes of the last meeting
These were accepted.
Report from Computing Executive Group
This item was not taken.
Reports from units
Ian reported that all the networking equipment had been installed on level 8 in Appleton Tower in anticipation of the occupation of the floor tomorrow. Stephen had installed two new servers in the new rack in the Appleton Tower server room (see Stephen's report below) and this had enabled the Infrastructure Unit to start monitoring the behaviour of the new networked power block that had been installed in that rack.
The primary and backup hardware for the monitoring system has now been installed in Appleton Tower and JCMB respectively. Simon has announced the availability of the monitoring service to computing staff and invited people to start converting their components to use it.
Simon has rewritten the cosign component to leave the configuring of the apache server to the apache component. This makes it easier to integrate it with iFriend.
Stephen investigated a couple of problems with the air-conditioning in the JCMB server room. The integrated air-conditioning, which blows cooled air through grilles in the floor, does not appear to be moving enough air to handle the load. A backup air-conditioning system mounted in the ceiling is running permanently to make up the shortfall. However, one of the two units comprising this backup system has failed. A heating engineer had said that certain components needed to be replaced, but he has not yet returned to replace them. There is a 5°C temperature differential across the room.
Stephen reported that a long-standing bug in the LCFG server had been fixed. The bug had prevented one from having nested lists of resources beyond a list of lists. It is now possible to have nested lists to an arbitrary depth.
Stephen reported that two new machines had been installed in the new rack in the Appleton Tower server room: mousa, one of the two new servers that will replace the LCFG slave servers, and bressay, a new LCFG test server which will be used for testing the LCFG software. The new slave LCFG server can compile profiles twice as quickly as the old slave server; however, it will not be brought into service until the second of the new slave LCFG servers is installed in the JCMB server room. Neil mentioned that a Dell PE 650 (arnie), which had a dead disk and was not being used at present, could be moved out of its rack to make space for the new slave LCFG server.
They have added monitoring support to the rsync component and, using Simon's modified ssh component, they are now able to monitor whether or not all the MPU servers are contactable.
Chris and staff from EPCC have been working on the port of LCFG to Scientific Linux 5 and things are progressing well. The core LCFG components have all been ported already.
The newer patched 2.6.20 kernel for FC6 machines will be in the stable release this Thursday. Ken confirmed that they can ship the framework for automatically rebooting machines to FC6 machines (it is currently available only for FC5). Ken said that he would not introduce the auto reboot facility for lab FC6 machines until the end of the month, to avoid any possible disruption to MSc students running long experiments on lab machines.
Research and Teaching
Tim submitted the following report after the meeting:
New firefox package should appear on stable machines at the end of this week (currently on develop) to sort out the printing problem. Actually it looks like Iain has held off doing this, possibly because he is on holiday this week.
Iain has found some possible solutions to the Condor/LDAP problems. There should be ways to request the memory that jobs need when they are submitted and to kill them if they exceed this amount, as well as to kill jobs if they exceed some percentage of available memory. These will be tried out on the test cluster first.
Added Sicstus3 to FC6 distribution - in develop at moment, stable a week on Friday. This is in parallel with Sicstus4. This was needed because of some 3rd party research software and user code not working with Sicstus4.
Antlr (teaching package for FC6) is now available.
Neil reported that a 36GB disk drive had been added to the mail server nutty and that he had moved all the service unit staff mailboxes onto the new disk, together with the mailboxes of 179 former users. This had freed up a rather modest 3GB on the nearly full primary mailbox disk. However, this was enough to solve the immediate problem of a dangerously full primary disk.
On checking with IS it turns out that this is a side-effect of an ongoing denial-of-service attack on the University mail relays.
A mono laser printer from the level 5 support office in Appleton Tower will be installed on level 8 as a temporary solution to the new occupants' printing needs.
When a reboot of har was initiated yesterday, it failed to reboot because the inittab file was empty. Craig restored the inittab file from a backup copy on the disk and restarted har manually. However, har then proceeded to reboot again. This morning it was noticed that afs was not working on har. It turned out that the afs package was not installed, so Neil reinstalled the package and restarted afs.
The status of the link between our machines (and specifically har) and hawthorn seems to have changed again (it had been completely severed recently). Neil will wait until he can speak to George about this before he pursues it any further.
Craig and Gordon have added the room data for bookable spaces on the new levels 3, 4 and 5 to the Appleton Tower shezhu room booking system. They are assessing the usability of shezhu and the University calendar service as a future room booking system.
Since the last Operational Meeting the User Support Unit had handled 161 new RT tickets (equivalent to about 16 per working day) and resolved 57% of them. There had been a total of 145 tickets (including both new and existing tickets) resolved over the same period.
Ken reported some figures showing resolution rates against age of recent RT tickets:
|Ticket Age D|Percentage Resolved (or rejected)|
|1 week < D < 2 weeks|70%|
|2 weeks < D < 1 month|80%|
|1 month < D < 3 months|91%|
The total number of accounts (not including temporary ones) which have an AFS home directory is now 181 (an increase of 13 in the last fortnight).
Ken reported that he had started to clean up the quotas file in preparation for analysing the data for students, with a view to rationalising the quotas for individual courses and years of study. Ken asked Neil about the behaviour of the '+' option when defining quotas for courses, and it looks as if the quota generation software should ideally be changed so that quotas are completely additive.
Ken asked Neil about the need for documentation for fixing broken printer database files. Neil said that it should be sufficient now to just restart the printer component.
Ken has committed new versions of the accgen and update_users scripts to the CVS repository but has not yet built or shipped a new dice-accntmgr rpm. The new update_users script fixes a problem with the way it handled AFS home directories. The new accgen supports creating either an NFS-based or an AFS-based home directory, and has an additional way of supplying the list of uuns to create accounts for (namely from a file).
About 150 machines had been upgraded/installed with FC6 in the last two weeks. Ken presented the following figures for numbers of FC6 and FC5 machines:
There was none.