Apologies for absence
There were no apologies.
Minutes of the last meeting
These were accepted.
Report from Computing Executive Group
This agenda item was not taken.
Reports from units.
The new KDC master, barrett, is up and working. The new LDAP master server hardware, a PowerEdge 2950 hostname franklin, has almost been configured.
There are about four places in which the Kerberos ticket renewal time is set, and the minimum of these values is the one that takes effect. The current settings are such that tickets cannot in fact be renewed. The global value for the renewal time will be changed to one month. The renewal time for tickets associated with user principals will be unchanged (so that by default they cannot be renewed), but a script will be provided so that the renewal time can be set to a value that lets long-running jobs renew their tickets up to some maximum length of time (to be decided). The renewal time for AFS principals will also need to be increased.
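The "minimum of these values" rule can be illustrated with a minimal sketch. The setting names below are hypothetical labels, not the actual configuration keys, and the durations are assumed purely for illustration (one month is taken as 30 days).

```python
from datetime import timedelta


def effective_renew_time(settings):
    """Return the renewal time that takes effect: the smallest configured value."""
    return min(settings.values())


# Hypothetical labels for the places where a renewal time is set.
current = {
    "global": timedelta(days=30),      # assumed: one month as 30 days
    "user_principal": timedelta(0),    # unchanged default: not renewable
    "afs_principal": timedelta(0),     # assumed to be too small at present
    "client_request": timedelta(days=7),
}

# Assumed values after a user has run the proposed script and the
# AFS principal's renewal time has been raised.
after_script = dict(
    current,
    user_principal=timedelta(days=14),
    afs_principal=timedelta(days=14),
)
```

With these assumed values, `effective_renew_time(current)` is zero, so tickets cannot be renewed, while `effective_renew_time(after_script)` is seven days because the smallest remaining value wins.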
All console servers have now been upgraded to FC5. All the Dell GX110 machines have been replaced and we are now using 3 Dell PE750s and 3 HP D530s.
George has placed the most recently estimated dates for the Appleton Tower building developments on the wiki page for AT building news. Following the release of each refurbished floor the Infrastructure Unit and the technicians will install the switches and connect up all the patch cables. All VLANs currently used in Appleton Tower will be available on the new floors, so it will remain for the User Support Unit to configure the switches for each desktop that is moved from one floor to another so that a machine is placed on the same VLAN after the move as it was on prior to the move. Ken will check with Gordon Duckett about the physical move of equipment between the floors.
George asked that as many people as possible look at the flip-desk systems that are in AT-5cln. Alastair suggested that he and Ken speak to Gordon and Stuart about the flip desks.
The air-conditioning unit in the Buccleuch Place server room has been leaking condensate. Works Department staff have examined the unit and found nothing unusual, but the condensate drainage pipe was observed to be incorrectly installed: it was running uphill, so the condensate could not drain away along the pipe before its level reached the source of the leak. The installation of the drainage pipe has been corrected and it will be checked regularly to see whether that has solved the problem. It is unclear why leaks are only being noticed several years after the installation. We should avoid putting kit into the Buccleuch Place server room unless it has to be in Buccleuch Place. We should also consider installing another rack in the Appleton Tower server room.
Craig mentioned that the temperature in the Appleton Tower server room had recently been 24°C. A fan that had been installed to extract air away from the area into which the air-conditioning outlet ducts expelled their air had not been connected to the mains electricity supply by the contractor; the Works Department have been informed.
Stephen reported that there had been a couple of problems with the latest kernel. The Nvidia driver had failed to build correctly and had caused problems for machines that required that kernel module. A replacement driver had to be found. This underlined the fact that up until now we have not really been doing enough testing of the kernel and kernel modules prior to shipping them in a stable release. It is the intention to start to do checks on all the kernel modules that we use prior to shipping. Help from other units will be required to check kernel modules that they use.
The cron component resources have been changed so that we will allow all users to use the at command by default.
Alastair has obtained a copy of MacOS 10.5 (due to be released in October). Chris has started on a wiki page on the subject and will announce this soon. Alastair will check the licensing details to see who can install it.
The old LCFG header files that were edited using rfe are no longer being used by any machines. They will be archived off and then deleted.
Research and Teaching.
The ITO are a little concerned about a few issues, especially as they approach the time of Boards of Examiners and the need to generate BOE reports from the School database.
The recent crashes of the School database server are now repeatable and appear to be linked to the axnet interface software again. This will be investigated.
The samba service has been having problems just recently and the Services Unit are looking into this.
The Deputy ITO Administration Manager has been having problems with the eXceed X server on his machine. The X server keeps resetting itself and killing off all the X windows.
Tim has contacted teaching staff to draw their attention to the information we have about required teaching software and to ask for any updates before the unit starts porting teaching software to FC6 in the next few weeks.
Iain has been having problems with the lion Beowulf cluster. When the nodes have been heavily loaded the kernel has been killing off slapd, making it impossible to connect to the node and sort out the situation other than via the console port. Options under consideration, including those the meeting asked to be considered, are using an external LDAP server or a local dbm file for holding user and group information, increasing swap space, and setting a limit on the memory usage of individual jobs via ulimit.
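As an illustration of the per-job memory limit idea, the following is a minimal sketch of a job launcher that applies the same kind of limit that `ulimit -v` sets in a shell. It assumes a Linux node, and the function name is hypothetical; it is not the mechanism the cluster actually uses.

```python
import resource
import subprocess


def run_with_memory_limit(cmd, max_bytes):
    """Run cmd as a child process with its address space capped at max_bytes.

    This sets RLIMIT_AS, the limit behind `ulimit -v`, in the child only,
    so the launching process itself is not restricted.
    """
    def set_limit():
        # Runs in the child between fork and exec.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return subprocess.run(cmd, preexec_fn=set_limit)
```

A job that tries to allocate more than the cap then fails with an allocation error of its own, rather than pushing the node to the point where the kernel starts killing other processes such as slapd.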
Craig reported that the recent problems with the Appleton Tower print server had not been seen since the action taken a fortnight ago. He said that the unit would need to consider whether or not to move to using CUPS instead of LPRng. Stephen mentioned that LPRng had failed to compile on FC6.
IS have now fixed the problem with the support for the + notation used by sendmail. Although the staffmail mail server itself had always supported it, the mail relays had not. However the mail relays do now support this feature.
At 20:10 on Tuesday 17th April there was a power failure and one of the circuits in the Buccleuch Place server room tripped out and did not come back. This was not noticed until early the following morning, and all affected services were brought back by shortly after 09:00. Although pegasus remained up throughout this incident, power had been lost to the fibre channel switch and so access to the fibre-attached disk space was lost for the period of the interruption. Hippocampus had both of its dual power supplies erroneously connected to the same UPS, which was on the tripped circuit, so it shut itself down. Craig has subsequently done a review of the arrangements in the BP server room and corrected this error.
The Buccleuch Place ATABeast hung for an hour on the afternoon of Thursday 19th April. It was necessary to power cycle it; the log file has been forwarded to Nexsan.
We are seeing problems with the samba service, especially the smb.admin server. The problems seem to have started after the most recent stable release came out last Thursday. The unit are investigating the problems. The symptoms are that users can't connect to shares when they log on, and if they are already logged on then those shares are locked up.
Since the last Operational Meeting the User Support Unit had handled 158 new RT tickets (equivalent to about 16 per working day) and resolved 60% of them. There had been a total of 177 tickets (including both new and existing tickets) resolved over the same period.
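The figures above can be cross-checked with a short calculation. This is only a rough sketch: the 60% figure is rounded in the minutes, so the derived counts are approximate.

```python
new_tickets = 158        # new RT tickets since the last meeting
per_working_day = 16     # approximate rate quoted in the minutes
resolved_fraction = 0.60 # share of the new tickets resolved (rounded)
total_resolved = 177     # all tickets resolved, new and existing

# Length of the reporting period implied by the quoted rate.
working_days = new_tickets / per_working_day

# Approximate number of the new tickets that were resolved.
new_resolved = round(new_tickets * resolved_fraction)

# The remainder of the resolved total must have been existing tickets.
existing_resolved = total_resolved - new_resolved
```

On these numbers the period covers about ten working days, roughly 95 of the new tickets were resolved, and roughly 82 of the 177 resolved tickets were ones that already existed.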
In the next few days we intend to start using the scripts for warning users of the intention to delete their old computer account.
Ken has done more work investigating a possible new inventory system. He has made further changes to the item table in the copy of the School database and can now load all the data from the new XML format orders into the database. He has created modified versions of the old custom forms for viewing this data via the TEC GUI.
Alison and Lindsey have added a new custom field in RT for the support queue and pruned out some old unused values from the other fields used by all queues. Ken encouraged other units to consider creating additional custom fields for their queue and to contact Alison about this.
There was a brief discussion about how likely it was that we would be introducing an AFS Windows client for MDP users. It still appears that we are not in a state to do this with any enthusiasm.
All but 6 of the FC5 lab machines have now been rebooted and picked up the new kernel. Almost 90% of all office FC5 machines have also been rebooted and on some sites it is 98% or more. About 85% of the User Support Unit servers had been rebooted.
Ken mentioned that there had been problems using the at4 printer from the MDP machines. Craig said that the database that samba maintained for that printer was corrupted (samba keeps a database for each printer that it is configured to support). Ken asked that documentation on how to diagnose that as a cause of a printing problem and how to remedy the situation should be published so that Front Line Support could fix the problem if it arises again.
There was none.
Unless explicitly stated otherwise, all material is copyright The University of Edinburgh