Apologies for absence
There were none.
Minutes of the last meeting
These were accepted.
Report from Computing Executive Group
Alastair apologised for not publishing the backlog of CEG minutes. He will try to do that tomorrow.
Alastair and George had been given a quick 90 minute tour of the Informatics Forum by Mike. The overall impression was that it is a bigger building than one would think but that the server room appeared smaller than Alastair had imagined it to be from the plans. The building however is still very much a building site and is still quite hazardous. George, Dave and Alastair will be going on a safety course that will then allow them to guide others around the site. The partition walls are almost completed in some areas of the building but not in places such as the printer areas and the data communications closets.
George and Alastair will shortly be involved in meetings to discuss the exact location of the furniture on the assumption that the building will be fully occupied. This is so that the raised floor tiles containing the data communications ports and power outlets can be sited correctly relative to the furniture.
The meeting at which Mike will talk to computing staff about how they will be accommodated in the Forum is still to be arranged.
Reports from units.
Infrastructure.
George reported that we should be getting access to level 8 in the Appleton Tower within the next 48 hours or so. Dave Robertson and his group together with Colin Adams and some others from Buccleuch Place would be moving to that floor in the not too distant future (It was decided in another meeting this morning that the move would be on 9th August). Gilbert will be installing all the patch cables between the port terminations and the switches and then updating the switch configuration files. The phase 3 Appleton Tower basement alterations would be going out to tender in August. This covers work on constructing sound studios in the old plant room, construction of the technical workshops and modifications to the IS server room.
The air-conditioning unit in the Buccleuch Place server room appears to be functioning correctly now after a scavenger pump in the base of the unit had been replaced by a different and, reputedly, more reliable model of pump.
Toby will be starting work again on the LDAP caching project which had been stalled a few months ago because of the pressure of other projects. It is hoped that caching will provide a more robust LDAP service. It will still be possible to have individual machines running their own LDAP servers where this appears to be preferable. It would not be possible to send all LDAP queries directly to our LDAP master and slave servers as there are authentication issues and the load would be too great.
The master LDAP server franklin has had a disk in its RAID array giving warning messages for some time. Dell will not swap out the disk until certain standard tests are run. Alastair said he intended to contact Dell directly and discuss this matter in general with them, in order to try to clarify how such situations are handled in the future. In the meantime we will use another similar spare machine to run the tests on the disk (after installing the suspect disk in that machine).
Simon is about to do alpha and beta testing of his monitoring software. He would like anyone who has candidate machines and components to test to contact him.
The new Dell rack in the Appleton Tower server room is now ready to be populated with servers. We now have a supply of the cables for use between server console ports and the Lantronix box. Anybody wishing to install a machine in this rack should contact Ian. Any new servers should be installed in the lowest available space above the console server (leaving a 1U space between adjacent 1U height servers). We are still awaiting the arrival of a 10Gbit link to connect the switch in this rack back to a switch in the communications rack. In the meantime there is a 100Mbps backup link, and George will add a temporary 1Gbps link.
The load and outlet status of the new networked advanced power distribution block that has been installed in this rack can be viewed over the web at http://netmonat.inf.ed.ac.uk/cgi-bin/apc-pdu-status.cgi.
Everyone who installs kit in the AT server room should ensure that they place any unneeded packaging etc at the recognised locations for disposal of rubbish so that the cleaners can remove it.
Managed Platform.
Stephen reported that the unit had acquired two new servers to replace the LCFG slave servers. The new machines will be installed in the Appleton Tower and JCMB server rooms. The old slave server hardware will be reused in a less critical role. They will be using slim for doing LCFG testing. The newer and faster slave LCFG servers should be capable of rebuilding profiles more quickly than the current servers can.
They are planning to upgrade to a newer patched 2.6.20 kernel on FC6 machines on Thursday 9th August. Ken needs to confirm that they can ship the automatic reboot of machines framework to FC6 machines (currently only available for FC5).
As a consequence of the fact that Fedora no longer support FC5 the unit are looking at how to track FC5 security bug alerts and how to then handle them. In some cases it may be necessary to upgrade some exposed servers (such as login servers and web servers) to FC6.
Chris and staff from EPCC are now starting work on porting LCFG to Scientific Linux 5. A Scientific Linux 5 project proposal is being brought to the Development Meeting next week.
Amazingly, auto-detection of the HP L1740 flat panel monitors under FC6 now appears to be working: Stephen was unable to find a single example of the previously reported fault. A recent change in what is being shipped by Fedora has presumably fixed the problem.
Research and Teaching.
Tim reported that they were about to ship Java 1.5 with FC6 in parallel with Java 1.6; the latter is however still the default.
Iain has packaged up firefox 2 to ship with FC6 (it solves the printing problems people have noticed with the default version of firefox shipped with FC6, firefox 1.5.10). He had hoped to be able to use the firefox component that IS produced, but it doesn't work; this may, however, just be a firefox 2 versus firefox 1.5 issue. Iain should, in any case, go ahead and ship firefox 2 for our FC6 machines.
A disk has failed in one of the Dell PE650 servers that the School had purchased for AIAI but which is now out of warranty. Alastair said that we should just go ahead and replace the disk.
Iain has had some more cluster problems. Someone had run a large memory job on the head node and corrupted the GridEngine database. Alastair commented that we were probably using too little swap space on machines. Running out of swap space was a far more likely cause of problems than running out of memory which, in theory, should just cause timing problems at the worst.
Last week Craig and Tim experienced some very weird behaviour of the afs client on one of the machines. Some files in a directory were readable by Tim whilst attempts to read others in the same directory caused permission errors. Flushing the afs client's cache cured the problem. Simon has asked that the contents of the cache be preserved if someone comes across this again so that it can be debugged. Stephen suggested that the instructions for dumping the cache be published so that we can dump the cache ourselves if Simon is not available.
Services.
Craig reported that recently there had been delays in the delivery of mail to the IS mail server. On checking with IS it turns out that this is a side-effect of an ongoing denial-of-service attack on the University mail relays. At the next meeting of CCPAG (College Computing Professionals Advisory Group), Alastair will raise the fact that IS had not actively passed this information on to School computing staff.
The disk containing mail folders on nutty is once again almost full. The unit will be installing an extra disk on a temporary basis and will move mailboxes associated with old accounts onto this additional (non-RAIDed) disk.
User Support.
Since the last Operational Meeting the User Support Unit had handled 149 new RT tickets (equivalent to about 15 per working day) and resolved 58% of them. There had been a total of 144 tickets (including both new and existing tickets) resolved over the same period.
Ken reported some figures showing resolution rates against age of recent RT tickets:
Ticket Age D | Percentage Resolved (or rejected) |
---|---|
1 week < D < 2 weeks | 71% |
2 weeks < D < 1 month | 80% |
1 month < D < 3 months | 92% |
The total number of accounts (not including temporary ones) which have an AFS home directory is now 168 (an increase of 15 in the last fortnight).
Ken would be updating some of the account management tools such as accgen during August so that we can efficiently create large numbers of accounts with AFS directories.
There are approximately 600 office machines (not including computing staff machines but including spares) to be upgraded to FC6 or replaced by new machines installed with FC6. We are planning to have about 80% of these done by the start of the next semester (round about 10th September). This implies that we need to install/upgrade about 480 office machines in just under 7 weeks which is at an average rate of between 70 and 80 machines per week. We have so far installed about 18 office machines for non-computing staff/students and about 10 spares. It is clear that we will have to ramp up this work in order to meet the target.
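The arithmetic behind the weekly target can be checked with a minimal sketch. The function name and its parameters are hypothetical, introduced only for illustration; the figures (600 machines, 80%, about 28 already installed, just under 7 weeks) are those quoted above, and the 6-week and 6.5-week values are assumed stand-ins for "just under 7 weeks".

```python
def required_weekly_rate(total_machines, target_fraction, already_done, weeks_left):
    """Return (target, remaining, machines_per_week) for an upgrade campaign.

    Hypothetical helper for illustration only; not part of any existing tool.
    """
    target = round(total_machines * target_fraction)
    remaining = max(target - already_done, 0)
    return target, remaining, remaining / weeks_left


# Figures from the minutes: 600 machines, 80% target, none counted as done.
target, remaining, rate_7_weeks = required_weekly_rate(600, 0.80, 0, 7)

# "Just under 7 weeks" is not an exact figure; 6 weeks is an assumed lower bound.
_, _, rate_6_weeks = required_weekly_rate(600, 0.80, 0, 6)

# Variant that subtracts the roughly 28 machines (18 office + 10 spares)
# already installed, with an assumed 6.5 weeks remaining.
_, remaining_after_done, rate_adjusted = required_weekly_rate(600, 0.80, 28, 6.5)

print(f"target={target}, rate over 7 weeks={rate_7_weeks:.1f}, "
      f"rate over 6 weeks={rate_6_weeks:.1f}, adjusted rate={rate_adjusted:.1f}")
```

With anything between 6 and 7 weeks available, the required rate falls roughly in the 70 to 80 machines per week range quoted above; subtracting the machines already installed lowers it only slightly.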
AOCB
There was none.
Please contact us with any comments or corrections.
Unless explicitly stated otherwise, all material is copyright The University of Edinburgh