IDC Event Index
Here we record day-to-day events and act as an index to other articles.
- 9 April AC unit still playing up. It needs some parts, which is a little tricky during lockdown. The overall room environment is not impacted with this unit out for service.
- 7 April One of the AC units is playing up. It has been serviced.
- 26 March Some connectivity interrupted due to upstream DNS issues. These have now been resolved.
- 10 March Annual UPS checkout. All tested through okay.
- 9 March After a quiet February we have an issue for some e-mail users, who are reporting certificate errors. We are looking into this now; it looks like some further updates need to be applied. Sorted later in the day. We are reviewing the events surrounding this to prevent it happening again.
- 20 January 7:55 Multiple service events underway. We will have a full page ready when we can. It is an active situation which several staff and suppliers are managing. Click here for details.
- 6 January Today's lesson is how hard the wind can blow. Power supply to campus has been knocked about as a result, so IDC has been on and off generator throughout the day.
- 31 December Arghh! Some voice services upstream from us have flown to bits. We are awaiting complete updates but it appears something broke and engineers are going to site (a datacentre) with spares to carry out repairs.
- 20 December Major electrical work planned for IDC. Click here for details.
- 8 December National transit fibres were damaged following flooding south of Ashburton. Click here to read more.
- 11 November Annual backup generator check time, more involved than the monthly checks. Click here to read about what was involved.
- 15 October Power supply to campus cut. Backup systems did their job and all operations continued without issue. Supply returned after approximately 40 minutes. Lines company later reported a fault had caused supply interruption.
- 25 September As referred to in previous entry, we have been carrying out some firewall work at IDC. One of the firewalls is not delivering the throughput expected. We are working through this to get it to the expected performance level. As we operate dual firewalls, we can rapidly change between them as required. This frees the 'idle' one for maintenance.
- 24 September E-Mail services were interrupted for a couple of hours. And, hands up - our fault. We applied some configuration changes which affected e-mail handling. It shouldn't have, but it did. At the same time a rack-roll process was underway and we were switching firewalls (yes, a few things going on). Initially we thought the issues were firewall-related, so investigated there first. Further troubleshooting isolated the issue and led back to the cause.
Our apologies; it was a case of simple human error. Once found, we fixed the issue. We then contacted clients who had been in touch to confirm all was well.
- 23 September Annual inspection of Server Room UPS by engineer. Clean bill of health with only routine items noted.
- 7 September A vulnerability was discovered in the e-mail server software we use. Fixes were produced and distributed. We updated our servers within an hour of the fixes being made available.
- 3 September We have experienced excessive e-mail traffic lately: spam. This was traced back to a compromised mailbox. Mitigation took a couple of hours to cover all the issues that arose. We urge that passwords be strong and hard to break.
We have reviewed the events and are adding further alerting. We are also putting together wider monitoring to measure e-mail traffic flow. This will be drawn upon to help identify pattern changes which may signal a developing event.
- 29 August 0830 - 1010 Call services were impacted following what appears to have been a fault upstream. Multiple providers were affected, Earthlight among them. It appears to have settled and we are monitoring. We will post further details as we get them.
Update 9 September - An upstream provider (two providers removed from us) has said they experienced a significantly increased volume of network traffic. They say they have implemented additional monitoring in case this kind of event occurs again.
- 19 July Pat on back time, with behind-the-scenes work completed. Click here for details.
- 15 July A brief bump upstream from us around 0530 resulted in temporary loss of connectivity. It was restored 15 minutes later. Planned maintenance is scheduled between 0500 and 0600 on Tuesday 16 July. The outage itself will be 10 - 15 minutes while equipment is changed over.
- 5 July Planned maintenance work for air conditioning in the Server Room has been completed. New units are in and running along with additional humidifier assets to control the environment.
- 2 July Upgrade work on air conditioning in the Server Room continues. An old unit has been retired and one of the two replacement wall units has been brought online.
- 30 June A handful of connections were 'wedged' (technical term) following upstream maintenance overnight. Our monitoring sensed an issue. We traced the problem and remedied it from there.
- 21 June A handful of sites dropped offline for 15 minutes, 12:45 - 13:00. Root cause remains unknown after checking everything within our realm. We are continuing to monitor.
Maintenance planned for Saturday evening.
- 15 June Power to the campus went off for a time. Backup kicked in and everything remained online and protected. Things returned to normal after approximately three hours, once supply was stable.
- 16 May Planned overnight maintenance was completed without issue. Air conditioning maintenance work is proceeding. One unit has been replaced. Work is underway to prepare for another to be retired.
- 10 May Reminder for network maintenance planned later this week. Some maintenance work is also planned to replace two air conditioning units in the Server Room. This will take place over the next week to ten days.
- 9 May Planned maintenance coming up on 15 May. Click here for details.
- 15 April Scheduled review of air conditioning systems. A couple of issues were identified, so further maintenance will be carried out to head off developing problems.
- 13 April A brief power supply outage to the entire campus, lasting about 15 seconds, on Saturday morning. Protection systems all worked as designed: UPS protected systems, the generator kicked in, and the site kept running. Once supply was restored and stable, failback was carried out.
- 9 April Chorus are rolling out a series of updates across the network. This affects basically all end connections. The work is being done in the early hours of the morning, any time between midnight and 06:00. Your connection may go offline for up to 20 minutes while the work takes place. In most cases, once the work has been done your connection will simply return online.
- 5 April Some server maintenance carried out overnight. Click here for details.
- 26 March Core network upgrades to be performed overnight. Click here for details.
- 24 March Service issues detected. Resolved in conjunction with upstream providers. Click here for details.
- 18 March Scheduled maintenance planned. Click here for details.
- 14 March An authentication issue arose for a handful of connections. Once we became aware of the issue we were able to isolate the problem and remedy it. We will review the events surrounding this to prevent it happening again in the future.
Our review identified the cause: a badly formatted entry had got into one of the authentication areas. This was effectively a typo. Once identified, the entry was corrected and all authentication returned to normal. Going forward we will continue to run checks after changes have been made to confirm correct system operation.
- 11 March An issue was detected by the UPS concerning the power supply to campus. While not a cut in power, the voltage and frequency were outside of tolerance. Accordingly, the system moved off mains supply and started the generator. Changeover was automatic and staff were notified as it proceeded. There was zero interruption to operations. Change-back was about ten minutes later, once incoming supply from the grid was deemed stable.
- 17 December Brief power outage in supply to campus. The UPS provided interim protection while the generator fired up to cover supply. While the outage was under 5 seconds, it took about 20 minutes for things to stabilise 100%. During this time the UPS monitored the mains supply to confirm it was stable before moving back and shutting the generator down. All systems worked as intended with zero service interruption.
- 29 November Annual oil change and check of the genset for IDC is being scheduled. It will be done when the likelihood of supply issues is minimal, e.g. one fine, calm day.
- 21 November One of our storage servers became unavailable, which affected e-mail for a number of clients for about 45 minutes between 9:45 and 10:30. We changed over to another storage server which acts as a backup. Our apologies for any inconvenience while the changeover took place. During this time incoming e-mail continued to arrive via our Christchurch operations. We have reviewed the server which failed and have not been able to isolate a root cause. We had already planned to migrate off that server and will now make this a higher priority in case the root cause was hardware-related.
- 20 November Overnight stormy weather resulted in the power supply to IDC flapping. UPS protection did its job and the generator cranked up as well. Very wet day - staff have left site early to make it back into town.
- 17 November Continued tidying of cabling following SUPERFLICK going in.
Secondary firewall moved across to SUPERFLICK.
- 13 November Installation of SUPERFLICK. Click here for details.