Archive for October 10th, 2017

Incident: cardiff.core

Posted: Tuesday, October 10th, 2017 at 18:08 by David Derrick

We have lost connectivity to our Cardiff router. Initial investigation suggests it is rebooting. We will continue to investigate and update when we know more.

  1. David Derrick says:

    The router has finished reloading and services are restored. We will continue to investigate the cause.

  2. Richard Partridge says:

    The cardiff.core router appears to have sustained another unexpected reload. Engineers are currently investigating. Further information will follow shortly.

  3. Richard Partridge says:

    Engineers are being dispatched to Cardiff to continue the investigation. Due to the number of recent failures, we also plan to bring forward the replacement of the core router, originally scheduled for next week, and carry it out whilst on site.

    Once engineers are on site and have assessed the situation, a notification outlining an emergency maintenance window will be published. The site should be considered at-risk until otherwise notified.

  4. Richard Partridge says:

    Engineers are on site and have concluded their initial investigation. Based on our findings we are currently proceeding with our plan to replace the core router this evening. A downtime window will be announced in due course.

  5. Richard Partridge says:

    We are planning to replace the Cardiff core router between 18:00 and 20:00 this evening. During the window, the core itself should be back online within 30 minutes; however, individual services will take longer to restore as they are migrated in turn from one core to the other.

  6. Jake Turner says:

    Further investigation on site within the maintenance window highlighted a fault with the power infrastructure that will be addressed by our colocation provider. We have taken measures to prevent further outages in the interim. As a result, the core router was not swapped.