Dear Dediserve London users,
You may have noticed that service was impacted on Saturday 26th of January, for which we sincerely apologize. Below is the formal RFO for the incident. Please rest assured we are taking all reasonable measures to make sure this does not re-occur. Thank you for your understanding.
What was the cause?:
A faulty PDU caused damage to core switching, interrupting connectivity to VMs routed through that stack.
What was the fix?:
The faulty PDU and core switch stack had to be replaced. Replacing the PDU and re-configuring the switch stack was time-intensive and contributed to the length of the downtime.
What was the impact?:
A notable number of VMs lost connectivity whilst the switch stack and PDU were replaced. A number also required a reboot to clear their ARP cache and restore connectivity following the hardware change.
Will the issue re-occur?:
There should be no reason for this issue to re-occur. We are performing staggered upgrades to power equipment at this site to better protect against any future occurrence.
As always, if you have any questions, please let us know.