Quote:
Originally Posted by photon
It could have been a change, just one with a really bad payload. Then when the BGP update starts knocking all the systems offline you don't have access anymore to be able to roll back. And maybe the people with physical access didn't have the necessary level of access to roll back the changes, or something else with their design prevented an easy rollback in that specific failure mode.
I've had to write RCAs for failures that were a perfect storm of unusual circumstances before, though usually isolated to a single system. I agree, and would love to know the details; I assume that we'll get some level of explanation at some point.
|
Beyond an internal Rogers RCA, I imagine there’ll be clients with muscle (enterprise, etc.) demanding to see it. Really, given the scale, the public should be privy to it too.