Summary:
From 3:05 to 3:22 UTC on March 2nd, 2019, the DUO55 deployment experienced performance degradation that resulted in increased authentication latency. The majority of authentications during this time were affected. The root cause of this outage has been identified and Duo’s engineering team is committed to reducing the overall impact of similar events going forward.
Details:
The DUO55 database tier is composed of multiple database clusters. Each cluster is composed of multiple sets of database hardware in several separate locations, configured in an active/standby setup.
At 3:05 UTC, one of DUO55’s active databases failed, preventing interactions with data contained within and causing authentication failures. Automated failover processes began configuring standby hardware to replace the failed database.
At 3:22 UTC, the failover was complete and service was completely restored. We will continue to explore additional infrastructure and software improvements to reduce failover time and customer impact moving forward.