Authentication issues on DUO61
Incident Report for Duo
Postmortem

Summary

On December 6, 2023, at around 19:37 EST, Duo's Engineering Team was alerted by monitoring that DUO61 was experiencing a network egress outage. The root cause was scheduled maintenance that introduced an unexpected change to the networking configuration.

The issue was resolved by reverting the networking change. Afterward, we observed that customer authentication downtime was minimal.

Deployments Impacted

  • DUO61

Timeline of Events (EST)

2023-12-06 

19:21 Duo Site Reliability Engineering (SRE) completes the scheduled maintenance, which deploys an unexpected networking configuration change

19:22 Duo SRE detects that network traffic is down for DUO61

19:22 Duo SRE starts investigation, hampered by the network changes

20:02 Duo SRE rolls back changes to networking that caused the issue

20:18 Duo SRE starts to see recovery for DUO61

20:25 Duo SRE confirms full recovery in the deployment
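
As a rough illustration of the arithmetic implied by the timeline above, the following Python sketch computes the elapsed minutes between the listed events. Only the timestamps come from the timeline; the event labels and the helper name `minutes_between` are assumptions made for this example.

```python
from datetime import datetime

# Timestamps copied from the timeline above (EST, 2023-12-06).
# The keys are illustrative labels chosen for this sketch.
TIMELINE = {
    "maintenance_completed": "19:21",
    "traffic_drop_detected": "19:22",
    "rollback": "20:02",
    "recovery_started": "20:18",
    "full_recovery": "20:25",
}


def minutes_between(start_key, end_key, timeline=TIMELINE):
    """Return whole minutes elapsed between two same-day timeline events."""
    fmt = "%H:%M"
    start = datetime.strptime(timeline[start_key], fmt)
    end = datetime.strptime(timeline[end_key], fmt)
    return int((end - start).total_seconds() // 60)


if __name__ == "__main__":
    print("Change to detection:", minutes_between("maintenance_completed", "traffic_drop_detected"))
    print("Detection to rollback:", minutes_between("traffic_drop_detected", "rollback"))
    print("Rollback to full recovery:", minutes_between("rollback", "full_recovery"))
    print("Change to full recovery:", minutes_between("maintenance_completed", "full_recovery"))
```

These intervals describe the window of the egress configuration problem, not the length of any customer authentication downtime, which this report describes as minimal.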

Details

Duo SRE performs maintenance work that is not expected to cause downtime during a deployment's off-peak hours. DUO61 had off-peak maintenance scheduled to harden the deployment's security posture by updating our load balancer layer. During the rollout, a configuration error was deployed, causing network egress traffic to drop.

By rolling back the configuration, Duo SRE restored egress traffic from the deployment. We then confirmed that the outage had minimal impact on authentication traffic.

Duo SRE is addressing the bug that caused the incorrect configuration to be rolled out to the egress layer. In addition, Duo SRE is re-architecting the egress layer to be more resilient during these types of events.

Note: You can find your Duo deployment’s ID and sign up for updates via the StatusPage by following the instructions in this knowledge base article.

Posted Dec 07, 2023 - 19:15 EST

Resolved
Following a monitoring period, we have confirmed that the authentication issues impacting DUO61 are now fully resolved.
We will be posting a root-cause analysis (RCA) to this incident as soon as it is available.
Posted Dec 06, 2023 - 21:07 EST
Monitoring
We have applied a fix and are starting to see a recovery for the authentication issues on DUO61.

We are continuing to monitor for full service recovery.
Posted Dec 06, 2023 - 20:30 EST
Investigating
We are currently investigating an issue affecting authentication and Admin Panel logins on DUO61 and are working to correct the issue as soon as possible. Please check back here or subscribe to updates for any changes.
Posted Dec 06, 2023 - 19:55 EST
This incident affected: DUO61 (Core Authentication Service, Admin Panel).