All Deployments - Entra 504 Gateway Timeouts

Incident Report for Duo

Postmortem

Summary

On February 23, 2026, US-based users of the Duo integration with Azure Identity management experienced increased authentication failures and timeouts. The errors occurred specifically when accessing applications that use Microsoft Entra with Duo as the MFA provider. Monitoring alerted Duo engineering to the issue at 9:15 AM EST, and the team promptly began investigating possible causes, including the health of the third-party services involved. The root cause was ultimately identified as increased CPU usage resulting from a code change that added validation for security operations.

The incident was remediated by scaling the backend services that support this integration.

During the incident, updates were posted to the status page as new information became available. At the start of the incident, its impact was underestimated, so notifications were not sent with these status updates. When the impact assessment changed, notifications should have been enabled, but this step was missed. Processes have since been improved to help prevent a recurrence.

Timeline of Events

Time (EST), 2/23/2026   Event
9:15 am    Engineering begins investigating AzureAuth errors.
9:27 am    Duo begins receiving reports of 504 gateway timeout errors from customers accessing applications using Microsoft Entra and Duo.
10:02 am   Duo posts the first status page update, stating that the root cause is under investigation. The update is posted without notifications because of the assumed scale of the incident.
11:10 am   Duo scales resources for the AzureAuth US deployment.
11:19 am   Errors begin to subside.
11:24 am   The incident is moved to monitoring.
11:30 am   The incident is resolved.

Root Cause(s)

The Duo team follows a regular and disciplined release process across all services. Changes are routinely deployed to improve functionality, security posture, and overall reliability.

All releases undergo comprehensive validation, including:

  • Unit testing
  • Integration testing
  • Frontend testing
  • Performance testing

This layered testing strategy is designed to identify functional defects, regressions, and performance impacts prior to production deployment.

In the days leading up to the incident, a software update was deployed to the backend service that supports the Duo integration with Azure Identity management. This update was part of a broader security improvement effort. The change successfully passed all standard validation processes and was not expected to introduce downtime or performance degradation.

During the Monday morning traffic peak, this backend service experienced elevated CPU utilization, resulting in service degradation. The following points all contributed to the service interruption:

  • Increased CPU Utilization Per Request

    • The deployed change approximately doubled CPU consumption per request within this service.
    • This materially increased overall compute demand during peak traffic.
  • Performance Test Coverage Gaps

    • Existing performance tests did not identify the increased CPU usage introduced by the change.
    • The specific workload characteristics that amplified CPU consumption under peak conditions were not fully represented in pre-production testing.
  • Monitoring Gaps

    • A recent migration of our monitoring services omitted the CPU monitors for this service, leaving an alerting gap as CPU consumption began trending upward.
    • Alerts were triggered only after peak load conditions were reached.
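
To illustrate why a doubling of per-request CPU cost can exhaust capacity at peak, the arithmetic can be sketched as follows. This is a minimal sketch under stated assumptions: the request rate, CPU seconds per request, core count, and target utilization are hypothetical figures chosen for illustration, not values from the incident, and the function name instances_needed is our own.

```python
import math


def instances_needed(peak_rps, cpu_seconds_per_request,
                     cores_per_instance, target_utilization):
    """Estimate how many instances serve peak traffic at a target CPU utilization."""
    # CPU-seconds demanded per wall-clock second equals the number of busy cores.
    busy_cores = peak_rps * cpu_seconds_per_request
    # Only a fraction of each instance's cores is budgeted, leaving headroom.
    usable_cores = cores_per_instance * target_utilization
    return math.ceil(busy_cores / usable_cores)


# Hypothetical figures, for illustration only.
before = instances_needed(peak_rps=2000, cpu_seconds_per_request=0.010,
                          cores_per_instance=4, target_utilization=0.6)
after = instances_needed(peak_rps=2000, cpu_seconds_per_request=0.020,
                         cores_per_instance=4, target_utilization=0.6)
print(before, after)  # prints: 9 17
```

With these assumed numbers, a fleet sized for the old per-request cost would need roughly twice as many instances to serve the same peak, which is consistent with elevated CPU utilization appearing only once traffic reached its Monday morning peak.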

Remediation and Future Work

To address each contributing cause of the incident, Duo has committed to the following:

  • Addressing Increased CPU Utilization Per Request

    • This backend service has been scaled up (Completed)

      • Resources have been added to the infrastructure powering this backend service to provide additional compute headroom.
      • This ensures sufficient capacity to absorb higher per-request CPU costs.
    • Connection Efficiency Improvements (In Progress)

      • Investigate using persistent connection pooling and/or a caching mechanism to reduce connection overhead and lower per-request CPU consumption.
    • Graceful Degradation Enhancements (In Progress)

      • Improvements are being explored to ensure systems degrade gracefully under high CPU load, with the goal of protecting critical authentication flows and reducing user impact during resource saturation events.
  • Addressing Performance Testing Coverage Gaps

    • Expanded Performance Test Coverage (In Progress)

      • Performance testing will be updated to explicitly model this failure scenario.
      • Peak-load and stress conditions will be more accurately simulated.
    • Performance-Based Release Gates (In Progress)

      • Performance tests will become release-gating criteria.
      • Changes that materially impact resource consumption will be identified before production deployment.
  • Addressing Monitoring Gaps

    • Capacity Monitoring Enhancement (Completed)

      • Missing capacity monitors (inadvertently omitted during a recent migration) have been restored.
      • Monitoring coverage for this service has been expanded.
    • Comprehensive Monitoring Audit (Completed)

      • A full audit of all capacity monitors is underway to ensure complete monitoring coverage and eliminate similar gaps across services.
  • Customer Communication Improvements

    • Improved Status Page Notifications

      • Status page processes have been updated to ensure faster and clearer communication.
      • Duo is committed to timely, transparent, and regular updates, with notifications, to all customers during future incidents.
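
The performance-based release gate listed above could take a form like the following. This is a minimal sketch, not Duo's actual tooling: the function names, the 1.2x threshold, and the sample measurements are all hypothetical, and a real gate would draw its samples from a load test that models peak traffic.

```python
from statistics import mean


def cpu_regression_ratio(baseline_ms, candidate_ms):
    """Return the candidate's mean CPU time per request relative to the baseline's."""
    return mean(candidate_ms) / mean(baseline_ms)


def release_gate(baseline_ms, candidate_ms, max_ratio=1.2):
    """Pass only if the candidate's CPU cost stays within max_ratio of the baseline.

    max_ratio is an assumed threshold; a real gate would tune it per service.
    """
    ratio = cpu_regression_ratio(baseline_ms, candidate_ms)
    return ratio <= max_ratio, ratio


# Hypothetical CPU milliseconds per request from a load test of each build.
baseline = [10.0, 11.0, 9.0]
candidate = [20.0, 21.0, 19.0]

passed, ratio = release_gate(baseline, candidate)
print(passed, ratio)  # prints: False 2.0
```

Under these assumed measurements, a change that doubles per-request CPU cost fails the gate before it reaches production, rather than surfacing at the next traffic peak.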
Posted Feb 25, 2026 - 10:04 EST

Resolved

The issue causing 504 gateway timeouts during Microsoft Entra logins when Duo MFA was called has been fully resolved. Authentication services are operating normally at this time.

We will share a root cause analysis once it becomes available.
Posted Feb 23, 2026 - 12:52 EST

Monitoring

We have deployed a fix for the gateway timeouts and are now seeing successful authentications. We are currently monitoring for any further issues.
Posted Feb 23, 2026 - 11:38 EST

Update

We are continuing to investigate an issue causing 504 Gateway Timeout errors when users access Entra and Duo is called for MFA. We are working with our partners to resolve it as quickly as possible.
Posted Feb 23, 2026 - 11:20 EST

Investigating

We are currently investigating an issue causing 504 Gateway Timeout errors when users access Entra and Duo is called for MFA. We are working to resolve it as quickly as possible.
Posted Feb 23, 2026 - 10:04 EST
This incident affected the Core Authentication Service on the following deployments: DUO1, DUO2, DUO3, DUO4, DUO5, DUO7, DUO8, DUO47, DUO10, DUO11, DUO12, DUO13, DUO14, DUO15, DUO16, DUO17, DUO18, DUO19, DUO20, DUO21, DUO22, DUO23, DUO24, DUO25, DUO26, DUO27, DUO28, DUO29, DUO30, DUO31, DUO32, DUO33, DUO34, DUO36, DUO37, DUO38, DUO39, DUO40, DUO41, DUO42, DUO43, DUO44, DUO45, DUO46, DUO48, DUO9, DUO49, DUO50, DUO51, DUO52, DUO53, DUO54, DUO55, DUO56, DUO57, DUO58, DUO59, DUO60, DUO62, DUO63, DUO65, DUO66, DUO67, DUO68, DUO69, DUO70, DUO71, DUO72, DUO73, DUO74, DUO75, DUO76, DUO77, DUO78, DUO79, DUO80, DUO81, DUO6, and DUO35.