Incident History

Disruption with Copilot Coding Agent sessions

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

1773885959 - 1773888764 Resolved

Disruption with some GitHub services

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

1773873375 - 1773884641 Resolved

Webhook delivery is delayed

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

1773859860 - 1773863198 Resolved

Incident With Webhooks

On March 10, 2026, between 23:00 UTC and 23:40 UTC, the Webhooks service was degraded and ~6% of users experienced intermittent errors when accessing webhook delivery history, retrying webhook deliveries, and listing webhooks via the UI and API. Approximately 0.37% of requests resulted in errors overall, with a peak error rate of 0.5%.

This was due to unhealthy infrastructure. We mitigated the incident by redeploying affected services, after which service health returned to normal.

We are working to improve detection of unhealthy infrastructure and strengthen service safeguards to reduce time to detect and mitigate similar issues in the future.

1773849889 - 1773849889 Resolved

Errors starting and connecting to Codespaces

On March 16, 2026, between 14:16 UTC and 15:18 UTC, Codespaces users encountered a download failure error message when starting newly created or resumed codespaces. At peak, 96% of created or resumed codespaces were impacted. Active codespaces with a running VS Code environment were not affected. The error was the result of an API deployment issue in our VS Code remote experience dependency and was resolved by rolling back that deployment. We are working with our partners to reduce incident engagement time, improve early detection of such issues before they impact customers, and ensure safe rollout of similar changes in the future.

1773673266 - 1773674902 Resolved

Degraded performance for various services

On March 13, 2026, between 13:35 UTC and 16:02 UTC, a configuration change to an internal authorization service reduced its processing capacity below what was needed during peak traffic. This caused intermittent timeouts when other GitHub services checked user permissions, resulting in four to five waves of errors over roughly two hours and forty minutes. In total, 0.4% of users were denied access to actions they were authorized to perform.

The root cause was a resource right-sizing change deployed to the authorization service the previous day. It reduced CPU allocation below what was required at peak, causing the service's network gateway to throttle under load. Because the change was deployed after peak traffic on March 12, the reduced capacity wasn't surfaced until the next day's peak. The incident was mitigated by manually scaling up the authorization service and reverting the configuration change.

To prevent recurrence, we are adding further resource utilization monitors across our entire stack to detect throttling, and improving error handling so transient infrastructure timeouts are distinguished from authorization failures, enabling quicker detection of the root issue.
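The error-handling improvement mentioned above can be sketched as follows. This is a minimal illustration, not GitHub's actual code: the client, exception types, and function names are assumptions. The point is that a timeout from the authorization service says nothing about the user's permissions, so it should surface as a retryable infrastructure error rather than a denial.

```python
class UpstreamUnavailable(Exception):
    """Transient infrastructure failure; safe to retry."""

class PermissionDenied(Exception):
    """The user genuinely lacks access to the requested action."""

def check_permission(authz_client, user, action):
    """Ask the (hypothetical) authorization service whether `user`
    may perform `action`, distinguishing timeouts from denials."""
    try:
        allowed = authz_client.is_authorized(user, action)
    except TimeoutError:
        # A timeout is an infrastructure problem, not a verdict on the
        # user's permissions: surface it as a 5xx-style error instead
        # of incorrectly denying access.
        raise UpstreamUnavailable("authorization service timed out")
    if not allowed:
        raise PermissionDenied(f"{user} may not {action}")
    return True
```

With this split, callers can retry `UpstreamUnavailable` errors and alerting can count them separately from genuine permission denials, which is what makes the root issue quicker to spot.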

1773414768 - 1773418533 Resolved

Degraded Codespaces experience

On March 12, 2026, between 01:00 UTC and 18:53 UTC, users saw failures downloading extensions within newly created or resumed codespaces, and would see an error when attempting to use an extension within VS Code. Active codespaces with extensions already downloaded were not impacted. The extension download failures were the result of a change introduced in our extension dependency and were resolved by updating the configuration of how those changes affect requests from Codespaces. We are enhancing observability and alerting for critical issues within regular codespace operations to better detect and mitigate similar issues in the future.

1773320815 - 1773341613 Resolved

Actions download failures (401 Unauthorized)

On March 12, 2026, between 02:30 and 06:02 UTC, some GitHub Apps were unable to mint server-to-server tokens, resulting in 401 Unauthorized errors. During the outage window, ~1.3% of requests incorrectly returned 401 errors. This manifested as GitHub Actions jobs failing to download tarballs and failing to mint fine-grained tokens; approximately 5% of Actions jobs were impacted.

The root cause was a failure in the authentication service's token cache layer, a newly created secondary cache backed by Redis. Kubernetes control plane instability left the service unable to read certain tokens, which resulted in 401 errors. The incident was mitigated by falling back reads to the primary cache layer, which is backed by MySQL.

As permanent mitigations, we have changed how we deploy Redis so that it does not rely on the Kubernetes control plane and maintains service availability during similar failure modes. We also improved alerting to reduce overall impact time from similar failures.
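The fallback-read pattern used in the mitigation can be sketched as below. All names here are illustrative assumptions, not GitHub's implementation: the idea is simply that a read should never fail solely because the secondary cache layer is unhealthy, since the primary store remains the source of record.

```python
class CacheUnavailable(Exception):
    """Raised when a cache layer cannot serve a read."""

def read_token(key, secondary, primary):
    """Return the token for `key`, preferring the secondary cache
    (e.g. Redis) but falling back to the primary layer (e.g. a
    MySQL-backed store) when the secondary is down or misses."""
    try:
        value = secondary.get(key)
        if value is not None:
            return value
    except CacheUnavailable:
        pass  # secondary layer unhealthy; fall through to primary
    # Primary layer is the source of record, so this read is authoritative.
    return primary.get(key)

class BrokenCache:
    """Simulates a Redis layer broken by control plane instability."""
    def get(self, key):
        raise CacheUnavailable("control plane instability")

print(read_token("app-42", BrokenCache(), {"app-42": "token-abc"}))  # token-abc
```

Serving the read from the primary store keeps tokens resolvable (avoiding the spurious 401s) at the cost of the secondary cache's latency benefit during the outage.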

1773290780 - 1773295327 Resolved

Disruption with some GitHub services

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

1773280458 - 1773283555 Resolved

Incident with API Requests

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

1773239851 - 1773241343 Resolved