Incident History

Elevated error rates across multiple services

On May 28, 2026, between 19:07 UTC and 19:16 UTC, multiple GitHub services experienced elevated error rates. This was due to a change that was partially deployed to an authentication service, causing errors for dependent services including the web experience, REST API, Git operations, and GitHub Actions. At peak impact, 10% of GitHub Actions runs failed to queue or encountered errors while downloading actions. We mitigated the incident by rolling back the change.

We are expanding test coverage and improving our deployment validation process to prevent recurrence.

1780009836 - 1780009836 Resolved

Disruption with OpenAI Models

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

1779994860 - 1780000918 Resolved

Incident with Copilot

On May 19, 2026, between 05:30 UTC and 14:50 UTC, some Copilot users experienced failures when using code completions, chat sessions, and cloud agent sessions. At peak impact, approximately 13% of Copilot API requests failed, and approximately 24% of remote sessions failed to initialize. A partial mitigation at 08:16 UTC reduced the Copilot API error rate to approximately 0.3%, but intermittent failures persisted until a full fix was deployed at 14:15 UTC and recovery was verified by 14:50 UTC.

The incident was caused by rate limits being exceeded on a shared infrastructure component. A recently enabled feature increased call volume to this component, and the combined load exceeded capacity limits as traffic increased during business hours.

We mitigated the incident by deploying a caching layer to reduce load on shared infrastructure. To prevent recurrence, we are separating rate limit scopes between services, adding monitoring for internal dependency rate limiting, and reducing redundant calls.
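As an illustration of what separating rate limit scopes can look like, the following is a minimal sketch and not GitHub's implementation. It assumes a simple fixed-window limiter; the class name, the scope names, and the limits are all hypothetical. The point it shows is that when each caller has its own budget, a new feature that increases call volume exhausts only its own scope.

```python
import time
from collections import defaultdict


class ScopedRateLimiter:
    """Fixed-window limiter that keeps a separate budget per caller scope.

    Hypothetical sketch: scope names and limits are illustrative only.
    """

    def __init__(self, limits, window_seconds=1.0, clock=time.monotonic):
        self.limits = dict(limits)            # scope name -> max calls per window
        self.window_seconds = window_seconds
        self.clock = clock                    # injectable so tests are deterministic
        self._window_start = {}
        self._counts = defaultdict(int)

    def allow(self, scope):
        """Return True if `scope` still has budget in its current window.

        Raises KeyError for a scope that has no configured limit.
        """
        now = self.clock()
        start = self._window_start.get(scope)
        if start is None or now - start >= self.window_seconds:
            # Start a fresh window for this scope only.
            self._window_start[scope] = now
            self._counts[scope] = 0
        if self._counts[scope] >= self.limits[scope]:
            return False
        self._counts[scope] += 1
        return True


limiter = ScopedRateLimiter({"completions": 2, "new_feature": 1})
limiter.allow("new_feature")   # consumes the new feature's only slot
limiter.allow("new_feature")   # rejected: that scope is exhausted
limiter.allow("completions")   # still allowed: separate budget
```

With a single shared scope, the second caller's traffic would have consumed the budget that the first caller depends on, which is the failure pattern described above.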

1779972000 - 1779972000 Resolved

Webhook APIs and UI Degraded

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

1779930811 - 1779931947 Resolved

Incident with Pull Requests, Issues, Git Operations and API Requests

On May 27, 2026, between 12:07 UTC and 13:16 UTC, users experienced degraded performance for Git operations, Pull Requests, Issues, GraphQL API, and related services on github.com. During this time, operations that depended on Git file servers experienced elevated error rates (3.5% of pushes via HTTPS and 0.2% of pushes via SSH failed; no fetches/clones failed). An internal analytics component generated unexpectedly high load, which caused CPU saturation on the underlying infrastructure. This led to cascading slowdowns and errors across services that depend on Git operations. The issue was mitigated by stopping the offending component. Services began recovering shortly after mitigation and were fully restored by 13:16 UTC. We are taking steps to add resource limits and kill switches for internal analytics components to prevent similar issues in the future.

1779883810 - 1779887814 Resolved

Disruption with some GitHub services

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

1779810262 - 1779813357 Resolved

Incident with Actions and Pages

On May 26, 2026, between 10:40 UTC and 12:56 UTC, GitHub Actions jobs were degraded. From 10:40 to 12:16 UTC, all newly queued Actions runs failed to start. From 12:16 to 12:56 UTC, Actions runs that required downloading actions for their workflows continued to fail. GitHub Pages, Copilot Code Review, Copilot coding agent, Octoshift, and GitHub Enterprise Importer were also impacted due to their dependency on Actions.

This was caused by our automated account review system incorrectly suspending the service account used by GitHub Actions to authenticate workflow runs and download actions. We mitigated by restoring the account at 12:16 UTC, marking it exempt from further automated review at 12:20 UTC, and redeploying a related service at 12:48 UTC to flush cached account state. Full recovery was confirmed at 12:56 UTC.

During this incident, a small number of Issues, PRs, Comments, and Discussions were marked as hidden when the service account was disabled. No data was lost. All content hidden because of this incident has been restored, and full search index restoration is in progress.

To prevent a recurrence, we have added an allowlist of all service accounts that cannot be suspended by automated systems, and we are ensuring these protections are enforced consistently across all account management tooling. We are also improving diagnostic tooling for accounts and reducing cache propagation delays to shorten the time to mitigate similar incidents in the future.
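The allowlist safeguard can be pictured with a short sketch. This is only an illustration under stated assumptions: the account name, the function, and the exception type are hypothetical and do not reflect GitHub's account management tooling. It shows an automated suspension path refusing to act on a protected service account, while a manual path remains available.

```python
# Hypothetical allowlist; the account name is illustrative only.
PROTECTED_SERVICE_ACCOUNTS = frozenset({"actions-service-account"})


class ProtectedAccountError(Exception):
    """Raised when automation tries to suspend an allowlisted account."""


def suspend_account(account, *, automated, suspended):
    """Add `account` to the `suspended` set.

    Automated callers are blocked from suspending allowlisted accounts;
    a human-initiated call (automated=False) is not restricted here.
    """
    if automated and account in PROTECTED_SERVICE_ACCOUNTS:
        raise ProtectedAccountError(
            f"{account} is exempt from automated suspension"
        )
    suspended.add(account)
```

Placing the check inside the one function that performs suspension, rather than in each caller, is one way to keep the protection consistent across tools that share it.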

1779793034 - 1779801528 Resolved

Intermittent errors with app installation token authentication

On May 23, 2026, between 06:00 UTC and 19:12 UTC, GitHub experienced intermittent errors authenticating GitHub App installation tokens.

During this time, roughly 1% to 5% of app installation token authentication requests failed, averaging 2.3% and peaking at approximately 5.4% around 14:00 UTC. Users may have experienced authentication failures when using GitHub Apps, including failures in Git operations and API calls using app installation tokens.

The incident was caused by a defect in a caching proxy component and was remediated by rolling back that component to a previous version. We are improving monitoring for cache miss anomalies to ensure that token authentication remains functional during infrastructure changes, and we are reviewing our protocols for testing and for upgrading third-party dependencies.

1779552055 - 1779564746 Resolved

Incident with Actions

On May 20, 2026, between 16:00 UTC and 17:45 UTC, GitHub Actions customers experienced run start delays exceeding 5 minutes. Approximately 4.5% of all runs were delayed during the impact window, with scale set jobs disproportionately affected: 30% of scale set jobs were delayed and 4% failed to start entirely. The incident was caused by a misconfigured health check on an internal service that assigns jobs to runners. A brief latency spike in an upstream dependency triggered health check failures across several pods, removing them from service and concentrating load on the remaining capacity. The added load drove memory pressure that escalated into a cascading failure in one regional cluster, leaving it unable to self-recover. Responders mitigated the incident by scaling capacity in the healthy regional clusters and draining traffic away from the impaired one, after which run start latency recovered. To prevent recurrence, we are strengthening our health check configuration to avoid cascading failure scenarios and evaluating automated mitigations to rebalance traffic when a region is degraded.
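The report does not say how the health check was misconfigured. One common way to keep a brief latency spike from ejecting pods is to require several consecutive probe failures before marking an instance unhealthy, and several consecutive successes before restoring it. The sketch below illustrates that general technique only; the class name and thresholds are assumptions, not the configuration used by the affected service.

```python
class HealthTracker:
    """Marks an instance unhealthy only after a streak of failed probes.

    Hypothetical sketch: thresholds are illustrative defaults.
    """

    def __init__(self, unhealthy_after=3, healthy_after=2):
        self.unhealthy_after = unhealthy_after  # failed probes in a row to eject
        self.healthy_after = healthy_after      # passed probes in a row to restore
        self.healthy = True
        self._fail_streak = 0
        self._ok_streak = 0

    def record(self, probe_ok):
        """Record one probe result and return the current health state."""
        if probe_ok:
            self._ok_streak += 1
            self._fail_streak = 0
            if not self.healthy and self._ok_streak >= self.healthy_after:
                self.healthy = True
        else:
            self._fail_streak += 1
            self._ok_streak = 0
            if self.healthy and self._fail_streak >= self.unhealthy_after:
                self.healthy = False
        return self.healthy


tracker = HealthTracker()
tracker.record(False)  # a single slow probe does not remove the instance
tracker.record(True)   # the failure streak resets on the next success
```

A threshold like this trades slower detection of real failures for resistance to transient spikes, so it is usually paired with limits on how many instances may be removed at once.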

1779296332 - 1779308056 Resolved

Actions is experiencing degraded availability

On May 15, 2026, from approximately 07:43 UTC to 08:48 UTC, GitHub Actions experienced a degradation that caused workflow runs to fail or experience delayed starts for a subset of customers. The incident was triggered by a planned failover of supporting infrastructure used by GitHub Actions. During that operation, an automated service discovery update did not propagate correctly, which caused traffic to be routed incorrectly and increased request timeouts in a core dependency for workflow orchestration. At peak impact, 42% of Actions runs failed. Downstream services that depend on Actions workflow execution were also impacted, including GitHub Pages and Copilot cloud services. At 08:12 UTC, responders manually corrected the service discovery routing issue. Timeout and failure rates recovered shortly after, and we continued monitoring until full stabilization was confirmed across all affected services. The incident was marked resolved at 08:48 UTC. To prevent recurrence, we are implementing failover guardrails that validate service discovery state before completing failover operations, strengthening pre-flight and post-flight verification checks, and improving dependency resilience to reduce timeout cascades during infrastructure events.
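A guardrail that validates service discovery state before completing a failover might be sketched as follows. This is a hedged illustration, not the actual tooling: the function names, the shape of the discovery data (a mapping of service name to endpoint), and the callbacks are all assumptions made for the example.

```python
def verify_discovery_state(expected, observed):
    """Return a list of mismatches between expected and observed endpoints."""
    problems = []
    for service, want in expected.items():
        got = observed.get(service)
        if got != want:
            problems.append(f"{service}: expected {want!r}, discovered {got!r}")
    return problems


def complete_failover(expected, lookup, finalize):
    """Finalize a failover only if service discovery reflects the new targets.

    `lookup` is a hypothetical callable returning the current discovery
    mapping; `finalize` is a hypothetical callable that completes the failover.
    """
    problems = verify_discovery_state(expected, lookup())
    if problems:
        # Stop here so traffic is not routed to stale or missing endpoints.
        raise RuntimeError("failover blocked: " + "; ".join(problems))
    finalize()
```

The check runs after the discovery update is expected to have propagated and before the operation is declared complete, so a failed propagation surfaces as a blocked failover rather than as downstream timeouts.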

1778832824 - 1778834910 Resolved