Incident History

Copilot Chat - Grok Code Fast 1 Outage

On January 21, 2026, between 11:15 UTC and 13:00 UTC, the Copilot service was degraded for the Grok Code Fast 1 model. On average, more than 90% of requests to this model failed due to an issue with an upstream provider. No other models were impacted. The issue was resolved after the upstream provider fixed the problem that caused the disruption. GitHub will continue to enhance our monitoring and alerting systems to reduce the time it takes to detect and mitigate similar issues in the future.

1768995230 - 1768999139 Resolved

Run start delays in Actions

On January 20, 2026, between 19:08 UTC and 20:18 UTC, manually dispatched GitHub Actions workflows saw delayed job starts. GitHub products built on Actions, such as Dependabot, Pages builds, and the Copilot coding agent, experienced similar delays. All jobs completed successfully despite the delays. At peak impact, approximately 23% of workflow runs were affected, with an average delay of 11 minutes. This was caused by a load pattern shift in Actions scheduled jobs that saturated a shared backend resource. We mitigated the incident by temporarily throttling traffic and scaling up resources to account for the change in load pattern. To prevent recurrence, we have scaled resources appropriately and implemented optimizations that avoid this load pattern in the future.

1768938579 - 1768939854 Resolved

Incident affecting actions-runner-controller

On January 20, 2026, between 14:39 UTC and 16:03 UTC, actions-runner-controller users experienced a 1% failure rate for API requests managing GitHub Actions runner scale sets. This caused delays in runner creation, resulting in delayed job starts for workflows targeting those runners. The root cause was a service-to-service circuit breaker that incorrectly tripped for all users when a single user hit rate limits for runner registration. The issue was mitigated by bypassing the circuit breaker, and users saw immediate and full service recovery following the fix. We have updated our circuit breakers to exclude individual customer rate limits from their triggering logic and are continuing work to improve detection and mitigation times.
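
As a rough illustration of the corrective action described above (the names and structure here are hypothetical, not the actual actions-runner-controller code), the fix amounts to not counting one customer's rate-limit responses toward a circuit breaker that is shared by all customers:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrRateLimited stands in for a per-customer rate-limit response (e.g. HTTP 429).
// It is specific to a single caller, so it should not be treated as a backend failure.
var ErrRateLimited = errors.New("customer rate limited")

// Breaker is a minimal shared circuit breaker: once failures reaches
// threshold, it opens and rejects calls for every customer.
type Breaker struct {
	mu        sync.Mutex
	failures  int
	threshold int
}

// Allow reports whether calls may proceed.
func (b *Breaker) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.failures < b.threshold
}

// Record classifies the outcome of a runner-registration call. The key line
// is the ErrRateLimited check: a single customer's rate limiting is returned
// to that caller but does not increment the shared failure count.
func (b *Breaker) Record(err error) {
	if err == nil || errors.Is(err, ErrRateLimited) {
		return // success, or one customer's rate limit: not a shared failure
	}
	b.mu.Lock()
	defer b.mu.Unlock()
	b.failures++
}

func main() {
	b := &Breaker{threshold: 3}
	// One customer repeatedly hitting its rate limit no longer opens
	// the breaker for everyone else.
	for i := 0; i < 10; i++ {
		b.Record(ErrRateLimited)
	}
	fmt.Println("requests still allowed:", b.Allow()) // true
}
```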

1768924978 - 1768926189 Resolved

Disruption with some GitHub services

Between 2026-01-16 16:17 UTC and 2026-01-17 02:54 UTC, some Copilot Business users were unable to access and use certain Copilot features and models. This was due to a bug in how we determine whether a user has access to a feature, which inadvertently marked features and models as inaccessible for users whose enterprise(s) had not configured the policy. We mitigated the incident by reverting the problematic deployment. We are improving our internal monitoring and mitigation processes to reduce the risk and duration of similar incidents in the future.
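
A minimal sketch of the class of bug described above, with invented names rather than GitHub's actual access-check code: an access check that treats an unconfigured enterprise policy as an explicit denial instead of falling back to the default.

```go
package main

import "fmt"

// PolicyState models an enterprise-level feature policy setting.
type PolicyState int

const (
	Unconfigured PolicyState = iota // the enterprise never set the policy
	Enabled
	Disabled
)

// hasAccessBuggy treats "no policy configured" the same as "disabled",
// which is the kind of regression described in the incident: users whose
// enterprise had not configured the policy lost access.
func hasAccessBuggy(p PolicyState) bool {
	return p == Enabled
}

// hasAccess falls back to a default when no policy has been configured.
func hasAccess(p PolicyState, defaultAllowed bool) bool {
	switch p {
	case Enabled:
		return true
	case Disabled:
		return false
	default: // Unconfigured: inherit the default rather than denying
		return defaultAllowed
	}
}

func main() {
	fmt.Println(hasAccessBuggy(Unconfigured))  // false: feature wrongly blocked
	fmt.Println(hasAccess(Unconfigured, true)) // true: default preserved
}
```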

1768607623 - 1768618491 Resolved

[Retroactive] Incident with GitHub Copilot (GPT-5 model)

From January 14, 2026, at 18:15 UTC until January 15, 2026, at 11:30 UTC, GitHub Copilot users were unable to select the GPT-5 model for chat features in VS Code, JetBrains IDEs, and other IDE integrations. Users running GPT-5 in Auto mode experienced errors. Other models were not impacted.

We mitigated this incident by deploying a fix that corrected a misconfiguration in available models, making the GPT-5 model available again.

We are improving our testing processes to reduce the risk of similar incidents in the future, and refining our model availability alerting to improve detection time.

We did not post a public status update before we completed the fix, and the incident is now resolved. We are sorry for the delayed post on githubstatus.com.

1768589795 - 1768589795 Resolved

Incident with Issues and Pull Requests

On January 15, 2026, between 16:40 UTC and 18:20 UTC, we observed increased latency and timeouts across Issues, Pull Requests, Notifications, Actions, Repositories, API, Account Login, and Alive. An average of 1.8% of combined web and API requests failed, peaking briefly at 10% early in the incident. The majority of impact was observed for unauthenticated users, but authenticated users were impacted as well. This was caused by an infrastructure update to some of our data stores. Upgrading this infrastructure to a new major version resulted in unexpected resource contention, leading to distributed impact in the form of slow queries and increased timeouts across services that depend on these datasets. We mitigated this by rolling back to the previous stable version. We are working to improve our validation process for these types of upgrades to catch issues that only occur under high load before full release, to improve detection time, and to reduce mitigation times in the future.

1768496204 - 1768503248 Resolved

Actions workflow run and job status updates are experiencing delays

On January 15, 2026, between 14:18 UTC and 15:26 UTC, customers experienced delays in status updates for workflow runs and checks. Status updates were delayed by up to 20 minutes, with a median delay of 11 minutes. The issue stemmed from an infrastructure upgrade to our database cluster. The new version introduced resource contention under production load, causing slow query times. We mitigated this by rolling back to the previous stable version. We are working to strengthen our upgrade validation process to catch issues that only manifest under high load. We are also adding new monitors to reduce detection time for similar issues in the future.

1768487063 - 1768490788 Resolved

Incident with Webhooks

On January 14, 2026, between 19:34 UTC and 21:36 UTC, the Webhooks service experienced a degradation that delayed delivery of some webhooks. During this window, a subset of webhook deliveries that encountered proxy tunnel errors on their initial delivery attempt were delayed by more than two minutes. The root cause was a recent code change that added additional retry attempts for this specific error condition, which increased delivery times for affected webhooks. Previously, webhook deliveries encountering this error would not have been delivered. The incident was mitigated by rolling back the change, restoring normal webhook delivery. As a corrective action, we will update our monitoring to measure the webhook delivery latency critical path, ensuring that incidents are accurately scoped to this workflow.
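
The trade-off described above, retrying a transient error improves delivery success but each attempt adds latency, is sketched below under assumed names (errProxyTunnel, deliver, and the latency budget are illustrative, not GitHub's webhook implementation): retries on the transient error are bounded by a total-time budget so a delivery cannot drift arbitrarily late.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errProxyTunnel stands in for the transient proxy tunnel error mentioned
// in the incident; the names here are illustrative only.
var errProxyTunnel = errors.New("proxy tunnel error")

// deliver attempts a webhook delivery, retrying proxy tunnel errors with
// exponential backoff but giving up once the latency budget is spent.
// Bounding total time is the point: unbounded retries are what can push
// a delivery well past its expected latency.
func deliver(send func() error, budget time.Duration) error {
	deadline := time.Now().Add(budget)
	backoff := 500 * time.Millisecond
	for {
		err := send()
		if err == nil || !errors.Is(err, errProxyTunnel) {
			return err // success, or a non-retryable failure
		}
		if time.Now().Add(backoff).After(deadline) {
			return fmt.Errorf("giving up after budget %s: %w", budget, err)
		}
		time.Sleep(backoff)
		backoff *= 2
	}
}

func main() {
	attempts := 0
	err := deliver(func() error {
		attempts++
		if attempts < 3 {
			return errProxyTunnel // first two attempts hit the transient error
		}
		return nil
	}, 10*time.Second)
	fmt.Println("attempts:", attempts, "err:", err)
}
```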

1768422062 - 1768426689 Resolved

Claude Opus 4.5 model experiencing degraded performance

On January 14, 2026, between approximately 10:20 UTC and 11:25 UTC, the Copilot service experienced a degradation of the Claude Opus 4.5 model due to an issue with our upstream provider. During this time period, users encountered a 4.5% error rate when using Claude Opus 4.5. No other models were impacted. The issue was resolved by a mitigation put in place by our provider. GitHub is working with our provider to further improve the resiliency of the service to prevent similar incidents in the future.

1768388177 - 1768393408 Resolved

Copilot's GPT-5.1 model has degraded performance

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

1768382676 - 1768387931 Resolved