Incident History

Degraded Experience - Failing to finalize some CCA Jobs

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

1769806791 - 1769808143 Resolved

Actions Workflows Run Start Delays

On Jan 28, 2026, between 14:56 UTC and 15:44 UTC, GitHub Actions experienced degraded performance. During this time, workflows experienced an average delay of 49 seconds, and 4.7% of workflow runs failed to start within 5 minutes. The root cause was an atypical load pattern that overwhelmed system capacity and caused resource contention. Recovery began once additional resources came online at 15:25 UTC, with full recovery at 15:44 UTC. We are implementing safeguards to prevent this failure mode and enhancing our monitoring to detect and address similar patterns more quickly in the future.

1769613161 - 1769615644 Resolved

Regression in Windows runners for public repositories

On Jan 26, 2026, from approximately 14:03 UTC to 23:42 UTC, GitHub Actions experienced job failures on some Windows standard hosted runners. This was caused by a configuration difference in a new Windows runner type that caused the expected D: drive to be missing. About 2.5% of all Windows standard runner jobs were impacted. Re-running failed workflows had a high chance of succeeding given the limited rollout of the change. The job failures were mitigated by rolling back the affected configuration and removing the provisioned runners that had this configuration. To reduce the chance of recurrence, we are expanding runner telemetry and improving validation of runner configuration changes. We are also evaluating options to accelerate the mitigation time of any similar future events.
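
The kind of configuration validation mentioned above can be illustrated with a minimal Python sketch; the names here (RunnerImage, validate_runner_image, the candidate image name) are assumptions for illustration, not GitHub's provisioning code. It checks that a runner image exposes the drives workflows expect before it joins the hosted pool:

    from dataclasses import dataclass, field

    @dataclass
    class RunnerImage:
        name: str
        drives: set[str] = field(default_factory=set)

    def validate_runner_image(image: RunnerImage, expected_drives: set[str]) -> list[str]:
        """Return validation errors; an empty list means the image can be provisioned."""
        missing = expected_drives - image.drives
        return [f"{image.name}: missing expected drive {d}" for d in sorted(missing)]

    # A new Windows runner type built without the expected D: work drive would be
    # rejected here rather than reaching customer workflows.
    errors = validate_runner_image(
        RunnerImage(name="windows-runner-candidate", drives={"C:"}),
        expected_drives={"C:", "D:"},
    )
    assert errors == ["windows-runner-candidate: missing expected drive D:"]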

1769455551 - 1769471504 Resolved

Disruption with repo creation

Between January 24, 2026, 19:56 UTC and January 25, 2026, 02:50 UTC, repository creation and cloning were degraded. On average, the error rate for repository creation was 25%, peaking at 55% of requests. This was due to increased latency on the repositories database, which exposed a read-after-write problem during repository creation. We mitigated the incident by stopping an operation that was generating load on the database, restoring throughput. We have identified the underlying repository creation issue and are working to address it and to improve our observability so we can detect and mitigate issues like this one faster in the future.
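
The read-after-write pattern at play can be sketched with an illustrative Python model; Primary, LaggingReplica, and read_after_write are invented names, not GitHub's data-access layer. It shows how a repository row written to the primary may not yet be visible on a lagging replica, and how falling back to the primary avoids failing the request:

    class Primary:
        def __init__(self) -> None:
            self.rows: dict[int, dict] = {}

        def write(self, repo_id: int, row: dict) -> None:
            self.rows[repo_id] = row

        def read(self, repo_id: int) -> dict | None:
            return self.rows.get(repo_id)

    class LaggingReplica:
        """A replica that has not yet applied recent writes (e.g. under high load)."""
        def __init__(self) -> None:
            self.rows: dict[int, dict] = {}

        def read(self, repo_id: int) -> dict | None:
            return self.rows.get(repo_id)

    def read_after_write(repo_id: int, replica: LaggingReplica, primary: Primary) -> dict:
        row = replica.read(repo_id)
        if row is None:
            # Replication lag: the row exists but is not visible here yet, so retry
            # against the primary instead of failing the request.
            row = primary.read(repo_id)
        if row is None:
            raise LookupError(f"repository {repo_id} not found")
        return row

    primary, replica = Primary(), LaggingReplica()
    primary.write(42, {"name": "new-repo"})   # repository creation writes to the primary
    assert read_after_write(42, replica, primary) == {"name": "new-repo"}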

1769309021 - 1769310513 Resolved

Disruption with some GitHub services

On January 22, 2026, our authentication service experienced an issue between 14:00 UTC and 14:50 UTC, resulting in downstream disruptions for users. From 14:00 UTC to 14:23 UTC, authenticated API requests saw higher-than-normal error rates, averaging 16.9% and occasionally peaking at 22.2%, returned as HTTP 401 responses. From 14:00 UTC to 14:50 UTC, git operations over HTTP were impacted, with error rates averaging 3.8% and peaking at 10.8%. As a result, some users may have been unable to run git commands as expected. This was due to the authentication service reaching its maximum allowed number of database connections. We mitigated the incident by increasing the maximum number of database connections in the authentication service. We are adding monitoring around database connection pool usage and improving our traffic projections to reduce our time to detect and mitigate issues like this one in the future.
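
The connection pool monitoring described above can be illustrated with a minimal sketch; the thresholds and names (PoolStats, pool_alerts) are assumptions for illustration, not our production alerting. The idea is to alert well before utilization reaches the hard maximum that caused this incident:

    from dataclasses import dataclass

    @dataclass
    class PoolStats:
        in_use: int
        max_size: int

    def pool_alerts(stats: PoolStats, warn_at: float = 0.8, page_at: float = 0.95) -> list[str]:
        utilization = stats.in_use / stats.max_size
        alerts = []
        if utilization >= page_at:
            alerts.append(f"PAGE: connection pool at {utilization:.0%} of {stats.max_size}")
        elif utilization >= warn_at:
            alerts.append(f"WARN: connection pool at {utilization:.0%} of {stats.max_size}")
        return alerts

    assert pool_alerts(PoolStats(in_use=78, max_size=100)) == []
    assert pool_alerts(PoolStats(in_use=85, max_size=100))[0].startswith("WARN")
    assert pool_alerts(PoolStats(in_use=98, max_size=100))[0].startswith("PAGE")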

1769091169 - 1769095379 Resolved

Policy pages for Copilot are timing out

On January 21, 2026, between 17:50 and 20:53 UTC, around 350 enterprises and organizations experienced slower load times or timeouts when viewing Copilot policy pages. The issue was traced to performance degradation under load caused by an issue in an upstream database caching capability within our billing infrastructure, which increased the latency of queries retrieving billing and policy information from approximately 300ms to up to 1.5s. To restore service, we disabled the affected caching feature, which immediately returned performance to normal. We then addressed the issue in the caching capability, re-enabled our use of the database cache, and observed continued recovery. Moving forward, we’re tightening our procedures for deploying performance optimizations, adding test coverage, and improving cross-service visibility and alerting so we can detect upstream degradations earlier and reduce impact to customers.
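
The shape of the mitigation, putting the cached read path behind a flag so requests fall back to direct database queries, can be sketched as follows; the names (flags, fetch_policy) and the flag mechanism are illustrative assumptions, not the billing service's actual code:

    from typing import Callable

    flags = {"use_billing_cache": False}   # flipped off during mitigation
    cache: dict[str, dict] = {}

    def fetch_policy(org_id: str, query_db: Callable[[str], dict]) -> dict:
        if flags["use_billing_cache"]:
            if org_id not in cache:
                cache[org_id] = query_db(org_id)
            return cache[org_id]
        # Flag off: bypass the cache entirely and query the database directly.
        return query_db(org_id)

    # Example with a stand-in for the real query:
    policy = fetch_policy("org-123", query_db=lambda org: {"copilot_enabled": True})
    assert policy == {"copilot_enabled": True}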

1769023865 - 1769028786 Resolved

Copilot Chat - Grok Code Fast 1 Outage

On Jan 21, 2026, between 11:15 UTC and 13:00 UTC, the Copilot service was degraded for the Grok Code Fast 1 model. On average, more than 90% of requests to this model failed due to an issue with an upstream provider. No other models were impacted. The issue was resolved after the upstream provider fixed the problem that caused the disruption. GitHub will continue to enhance our monitoring and alerting systems to reduce the time it takes to detect and mitigate similar issues in the future.

1768995230 - 1768999139 Resolved

Run start delays in Actions

On January 20, 2026, between 19:08 UTC and 20:18 UTC, manually dispatched GitHub Actions workflows saw delayed job starts. GitHub products built on Actions, such as Dependabot, Pages builds, and Copilot coding agent, experienced similar delays. All jobs completed successfully despite the delays. At peak impact, approximately 23% of workflow runs were affected, with an average delay of 11 minutes. This was caused by a load pattern shift in Actions scheduled jobs that saturated a shared backend resource. We mitigated the incident by temporarily throttling traffic and scaling up resources to account for the change in load pattern. To prevent recurrence, we have scaled resources appropriately and implemented optimizations to prevent this load pattern in the future.
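
The temporary throttling used as a mitigation is conceptually a rate limiter such as the token bucket sketched below; the rates and the TokenBucket class are illustrative assumptions, not the Actions backend implementation:

    import time

    class TokenBucket:
        def __init__(self, rate_per_sec: float, burst: float) -> None:
            self.rate, self.capacity = rate_per_sec, burst
            self.tokens, self.updated = burst, time.monotonic()

        def allow(self) -> bool:
            now = time.monotonic()
            self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
            self.updated = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False   # caller should delay the dispatch rather than fail it

    bucket = TokenBucket(rate_per_sec=100, burst=20)
    dispatched = sum(1 for _ in range(1000) if bucket.allow())
    print(f"dispatched {dispatched} of 1000 immediately; the rest would be delayed")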

1768938579 - 1768939854 Resolved

Incident affecting actions-runner-controller

On January 20, 2026, between 14:39 UTC and 16:03 UTC, actions-runner-controller users experienced a 1% failure rate for API requests managing GitHub Actions runner scale sets. This caused delays in runner creation, resulting in delayed job starts for workflows targeting those runners. The root cause was a service-to-service circuit breaker that incorrectly tripped for all users when a single user hit rate limits for runner registration. The issue was mitigated by bypassing the circuit breaker, and users saw immediate and full service recovery following the fix. We have updated our circuit breakers to exclude individual customer rate limits from their triggering logic and are continuing work to improve detection and mitigation times.
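
The circuit-breaker change can be sketched in a few lines; the threshold and the CircuitBreaker class below are illustrative assumptions, but they show how excluding per-customer rate-limit responses (HTTP 429 here) from the failure count keeps one throttled tenant from opening the breaker for everyone:

    class CircuitBreaker:
        def __init__(self, failure_threshold: int = 5) -> None:
            self.failure_threshold = failure_threshold
            self.failures = 0
            self.open = False

        def record(self, status_code: int) -> None:
            if status_code == 429:
                return                  # per-customer rate limiting is not a service failure
            if status_code >= 500:
                self.failures += 1
                if self.failures >= self.failure_threshold:
                    self.open = True    # stop sending traffic until the service recovers
            else:
                self.failures = 0       # any success resets the count

    breaker = CircuitBreaker()
    for _ in range(50):
        breaker.record(429)             # one customer repeatedly hitting its rate limit
    assert not breaker.open             # other customers keep registering runners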

1768924978 - 1768926189 Resolved

Disruption with some GitHub services

Between 2026-01-16 16:17 UTC and 2026-01-17 02:54 UTC, some Copilot Business users were unable to access and use certain Copilot features and models. This was due to a bug in how we determine whether a user has access to a feature, which inadvertently marked features and models as inaccessible for users whose enterprise(s) had not configured the policy. We mitigated the incident by reverting the problematic deployment. We are improving our internal monitoring and mitigation processes to reduce the risk and duration of similar incidents in the future.
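
The shape of the bug, treating an unconfigured enterprise policy the same as a disabled one, can be sketched as follows; the Policy enum, function names, and default value are illustrative assumptions, not Copilot's actual policy model:

    from enum import Enum

    class Policy(Enum):
        ENABLED = "enabled"
        DISABLED = "disabled"
        UNCONFIGURED = "unconfigured"

    def has_access_buggy(policy: Policy) -> bool:
        return policy == Policy.ENABLED                 # UNCONFIGURED falls through to False

    def has_access_fixed(policy: Policy, default: bool = True) -> bool:
        if policy == Policy.UNCONFIGURED:
            return default                              # fall back to the intended default
        return policy == Policy.ENABLED

    assert has_access_buggy(Policy.UNCONFIGURED) is False    # the regression
    assert has_access_fixed(Policy.UNCONFIGURED) is True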

1768607623 - 1768618491 Resolved