Incident History

Unauthenticated LFS requests for public repos are returning unexpected 401 errors

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

September 16, 2025, 17:55 - 18:30 UTC Resolved

Creating GitHub apps using the REST API will fail with a 401 error

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

September 16, 2025, 17:14 - 17:45 UTC Resolved

Disruption with some GitHub services

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

September 15, 2025, 18:21 - 18:28 UTC Resolved

Repository search is degraded

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

September 13, 2025, 12:44 UTC - September 15, 2025, 21:01 UTC Resolved

Incident with Actions

On September 10, 2025 between 13:00 and 14:15 UTC, Actions users experienced failed jobs and run start delays for Ubuntu 24 and Ubuntu 22 jobs on standard runners in private repositories. Additionally, larger runner customers experienced run start delays for runner groups with private networking configured in the eastus2 region. This was due to an outage in an underlying compute service provider in eastus2. 1.06% of Ubuntu 24 jobs and 0.16% of Ubuntu 22 jobs failed during this period, and jobs for larger runners using private networking in the eastus2 region were unable to start for the duration of the incident.

We have identified, and are working on, improvements to our resilience to outages in a single partner region, so that impact to standard runners is reduced in similar scenarios in the future.

September 10, 2025, 13:23 - 14:02 UTC Resolved

Degraded REST API success rates for some customers

On September 4, 2025 between 15:30 UTC and 20:00 UTC, the REST API endpoints git/refs, git/refs/, and git/matching-refs/ were degraded and returned elevated errors for repositories with more than 22,000 references. On average, the request error rate for these specific endpoints was 0.5%; overall REST API availability remained 99.9999%. This was due to a code change that added latency to reference evaluations and disproportionately affected repositories with many branches, tags, or other references.

We mitigated the incident by reverting the new code. We are working to improve performance testing and to reduce our time to detection and mitigation of issues like this one in the future.
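For clients that call the affected endpoints (for example GET /repos/{owner}/{repo}/git/matching-refs/{ref}), a transient 0.5% error rate like this one is usually absorbed by retrying idempotent requests. A minimal sketch, assuming a caller-supplied fetch function and a hypothetical TransientError raised on 5xx responses (neither is part of any official GitHub client):

```python
import random
import time

class TransientError(Exception):
    """Stand-in for an HTTP 5xx response; not part of any official client."""

def with_retries(fetch, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Retry an idempotent request with exponential backoff plus jitter."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except TransientError as err:
            last_error = err
            if attempt < attempts - 1:
                # Backoff doubles each attempt (0.5s, 1s, 2s, ...) with
                # up to 100 ms of jitter to avoid synchronized retries.
                sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    raise last_error
```

With three attempts, an independent 0.5% per-request failure rate drops to roughly 0.005³, about one failure in eight million calls.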

September 4, 2025, 18:16 - 20:25 UTC Resolved

Loading avatars may fail for about 0.5% of all users, and for 100% of users around the Arabian Peninsula. We are investigating.

Between August 21, 2025 at 15:00 UTC and September 2, 2025 at 15:22 UTC, the avatars.githubusercontent.com image service was degraded and failed to display user avatars for users in the Middle East. During this time, avatar images appeared broken on github.com for affected users. On average, this impacted about 82% of users routed through one of our Middle East-based points of presence, which represents about 0.14% of global users.

This was due to a configuration change within GitHub's edge infrastructure in the affected region that caused HTTP requests to fail; as a result, image requests returned HTTP 503 errors. The issue went undetected because an alerting threshold was set too low.

We mitigated the incident by removing the affected site from service, which restored avatar serving for impacted users. To prevent a recurrence, we have tuned configuration defaults for graceful degradation and added new health checks that automatically shift traffic away from impacted sites. We are also updating our monitoring to prevent undetected errors like this in the future.
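The remediation described here, health checks that pull an unhealthy point of presence out of rotation, can be sketched in a few lines. This is illustrative only (GitHub's edge tooling is not public); the site names and threshold are invented:

```python
def sites_in_rotation(error_rates, threshold=0.05):
    """Keep only sites whose recent 5xx rate is below the threshold.

    If every site is unhealthy, fail open and keep all of them:
    degraded service is preferable to serving nothing.
    """
    healthy = [site for site, rate in error_rates.items() if rate < threshold]
    return healthy if healthy else list(error_rates)

# A site failing 82% of requests (as in this incident) is dropped;
# traffic is served by the remaining points of presence.
print(sites_in_rotation({"me-pop": 0.82, "eu-pop": 0.001, "us-pop": 0.002}))
```

The fail-open branch reflects the trade-off mentioned above: automatic traffic shifting must itself degrade gracefully, or a miscalibrated threshold takes every site out of service at once.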

September 2, 2025, 15:17 - 15:44 UTC Resolved

Disruption with some GitHub services

On August 27, 2025 between 20:35 and 21:17 UTC, Copilot, Web and REST API traffic experienced degraded performance. Copilot saw an average of 36% of requests fail with a peak failure rate of 77%. Approximately 2% of all non-Copilot Web and REST API traffic requests failed.

This incident occurred after we initiated a production database migration to drop a column from a table backing Copilot functionality. While the column was no longer in direct use, our ORM continued to reference the dropped column, leading to a large number of 5xx responses, similar to the incident on August 5th. At 21:15 UTC we applied a fix to the production schema, and by 21:17 UTC all services had fully recovered.

Repairs to prevent this failure mode were already in progress, but they were not completed quickly enough to prevent a second incident. We have now implemented a temporary block on all drop-column operations as an immediate measure while we add more safeguards to prevent similar issues from occurring in the future. We are also implementing graceful degradation so that Copilot issues do not impact other features of our product.
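The failure mode, an ORM whose cached schema still names a dropped column, can be reproduced in miniature. A hedged sketch using sqlite3 as a stand-in (the actual database and ORM involved are not specified here); the table and column names are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (id INTEGER PRIMARY KEY, legacy_flag INTEGER)")

# The ORM snapshots the table's columns once, e.g. at application boot.
cached_columns = ["id", "legacy_flag"]

# A migration then drops the column (emulated portably by rebuilding the
# table, which is how many migration tools implement DROP COLUMN).
conn.executescript("""
    ALTER TABLE settings RENAME TO settings_old;
    CREATE TABLE settings (id INTEGER PRIMARY KEY);
    DROP TABLE settings_old;
""")

# Every query the ORM now generates still names the dropped column.
schema_error = None
try:
    conn.execute(f"SELECT {', '.join(cached_columns)} FROM settings")
except sqlite3.OperationalError as err:
    schema_error = str(err)  # "no such column: legacy_flag"

print(schema_error)
```

Because the stale column list is baked into every generated query, all of them start failing the moment the migration lands, which is why the impact appears as an abrupt spike of 5xx responses rather than a gradual degradation.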

August 27, 2025, 20:41 - 21:27 UTC Resolved

Incident with Actions

On August 21, 2025, from approximately 15:37 UTC to 18:10 UTC, customers experienced increased delays and failures when starting jobs on GitHub Actions using standard hosted runners. This was caused by connectivity issues in our East US region, which prevented runners from retrieving jobs and sending progress updates. As a result, capacity was significantly reduced, especially for busier configurations, leading to queuing and service interruptions. Approximately 8.05% of jobs on public standard Ubuntu 24 runners and 3.4% of jobs on private standard Ubuntu 24 runners did not start as expected.

By 18:10 UTC, we had mitigated the issue by provisioning additional resources in the affected region and burning down the backlog of queued runner assignments. By the end of that day, we had deployed changes to improve runner connectivity resilience and graceful degradation in similar situations. We are also taking further steps to improve system resiliency by enhancing observability of network connection health for runners and improving load distribution and failover handling to help prevent similar issues in the future.

August 21, 2025, 15:54 - 18:13 UTC Resolved

Incident with Issues and Git Operations

On August 21, 2025, between 6:15 and 6:25 UTC, Git and web operations were degraded and saw intermittent errors; on average, the error rate was 1% for API and web requests. This was due to automated maintenance on our database infrastructure reducing capacity below our tolerated threshold.

The incident was resolved when the impacted infrastructure self-healed and returned to normal operating capacity. We are adding guardrails to reduce the impact of this type of maintenance in the future.

August 21, 2025, 06:25 - 06:58 UTC Resolved