Incident History

Code view fails to load when content contains some non-ASCII characters

From February 26, 2026 at 22:10 UTC to February 27 at 05:50 UTC, the repository browsing UI was degraded and users were unable to load pages for files and directories with non-ASCII names (including Japanese, Chinese, and other non-Latin scripts). On average, the error rate was 0.014% of requests to the service, peaking at 0.06%. Affected users saw 404 errors when navigating to repository directories and files with non-ASCII names. This was due to a code change that altered how file and directory names were processed, which caused incorrectly formatted data to be stored in an application cache. We mitigated the incident by deploying a fix that invalidated the affected cache entries and progressively rolling it out across all production environments. We are working to improve our pre-production testing to cover non-ASCII character handling, establish better cache invalidation mechanisms, and enhance our monitoring to detect this type of failure mode earlier, reducing our time to detection and mitigation of similar issues in the future.

February 27, 2026 03:08 UTC - 06:04 UTC Resolved

High latency on webhook API requests

Between February 26 and February 27, 2026 (UTC), customers calling the webhooks delivery API may have experienced higher latency or failed requests. During the impact window, 0.82% of requests took longer than 3s and 0.004% resulted in a 500 error response. Our monitors caught the impact on the individual backing data source, and we attributed the degradation to a noisy neighbor effect: requests to a specific webhook generated excessive load on the API. The incident was mitigated once traffic from that hook decreased. We have since added a rate limiter for this webhooks API to prevent similar spikes in usage from impacting others, and we will further refine the rate limits for other webhook API routes to help prevent similar occurrences in the future.
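
For readers curious what a per-hook rate limiter can look like in practice, the sketch below shows a minimal token-bucket limiter keyed by webhook ID. It is illustrative only: the names and limits (PerHookLimiter, RATE_PER_SECOND, BURST_CAPACITY) are assumptions, not GitHub's actual implementation or thresholds.

```python
import time
from collections import defaultdict

# Illustrative token-bucket limiter keyed by webhook ID (hypothetical
# parameters; not GitHub's actual implementation or limits).
RATE_PER_SECOND = 50   # sustained API requests allowed per hook
BURST_CAPACITY = 200   # short bursts tolerated before throttling

class PerHookLimiter:
    def __init__(self, rate=RATE_PER_SECOND, capacity=BURST_CAPACITY):
        self.rate = rate
        self.capacity = capacity
        self.tokens = defaultdict(lambda: capacity)
        self.last_refill = defaultdict(time.monotonic)

    def allow(self, hook_id: str) -> bool:
        now = time.monotonic()
        elapsed = now - self.last_refill[hook_id]
        self.last_refill[hook_id] = now
        # Refill proportionally to elapsed time, capped at burst capacity.
        self.tokens[hook_id] = min(self.capacity,
                                   self.tokens[hook_id] + elapsed * self.rate)
        if self.tokens[hook_id] >= 1:
            self.tokens[hook_id] -= 1
            return True
        return False  # reject or defer this request

limiter = PerHookLimiter()
if not limiter.allow("hook-123"):
    pass  # e.g. respond 429 instead of passing the request through
```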

February 27, 2026 00:01 UTC - 00:04 UTC Resolved

Incident with Copilot

On February 26, 2026, between 09:27 UTC and 10:36 UTC, the GitHub Copilot service was degraded and users experienced errors when using Copilot features including Copilot Chat, Copilot Coding Agent, and Copilot Code Review. During this time, 5-15% of affected requests to the service returned errors. The incident was resolved by infrastructure rebalancing. We are improving observability to detect capacity imbalances earlier and enhancing our infrastructure to better handle traffic spikes.

February 26, 2026 10:22 UTC - 12:13 UTC Resolved

Incident with Copilot Agent Sessions impacting CCA/CCR

On February 25, 2026, between 15:05 UTC and 16:34 UTC, the Copilot coding agent service was degraded, resulting in errors for 5% of all requests and impacting users starting or interacting with agent sessions. This was due to an internal service dependency running out of allocated resources (memory and CPU). We mitigated the incident by adjusting the resource allocation for the affected service, which restored normal operations for the coding agent service. We are working to implement proactive monitoring for resource exhaustion across our services, review and update resource allocations, and improve our alerting capabilities to reduce our time to detection and mitigation of similar issues in the future.

February 25, 2026 16:38 UTC - 16:44 UTC Resolved

Incident with Issues and Pull Requests Search

On February 23, 2026, between 21:01 UTC and 21:30 UTC, the Search service experienced degraded performance, resulting in an average of 3.5% of search requests for Issues and Pull Requests being rejected. During this period, updates to Issues and Pull Requests may not have been immediately reflected in search results. During a routine migration, we observed a spike in internal traffic due to a configuration change in our search index. We were alerted to the increase in traffic as well as the increase in error rates and rolled back to the previous stable index. We are working to enable more controlled traffic shifting when promoting a new index, allowing us to detect potential limitations earlier and ensure these operations succeed in a more controlled manner.

February 23, 2026 21:16 UTC - 21:30 UTC Resolved

Code search experiencing degraded performance

Between 2026-02-23 19:10 UTC and 2026-02-24 00:46 UTC, all lexical code search queries on GitHub.com and the code search API were significantly slowed, and between 5% and 10% of search queries timed out during this incident. This was caused by a single customer who had created a network of hundreds of orchestrated accounts that issued a uniquely expensive search query. This query concentrated load on a single hot shard within the search index, slowing down all queries. After we identified the source of the load and stopped the traffic, latency returned to normal. To avoid this situation recurring, we are making a number of improvements to our systems, including: improved rate limiting that accounts for highly skewed load on hot shards, improved system resilience when a small number of shards time out, improved tooling to recognize abusive actors, and capabilities that will allow us to shed load on a single shard in emergencies.
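
As a rough illustration of per-shard load shedding of the kind mentioned above, the sketch below admits a query only while a shard's in-flight cost stays under a threshold. It is a simplified, hypothetical example; the names and threshold (try_admit, MAX_INFLIGHT_COST_PER_SHARD) are assumptions, not the actual code search implementation.

```python
import threading
from collections import defaultdict

# Illustrative per-shard load shedding (hypothetical names and threshold;
# not the actual code search stack).
MAX_INFLIGHT_COST_PER_SHARD = 100.0

_lock = threading.Lock()
_inflight_cost = defaultdict(float)  # shard id -> summed cost of running queries

def try_admit(shard_id: str, estimated_cost: float) -> bool:
    """Admit a query only if the target shard is not already overloaded."""
    with _lock:
        if _inflight_cost[shard_id] + estimated_cost > MAX_INFLIGHT_COST_PER_SHARD:
            return False  # shed load: fail fast instead of queueing on a hot shard
        _inflight_cost[shard_id] += estimated_cost
        return True

def release(shard_id: str, estimated_cost: float) -> None:
    """Return the admitted cost once the query finishes."""
    with _lock:
        _inflight_cost[shard_id] -= estimated_cost
```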

February 23, 2026 19:59 UTC - February 24, 2026 00:46 UTC Resolved

Incident with Actions

On February 23, 2026, between 15:00 UTC and 17:00 UTC, GitHub Actions experienced degraded performance. During this time, 1.8% of Actions workflow runs experienced delayed starts, with an average delay of 15 minutes. The issue was caused by a connection rebalancing event in our internal load balancing layer, which temporarily created uneven traffic distribution across sites and led to request throttling. To prevent recurrence, we are tuning connection rebalancing behavior to spread client reconnections more gradually during load balancer reloads. We are also evaluating improvements to site-level traffic affinity to eliminate the uneven distribution at its source. We have overprovisioned critical paths to prevent any impact if a similar event occurs before those workstreams finish. Finally, we are enhancing our monitoring to detect capacity imbalances proactively.

February 23, 2026 16:17 UTC - 17:03 UTC Resolved

Incident with Copilot

On February 23, 2026, between 14:45 UTC and 16:19 UTC, the Copilot service was degraded for the Claude Haiku 4.5 model. On average, 6% of requests to this model failed due to an issue with an upstream provider. During this period, automated model degradation notifications directed affected users to alternative models. No other models were impacted. The upstream provider identified and resolved the issue on their end. We are working to improve automatic model failover mechanisms to reduce our time to mitigation of issues like this one in the future.
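
Automatic model failover of the kind mentioned above generally amounts to retrying a failed request against an alternative model. The sketch below is a hypothetical illustration; send_completion, the model list, and the retry policy are assumptions, not Copilot's actual behavior.

```python
# Illustrative failover wrapper. send_completion, the model names, and the
# retry policy are hypothetical, not Copilot's actual implementation.
PRIMARY_MODEL = "claude-haiku-4.5"
FALLBACK_MODELS = ["claude-sonnet-4.5", "gpt-5.1"]

class UpstreamError(Exception):
    """Raised when the upstream model provider fails to serve a request."""

def send_completion(model: str, prompt: str) -> str:
    """Placeholder for the real call to the upstream model provider."""
    raise UpstreamError(f"{model} unavailable")

def complete_with_failover(prompt: str) -> str:
    """Try the primary model first, then fall back to alternatives."""
    last_error = None
    for model in [PRIMARY_MODEL, *FALLBACK_MODELS]:
        try:
            return send_completion(model, prompt)
        except UpstreamError as err:
            last_error = err  # provider-side failure: try the next model
    raise last_error
```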

February 23, 2026 14:56 UTC - 17:26 UTC Resolved

Extended job start delays for larger hosted runners

On February 20, 2026, between 17:45 UTC and 20:41 UTC, 4.2% of workflows running on GitHub Larger Hosted Runners were delayed by an average of 18 minutes. Standard, Mac, and Self-Hosted Runners were not impacted. The delays were caused by communication failures between backend services for one deployment of larger runners. Those failures prevented expected automated scaling and provisioning of larger hosted runner capacity within that deployment. This was mitigated when the affected infrastructure was recycled, larger runner pools in the affected deployment successfully scaled up, and queued jobs processed. We are working to reduce the time to detect and diagnose this class of failures and to improve the performance of recovery mechanisms for this degraded network state. In addition, we have architectural changes underway that will enable other deployments to pick up work in similar situations, so that deployment-specific infrastructure issues like this one do not cause customer impact.

February 20, 2026 20:00 UTC - 20:41 UTC Resolved

Incident with Copilot GPT-5.1-Codex

On February 20, 2026, between 07:30 UTC and 11:21 UTC, the Copilot service experienced a degradation of the GPT-5.1 Codex model. During this time period, users encountered a 4.5% error rate when using this model. No other models were impacted. The issue was resolved by a mitigation put in place by the external model provider. GitHub is working with the external model provider to further improve the resiliency of the service to prevent similar incidents in the future.

February 20, 2026 10:02 UTC - 11:41 UTC Resolved