On 7/3/2025, between 5:21:17 AM and 7:11:49 AM UTC, customers were unable to SSO-authorize Personal Access Tokens and SSH keys via the GitHub UI. Approximately 1,300 users were impacted. A code change modified the content type of the response returned by the server, causing a lazily loaded dropdown to fail to render and preventing users from proceeding with authorization. No authorization systems were impacted during the incident, only the UI component. We mitigated the incident by reverting the code change that introduced the problem. We are making improvements to our release process and test coverage to catch this class of error earlier in our deployment pipeline. We are also improving monitoring to reduce our time to detection and mitigation of issues like this one in the future.
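To illustrate the failure mode in general terms (a hypothetical sketch, not GitHub's actual implementation; the function name and URL parameter are invented), a lazily loaded UI fragment can stop rendering when the server's response content type changes:

```typescript
// Hypothetical sketch: a lazily loaded dropdown fetches an HTML fragment
// and only renders it when the response has the expected content type.
async function loadAuthorizeDropdown(fragmentUrl: string): Promise<string> {
  const res = await fetch(fragmentUrl, { headers: { Accept: "text/html" } });
  const contentType = res.headers.get("Content-Type") ?? "";

  // If a server-side change switches the response to, say, application/json,
  // the fragment is never rendered and the user cannot reach the authorize step.
  if (!res.ok || !contentType.includes("text/html")) {
    throw new Error(`Dropdown failed to load: ${res.status} ${contentType}`);
  }
  return res.text();
}
```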
On July 2, 2025, between 01:35 UTC and 16:23 UTC, the GitHub Enterprise Importer (GEI) migration service experienced degraded performance and slower-than-normal migration queue processing times. The incident was triggered by a migration that included an abnormally large number of repositories, which overwhelmed the queue and slowed processing for all migrations. We mitigated the incident by removing the problematic migrations from the queue, and service returned to normal operation as the queue volume was reduced. To ensure system stability, we have introduced additional concurrency controls that limit the number of queued repositories per organization migration, helping to prevent similar incidents in the future.
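As a rough illustration of the kind of concurrency control described above (a hypothetical sketch under assumed names, not GEI's actual code), a queue can cap how many repositories a single organization migration may have queued at once:

```typescript
// Hypothetical per-organization cap on queued repository migrations.
class MigrationQueue {
  private queuedPerOrg = new Map<string, number>();

  constructor(private readonly maxQueuedPerOrg: number) {}

  // Accepts repositories up to the per-org cap; the rest wait until
  // earlier migrations for that organization complete.
  enqueue(org: string, repos: string[]): string[] {
    const accepted: string[] = [];
    for (const repo of repos) {
      const queued = this.queuedPerOrg.get(org) ?? 0;
      if (queued >= this.maxQueuedPerOrg) break;
      this.queuedPerOrg.set(org, queued + 1);
      accepted.push(repo);
    }
    return accepted;
  }

  // Called when a repository migration finishes, freeing capacity.
  complete(org: string): void {
    const queued = this.queuedPerOrg.get(org) ?? 0;
    this.queuedPerOrg.set(org, Math.max(0, queued - 1));
  }
}
```

With a cap like this, one abnormally large migration can no longer monopolize the queue for every other customer.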
This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.
Due to a degradation of one instance of our internal message delivery service, a percentage of jobs started between 06/30/2025 19:18 UTC and 06/30/2025 19:50 UTC failed and are no longer retryable. Runners assigned to these jobs will automatically recover within 24 hours, but deleting and recreating a runner will free it up immediately.
This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.
On June 26, 2025, between 17:10 UTC and 23:30 UTC, around 40% of attempts to create a repository from a template repository failed. The failures were an unexpected result of a gap in testing and observability. We mitigated the incident by rolling back the deployment. We are working to improve our testing and automatic detection of errors associated with failed template repository creation.
On June 26th, between 14:42 UTC and 18:05 UTC, the GitHub Enterprise Importer (GEI) service was in a degraded state, during which customers experienced extended repository migration durations. Our investigation found that the combined effect of several database updates resulted in severe throttling of GEI to preserve overall database health. We have taken steps to prevent additional impact and are working to implement additional safeguards to prevent similar incidents in the future.
This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.
Between June 19th, 2025 11:35 UTC and June 20th, 2025 11:20 UTC, new users were unable to log in to the GitHub Mobile Android application; the iOS app was unaffected. The cause was a new GitHub App feature being tested internally that was inadvertently enforced for all GitHub-owned applications, including GitHub Mobile. The resulting mismatch between client and server expectations caused logins to fail. We mitigated the incident by disabling the feature flag controlling the feature. We are working to improve our time to detection and to put in place stronger guardrails that reduce the impact of internal testing on applications used by all customers.
On June 18, 2025, between 22:20 UTC and 23:00 UTC, the Claude Sonnet 3.7 and Claude Sonnet 4 models for GitHub Copilot Chat experienced degraded performance. During the impact, some users received an immediate error when making a request to a Claude model. This was due to upstream errors with one of our model providers, which have since been resolved. We mitigated the impact by disabling the affected provider endpoints and redirecting Claude Sonnet requests to additional partners. We are working to update our incident response playbooks for infrastructure provider outages and to improve our monitoring and alerting systems to reduce our time to detection and mitigation of issues like this one in the future.
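The mitigation described above is a form of provider failover; the sketch below (hypothetical interfaces and names, not GitHub's actual routing code) shows the general shape of redirecting requests away from a disabled or failing provider:

```typescript
// Hypothetical failover across model providers: try each enabled
// endpoint in order and skip any that operators have disabled.
interface Provider {
  name: string;
  enabled: boolean;
  complete(prompt: string): Promise<string>;
}

async function completeWithFailover(
  providers: Provider[],
  prompt: string,
): Promise<string> {
  let lastError: unknown;
  for (const provider of providers.filter((p) => p.enabled)) {
    try {
      return await provider.complete(prompt);
    } catch (err) {
      lastError = err; // fall through to the next partner endpoint
    }
  }
  throw new Error(`All providers failed: ${String(lastError)}`);
}
```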