Git operations over SSH seeing increased latency on github.com
We are currently investigating this issue.
This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.
On October 21, 2025, between 13:30 and 17:30 UTC, GitHub Enterprise Cloud Organization SAML Single Sign-On experienced degraded performance. Customers may have been unable to successfully authenticate into their GitHub Organizations during this period. Organization SAML recorded a maximum of 0.4% of SSO requests failing during this timeframe.

This incident stemmed from a failure in a read replica database partition responsible for storing license usage information for GitHub Enterprise Cloud Organizations. Users from affected organizations, whose license usage information was stored on this partition, were unable to access SSO during this window, because a successful SSO requires an available license for the user accessing a GitHub Enterprise Cloud Organization backed by SSO.

The failing partition was taken out of service, which mitigated the issue. Remedial actions are underway to ensure that a read replica failure does not compromise overall service availability.
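For illustration only, a minimal sketch of the remedial direction described above: a license read that survives the loss of a single read replica by falling back to another source rather than failing the SSO attempt. The store objects and their query_license method are hypothetical, not GitHub's actual schema.

```python
# Illustrative only: a license lookup that degrades gracefully when a read
# replica fails, so a single replica outage does not block SSO.
# The store objects and `query_license` are hypothetical.

class ReplicaUnavailableError(Exception):
    """Raised when a read replica cannot serve the query."""


def has_available_license(user_id: str, org_id: str, replica, primary) -> bool:
    """Check license availability, preferring the replica but surviving its loss."""
    try:
        return replica.query_license(user_id, org_id)
    except ReplicaUnavailableError:
        # Serve the read from the primary (or another healthy replica)
        # instead of failing the SSO attempt outright.
        return primary.query_license(user_id, org_id)
```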
This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.
From October 20th at 14:10 UTC until 16:40 UTC, the Copilot service was degraded by an infrastructure issue that impacted the Grok Code Fast 1 model, leading to a spike in errors affecting 30% of users. No other models were impacted. The incident was caused by an outage at an upstream provider.
On October 20, 2025, between 08:05 UTC and 10:50 UTC, the Codespaces service was degraded, with users experiencing failures when creating new codespaces and resuming existing ones. The error rate for codespace creation averaged 39.5% and peaked at 71% of requests during the incident window; resume operations averaged a 23.4% error rate with a peak of 46%. This was due to a cascading failure triggered by an outage in a third-party dependency required to build devcontainer images.

The impact was mitigated when the third-party dependency recovered. We are investigating opportunities to remove this dependency from the critical path of our container build process, and we are working to improve our monitoring and alerting to reduce our time to detection for issues like this in the future.
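For illustration only, a minimal sketch of taking a third-party registry out of the critical path of an image build: pull with a bounded timeout and fall back to a locally cached image on failure. The image name, cache tarball, and use of the docker CLI are assumptions for the example, not the actual Codespaces build pipeline.

```python
# Illustrative only: pull a devcontainer base image with a bounded timeout and
# fall back to a locally cached copy, so an upstream registry outage does not
# block codespace creation. Paths and image names are hypothetical.

import subprocess


def pull_base_image(image: str, cached_tar: str, timeout_s: int = 30) -> str:
    """Return an image reference, preferring the registry but falling back to a cache."""
    try:
        subprocess.run(["docker", "pull", image], check=True, timeout=timeout_s)
        return image
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
        # Upstream registry unavailable: load the last known-good image from a
        # local tarball instead of failing the build.
        subprocess.run(["docker", "load", "-i", cached_tar], check=True)
        return image
```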
On October 17th, 2025, between 12:51 UTC and 14:01 UTC, mobile push notifications failed to be delivered for a total duration of 70 minutes. This affected github.com and GitHub Enterprise Cloud in all regions. The disruption was related to an erroneous configuration change to cloud resources used for mobile push notification delivery. We are reviewing our procedures and management of these cloud resources to prevent such an incident in the future.
On October 14th, 2025, between 18:26 UTC and 18:57 UTC, a subset of unauthenticated requests to the commit endpoint for certain repositories received 503 errors. During the event, the average error rate was 3%, peaking at 3.5% of total requests. This event was triggered by a recent configuration change combined with shifts in traffic patterns on the service. We were alerted to the issue immediately and adjusted the configuration to mitigate the problem. We are working on automatic mitigation and better traffic handling to prevent issues like this in the future.
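For illustration only, a minimal sketch of one form of automatic mitigation: a sliding-window error-rate check that sheds low-priority (for example, unauthenticated) traffic when a backend starts failing. The window size and threshold are invented for the example and are not GitHub's actual values.

```python
# Illustrative only: shed low-priority traffic when the recent error rate
# crosses a threshold, instead of letting 5xx responses reach users.

from collections import deque
import time


class ErrorRateBreaker:
    def __init__(self, window_s: float = 60.0, threshold: float = 0.03):
        self.window_s = window_s
        self.threshold = threshold
        self.events = deque()  # (timestamp, was_error) pairs

    def record(self, was_error: bool) -> None:
        """Record the outcome of a request and drop entries outside the window."""
        now = time.monotonic()
        self.events.append((now, was_error))
        while self.events and now - self.events[0][0] > self.window_s:
            self.events.popleft()

    def should_shed(self) -> bool:
        """Return True when the windowed error rate exceeds the threshold."""
        if not self.events:
            return False
        errors = sum(1 for _, was_error in self.events if was_error)
        return errors / len(self.events) > self.threshold
```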
On October 14th, 2025, between 13:34 UTC and 16:00 UTC, the Copilot service was degraded for the GPT-5 mini model. On average, 18% of requests to GPT-5 mini failed due to an issue with our upstream provider. We notified the upstream provider as soon as the problem was detected and mitigated the issue by failing over to other providers. The upstream provider has since resolved the issue. We are working to improve our failover logic to mitigate similar upstream failures more quickly in the future.
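For illustration only, a minimal sketch of the failover shape described above: try an ordered list of model providers and move to the next on failure. The provider callables are hypothetical; this is not the actual Copilot routing logic.

```python
# Illustrative only: ordered failover across upstream model providers.
# Each provider is modeled as a callable taking a prompt and returning text.

from typing import Callable, Sequence


class AllProvidersFailedError(Exception):
    pass


def complete_with_failover(prompt: str, providers: Sequence[Callable[[str], str]]) -> str:
    """Return the first successful completion, moving to the next provider on failure."""
    last_error = None
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # in practice, catch provider-specific errors
            last_error = exc
            continue
    raise AllProvidersFailedError("no upstream provider returned a completion") from last_error
```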