Disruption with some GitHub services


Incident resolved in 3h59m7s

Resolved

Between 15:20 and 20:18 UTC on Thursday, April 2, Copilot Cloud Agent entered a period of reduced performance. An internal feature being developed for Copilot Code Review (CCR) caused the Copilot Cloud Agent infrastructure to receive an increased number of jobs. This load eventually caused us to hit an internal rate limit, which suspended all work for an hour. During this hour, some new jobs timed out, while others resumed once rate limiting ended. Roughly 40% of jobs in this period were affected. Once the cause of the rate limiting was identified, we disabled the new CCR feature via a feature flag. After the jobs already in the queue cleared, we saw no further instances of rate limiting.

1775166523
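
The resolved summary above says that while work was suspended, some new jobs timed out and others resumed once rate limiting ended. The following is a minimal illustrative sketch of that behavior, not a description of the actual Copilot Cloud Agent implementation. The `Job` class, the `outcome` function, the minute-based clock, and the 30-minute timeout are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    submitted_at: int  # minutes since the start of the simulation (hypothetical clock)
    timeout: int       # minutes the job waits before giving up (hypothetical value)


def outcome(job: Job, suspend_start: int, suspend_end: int) -> str:
    """Classify a job given a window during which all work is suspended."""
    # Jobs submitted outside the suspension window are not delayed.
    if not (suspend_start <= job.submitted_at < suspend_end):
        return "ran immediately"
    # Jobs submitted inside the window wait until the suspension ends.
    wait = suspend_end - job.submitted_at
    return "resumed after suspension" if wait <= job.timeout else "timed out"


# Assumed one-hour suspension from minute 5 to minute 65.
jobs = [Job("a", 0, 30), Job("b", 10, 30), Job("c", 45, 30), Job("d", 70, 30)]
results = {job.name: outcome(job, 5, 65) for job in jobs}

for name, result in results.items():
    print(name, result)
```

In this sketch, a job submitted early in the suspension window waits longer than its timeout and fails, while a job submitted near the end of the window waits only briefly and resumes.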

Update

The degradation has been mitigated. We are monitoring to ensure stability.

1775166522

Investigating

Although we are observing recovery once again, we expect continued periods of degradation. Work that is queued during times of degradation does eventually get processed. We are continuing to investigate and work toward a mitigation, and will update again within 2 hours.

1775162153

Investigating

This issue has recurred. Customers will once again experience false job starts when assigning tasks to Copilot Cloud Agent. We are still investigating and trying to understand the pattern of degradation.

1775158123

Investigating

We are once again seeing recovery with Copilot Cloud Agent job starts. We are keeping this open while we verify this won't recur.

1775154350

Investigating

When assigning tasks to Copilot Cloud Agent, the task will appear to be working, but may not actually be running. We are investigating.

1775152763

Investigating

We are investigating reports of impacted performance for some GitHub services.

1775152176
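
Each update above is followed by a numeric stamp. Assuming these are Unix epoch seconds (the report does not say so explicitly), a minimal Python sketch can convert them to UTC and check the stated incident duration. The variable and function names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Stamps copied from the updates above, oldest first.
timestamps = [
    1775152176,
    1775152763,
    1775154350,
    1775158123,
    1775162153,
    1775166522,
    1775166523,
]


def to_utc(ts: int) -> datetime:
    """Interpret ts as Unix epoch seconds and return an aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


opened = to_utc(min(timestamps))
resolved = to_utc(max(timestamps))
duration = resolved - opened

for ts in timestamps:
    print(ts, to_utc(ts).strftime("%Y-%m-%d %H:%M:%S UTC"))
print("duration:", duration)
```

Under that assumption, the gap between the first and last stamps is 3 hours, 59 minutes, and 7 seconds, which matches the "resolved in" figure at the top of the report.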