Delays in changes to organization membership


Incident resolved in 5h17m33s

Resolved

On June 28th, 2024, at 16:06 UTC, a backend update by GitHub triggered a significant number of long-running Organization membership update jobs in our job processing system. The job queue depth rose as these update jobs consumed most of our job worker capacity. This resulted in delays for other jobs across services such as Pull Requests and PR-related Actions workflows. We mitigated the impact to Pull Requests and Actions at 19:32 UTC by pausing all Organization membership update jobs. We deployed a code change at 22:30 UTC to skip over the jobs queued by the backend change and re-enabled Organization membership update jobs. We restored the Organization membership update functionality at 22:52 UTC, including all membership changes queued during the incident.

During the incident, about 15% of Actions workflow runs experienced a delay of more than five minutes. In addition, Pull Requests had delays in determining merge eligibility and starting associated Actions workflows for the duration of the incident. Organization membership updates saw delays for upwards of five hours.

To prevent a similar event in the future from impacting our users, we are working to: improve our job management system to better manage our job worker capacity; add more precise monitoring for job delays; and strengthen our testing practices.
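The failure mode described above, where one class of long-running jobs occupies most of a shared worker pool, can be illustrated with a minimal sketch. This is not GitHub's job system: the function `pick_jobs`, the job class names, and the per-class cap are all hypothetical, chosen only to show how a concurrency cap per job class leaves room for other work.

```python
from collections import Counter

def pick_jobs(queue, free_workers, caps):
    """Choose which queued jobs to start, in FIFO order.

    queue        -- list of job class names, oldest first (hypothetical)
    free_workers -- number of workers available right now
    caps         -- optional max concurrent jobs per class (hypothetical)
    """
    started = []
    running = Counter()
    for job_class in queue:
        if len(started) == free_workers:
            break  # every worker is busy
        # A class without an explicit cap may use the whole pool.
        cap = caps.get(job_class, free_workers)
        if running[job_class] < cap:
            running[job_class] += 1
            started.append(job_class)
        # Otherwise the job stays queued and we look at the next one.
    return started

# Eight long-running membership jobs arrive ahead of two PR jobs.
queue = ["org_membership"] * 8 + ["pull_request"] * 2

# Without caps, membership jobs take all four workers.
uncapped = pick_jobs(queue, free_workers=4, caps={})

# With a cap of two, the PR jobs still get workers.
capped = pick_jobs(queue, free_workers=4, caps={"org_membership": 2})

print(uncapped)
print(capped)
```

In the uncapped case the pull request jobs wait behind the membership jobs, which mirrors the delays reported here. A per-class cap is only one possible approach to managing worker capacity; the report does not say which approach is being taken.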

1719615100

Investigating

We are continuing to work to mitigate delays in organization membership changes.

1719613086

Investigating

We are still actively working to mitigate delays in organization membership changes.

1719611137

Investigating

We are actively working to mitigate delays in organization membership changes. Actions and Pull Requests are both functioning normally now.

1719607562

Investigating

Actions is operating normally.

1719604821

Investigating

Pull Requests is operating normally.

1719604799

Investigating

We are continuing to apply mitigations and are seeing improvement in creating pull request merge commits and Actions runs for pull request events. Applying changes to organization members remains delayed.

1719604303

Investigating

We are continuing to work on mitigating delays creating pull request merge commits, Actions runs for pull request events, and changes to organization members.

1719601384

Investigating

Actions runs triggered by pull requests are experiencing start delays. We have engaged the appropriate teams and are investigating the issue.

1719597559

Investigating

Pull Requests is experiencing degraded performance. We are continuing to investigate.

1719597507

Investigating

We are investigating reports of degraded performance for Actions.

1719596047
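
The numeric lines in this log read as Unix epoch timestamps in seconds. That is an assumption, but it is consistent with the times given in the summary. A short sketch, with hypothetical helper names, that converts the first and last timestamps to UTC and checks the stated incident duration:

```python
from datetime import datetime, timezone

# First "Investigating" update and the "Resolved" update, copied from the log.
first_update = 1719596047
resolved = 1719615100

def to_utc(ts):
    """Convert epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)

def format_duration(seconds):
    """Render a number of seconds as e.g. '5h17m33s'."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}h{minutes}m{secs}s"

first_text = to_utc(first_update).strftime("%Y-%m-%d %H:%M:%S UTC")
resolved_text = to_utc(resolved).strftime("%Y-%m-%d %H:%M:%S UTC")
duration_text = format_duration(resolved - first_update)

print(first_text)     # 2024-06-28 17:34:07 UTC
print(resolved_text)  # 2024-06-28 22:51:40 UTC
print(duration_text)  # 5h17m33s
```

The resolved time rounds to the 22:52 UTC restoration stated in the summary, and the difference between the two timestamps matches the 5h17m33s shown at the top. The duration is therefore measured from the first status update, not from the 16:06 UTC backend update.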