Incident History

We are investigating reports of degraded performance.

Between May 19th 3:40AM UTC and May 20th 5:40PM UTC, the service responsible for rendering Jupyter notebooks was degraded. During this time, customers were unable to render Jupyter notebooks. This occurred due to an issue with a Redis dependency, which was mitigated by restarting. An issue with our monitoring led to a delay in our response. We are working to improve the quality and accuracy of our monitors to reduce the time to detection.

1716223630 - 1716224748 Resolved

Incident with Actions

On May 16, 2024, between 4:10 UTC and 5:02 UTC, customers experienced delays in background jobs, primarily UI updates for Actions. This was caused by degradation in our background job service, which affected 22.4% of total jobs. Across all affected services, the average job delay was 2m 22s. Actions jobs themselves were unaffected; the issue affected only the timeliness of UI updates, which saw an average delay of 11m 40s and a maximum of 20m 14s. The incident was due to a performance problem on a single processing node, where Actions UI updates were being processed. Additionally, a misconfigured monitor did not alert immediately, resulting in a 25m delay in detection and a total increase of 37m in time to mitigate. We mitigated the incident by removing the problem node from the cluster, after which service was restored. No data was lost, and all jobs executed successfully. To reduce our time to detection and mitigation of issues like this one in the future, we have repaired the misconfigured monitor and added further monitoring to this service.

1715834622 - 1715836535 Resolved