Incident History

Creating MongoDB Managed Databases

After investigation, our Engineering team was able to determine that an internal-only issue was impacting MongoDB creation for DigitalOcean employees.

No customer impact occurred during the course of this incident.

We apologize for any confusion. If you are seeing any issues, please reach out to Support from within your account.

1729032888 - 1729034751 Resolved

Spaces Page in Control Panel

Our Engineering team has confirmed resolution of the issue impacting display of the Spaces page in the Cloud Control Panel.

From 17:35 UTC to 18:00 UTC, users were experiencing errors when accessing the Spaces page in the Cloud Control Panel.

We apologize for the inconvenience. If you have any questions or continue to experience issues, please reach out via a Support ticket on your account.

1729014966 - 1729017896 Resolved

Spaces and Container Registry in SYD1 region

From 23:00 UTC on October 10th until 13:28 UTC on October 11th, users may have encountered intermittent errors when accessing Spaces endpoints or using the Container Registry in the SYD1 region. Our Engineering team has successfully resolved the issue affecting the functionality of Spaces and the Container Registry.

All services have been fully restored and are now functioning normally. If you continue to experience any problems, please open a ticket with our support team. We apologize for any inconvenience caused.

1728652688 - 1728661850 Resolved

Creating or Regenerating Spaces Access Key

Our Engineering team has confirmed the resolution of the issue impacting Spaces Access Keys.

From 15:21 UTC to 17:52 UTC, users were experiencing errors when creating or regenerating Spaces Access Keys in the Cloud Control Panel.

We apologize for the inconvenience. If you have any questions or continue to experience issues, please reach out via a Support ticket on your account.

1727977924 - 1727982300 Resolved

Networking in NYC region

From 07:45 to 08:55 UTC, users may have experienced networking connectivity issues in our NYC region.

Our Engineering team has confirmed the full resolution of the issue. Users should be able to access all resources as normal.

If you continue to experience problems, please open a ticket with our support team. We apologize for any inconvenience.

1727512149 - 1727521741 Resolved

New Customer Sign Ups

As of 17:05 UTC, our Engineering team has resolved the issue affecting new account sign-ups. Users should no longer experience errors and are now able to complete the sign-up process successfully.

If you continue to experience problems, please open a ticket with our support team. We apologize for any inconvenience.

1727455284 - 1727461005 Resolved

Control Plane in SYD1

From 18:22 UTC to 19:13 UTC, users may have experienced issues or errors when attempting to create or modify DigitalOcean services deployed in the SYD1 region and also when attempting to create or manage Volumes globally.

Our Engineering team has confirmed the full resolution of this issue. If you continue to experience problems, please open a ticket with our support team. Thank you for being so patient, and we apologize for any inconvenience.

1727377698 - 1727383381 Resolved

Network Connectivity in SFO3 Region

Incident Summary

On September 25, 2024 at 22:25 UTC, DigitalOcean experienced a reduction of datacenter capacity in SFO3 that impacted the availability of select DigitalOcean services. Because a majority of the line cards on one of our core routers in SFO3 rebooted at the same time, inter-regional traffic was interrupted and traffic to the network backbone was dropped. This issue impacted users of any DigitalOcean services in the SFO3 region, with a longer impact on select Managed Kubernetes (DOKS) clusters.

Incident Details

Networking

Specific Impact on DOKS

Timeline of Events

Sep 25 22:21 - Large majority of line cards rebooted on the core router.

Sep 25 22:24 - Line cards came back online.

Sep 25 22:25 - Network protocols started session establishment process.

Sep 25 22:30 - Traffic on the affected core router was restored.

Sep 25 22:50 - SFO3 control plane systems all reconnected and recovered. 

Sep 25 23:07 - DOKS API servers degraded.

Sep 25 23:59 - Some DOKS clusters in the SFO3 region could not be scraped. Several nodes were discovered to be in a “not ready” state.

Sep 26 01:40 - All impacted DOKS nodes recycled and clusters are operational. 
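
As a rough illustration of the timeline above, the following Python sketch computes how many minutes after the initial line card reboot each recovery milestone occurred. The timestamps are copied from the timeline and assumed to be UTC; the year comes from the Incident Summary. The names `EVENTS`, `parse_utc`, and `minutes_since_start` are hypothetical helpers introduced only for this example and are not part of any DigitalOcean tooling.

```python
from datetime import datetime, timezone

# Selected entries copied from the timeline above (assumed UTC, year 2024).
EVENTS = [
    ("2024-09-25 22:21", "Line cards rebooted on the core router"),
    ("2024-09-25 22:30", "Traffic on the affected core router restored"),
    ("2024-09-25 22:50", "SFO3 control plane systems recovered"),
    ("2024-09-26 01:40", "All impacted DOKS nodes recycled"),
]


def parse_utc(stamp):
    """Parse a 'YYYY-MM-DD HH:MM' string as a timezone-aware UTC datetime."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)


def minutes_since_start(events):
    """Return (label, minutes elapsed since the first event) for each event."""
    start = parse_utc(events[0][0])
    return [
        (label, int((parse_utc(stamp) - start).total_seconds() // 60))
        for stamp, label in events
    ]


for label, minutes in minutes_since_start(EVENTS):
    print(f"+{minutes:>3} min  {label}")
```

Under these assumptions, traffic on the core router was restored 9 minutes after the reboot, while the last DOKS nodes were recycled 199 minutes (about 3 hours 19 minutes) after it.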

Remediation Actions

DigitalOcean teams are working on multiple types of remediation to help prevent a similar incident from happening in the future. 

DigitalOcean is working with the vendor support team for the devices to determine the root cause of the line card crash, and is also upgrading software on the core routers in the SFO3 region.

During the incident, engineers had to manually remediate affected nodes across the entire SFO3 DOKS fleet to restore service. Teams are exploring methods to reduce the need for manual action in the future, by increasing thresholds for automated remediation actions, such that service is restored as quickly as possible.

1727307294 - 1727318702 Resolved

Droplet Event Processing and API Availability

Our Engineering team has identified and resolved an issue that impacted the ability to resize Droplets via both the API and UI from 18:15 until 21:55 UTC. During this time, users might have experienced errors when attempting to resize their Droplets through the API or the UI.

Additionally, during efforts to resolve the resize issue, a secondary issue affected all event processing and some API calls for Droplets and related services from 21:50 until 22:00 UTC.

Swift action was taken by our Engineering team to restore full functionality, and now everything is operating normally.

We apologize for any inconvenience this may have caused. If you have any questions or continue to experience issues, please reach out via a Support ticket on your account.

1727305177 - 1727305177 Resolved

Resizing MongoDB Managed Databases

From 00:01 to 15:15 UTC, users may have experienced errors when attempting to resize their MongoDB Managed Database clusters.

Our Engineering team has confirmed the full resolution of the issue. Users should now be able to resize their MongoDB clusters normally.

If you continue to experience problems, please open a ticket with our support team from within your Cloud Control Panel.

1727277264 - 1727280611 Resolved