Degraded API Performance


Incident resolved in 12h4m45s

Resolved

This incident has been resolved.

1732608900

Update

We've scaled up our systems and applied fixes to our API. Everything should be operational now.

1732607532

Update

We are scaling up our systems to handle the increased traffic.

1732599815

Update

All hosts have completed the restoration process and we are seeing our overall Corrosion cluster health and performance return to normal.

Machines API and GraphQL API error rates are improving, but some users may still see elevated rates of request timeouts and/or 504 errors when using the Machines API or flyctl commands. We are continuing to monitor these services as they recover.
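For clients hitting these transient timeouts and 504 errors, a common mitigation is to retry with exponential backoff and jitter. The following is a minimal sketch, not an official client: `call_with_backoff`, `RETRYABLE_STATUSES`, and the `request` callable are hypothetical names introduced for illustration, and the choice of which status codes to retry is an assumption.

```python
import random
import time

# Assumption: gateway-style errors are transient and safe to retry.
RETRYABLE_STATUSES = {502, 503, 504}


def call_with_backoff(request, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call request() until it returns a non-retryable status or attempts run out.

    `request` is a hypothetical zero-argument callable that returns a
    (status, body) tuple and may raise TimeoutError.
    """
    last = None
    for attempt in range(max_attempts):
        try:
            status, body = request()
            if status not in RETRYABLE_STATUSES:
                return status, body
            last = (status, body)
        except TimeoutError:
            last = None
        if attempt < max_attempts - 1:
            # Exponential backoff capped at max_delay, with full jitter so
            # many clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
    if last is None:
        raise TimeoutError("final attempt timed out")
    return last
```

This is only appropriate for idempotent requests such as reads. Retrying a create or update call that timed out may repeat an operation that already succeeded.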

1732592533

Update

The restore process has completed on the majority of hosts in our fleet and we are seeing overall Corrosion cluster health and performance return to normal.

A small number of hosts are still being worked on; we aim to have them restored shortly.

1732588268

Update

We are running a restoration and reseed process to bring the Corrosion cluster back to a healthy, current state. During this restoration process, you may see elevated error rates on machines or apps that have been recently updated.

1732586765

Update

The updates have been applied; however, we are still not seeing recovery on all Corrosion nodes. We are continuing to work on a fix.

Machines API and proxy performance remain degraded, especially for newly created and updated machines.

1732579089

Update

The Machines API issues stem from a propagation delay in our global state store, Corrosion.

We have completed deploying a configuration change to our Corrosion cluster and will be applying these changes to each node shortly. We expect improvement once the changes are applied.

In the meantime, users may still see degraded Machines API and proxy performance, especially with newly created machines.

1732572949

Update

The issue has been identified and a fix is being implemented.

1732566037

Investigating

We are investigating degraded API performance.

1732565415