Management plane outage
Resolved
This incident has been resolved.
Update
A fix has been implemented and we are seeing system performance return to normal. Machine API and general platform operations are succeeding again, although users may see slightly elevated error rates as things finish stabilizing.
We are continuing to closely monitor the platform to ensure full recovery and stability.
Update
We are continuing to work on a fix for this issue.
Update
Services are starting to come up. The dashboard should be accessible, and deploys and other flyctl-based commands should work. Some services may feel sluggish while things warm up.
Update
The team is getting closer to a fix. We will provide another update within the next 30 minutes.
Update
We are continuing to make progress on a fix for this issue.
Update
We are continuing to work on restoring service to the Machines API and other affected platform components.
Update
We are continuing to work on deploying a fix. The fly.io dashboard and the Machines API continue to be unavailable at this time.
Fly Managed Postgres (MPG) clusters continue to run normally, but creating new clusters will fail at this time. Users may also see scheduled backups remain in a running or pending state. These backups will resume as scheduled once the platform-level issues are resolved.
Update
We have identified the cause of the outage and are working on a fix. The fly.io dashboard and the Machines API continue to be unavailable at this time.
Running Machines should stay up and remain reachable at this time. However, creating, starting, or stopping Machines, running new deployments, and other operations that rely on the Machines API remain unavailable.
Investigating
We are investigating a major outage of our control plane. Apps may continue to run, but it is not currently possible to log in to the dashboard or use the Machines API.