Incident History

Some dashboards in prod-us-central-3 unable to load

This incident has been resolved.

1769621225 - 1769631927 Resolved

Grafana OnCall and IRM Loading Issues

We have observed a sustained period of recovery. At this time, we are considering this issue resolved. No further updates.

1769546263 - 1769559750 Resolved

Grafana Cloud instances unavailable

This incident has been resolved.

1769509059 - 1769512484 Resolved

Increased write error rate for logs in prod-us-west-0

We experienced an increased write error rate for logs in prod-us-west-0 from 6:55 to 7:15 UTC. We have since observed continued stability and are marking this as resolved.

1769500179 - 1769500179 Resolved

Upgrade from Free → Pro failing for users

Engineering has released a fix and as of 00:13 UTC, customers should no longer experience issues upgrading from Free to Pro subscriptions. At this time, we are considering this issue resolved. No further updates.

1769460836 - 1769472834 Resolved

Investigating Issues with Email Delivery

This incident has been resolved.

1769182670 - 1769193885 Resolved

Synthetic monitoring secrets - proxy URL changes

The incident is resolved. We are in contact with customers affected by this change.

1769030167 - 1769120955 Resolved

Hosted Traces elevated write latency in prod-us-central-0 region

We consider this incident resolved, as latency has not been elevated since the fix was applied. The issue was caused by a latency spike in a downstream dependency, which increased backpressure on the Hosted Traces ingestion path, degraded gateway performance, and resulted in elevated write latency. After the affected gateway services were cleared, the degraded state went away and normal operation was restored.

1769001874 - 1769008903 Resolved

Incident: Metrics Querying Unavailable in EU (Resolved)

Impact: Between 14:30 and 14:38 UTC, some customers in prod-eu-west-2 may have experienced issues querying metrics. During this time, read requests to the metrics backend were unavailable, resulting in failed or incomplete query responses. The root cause of the issue was identified and addressed.

Resolution: The affected components were restored, and service was fully available by 14:38 UTC. We have taken additional steps to prevent this type of disruption from occurring in the future.

Next Steps: We are reviewing monitoring and safeguards around this failure mode to further improve reliability.

1768836652 - 1768836652 Resolved

Degraded Writes in AWS us-east-2

This incident has been resolved.

1768649293 - 1768785681 Resolved