Incident History

Synthetic monitoring secrets - proxy URL changes

The incident is resolved. We are in contact with customers affected by this change.

1769030167 - 1769120955 Resolved

Hosted Traces elevated write latency in prod-us-central-0 region.

We consider this incident resolved, as write latency has not been elevated since the fix was applied. The issue was caused by a latency spike in a downstream dependency, which increased backpressure on the Hosted Traces ingestion path, degraded gateway performance, and resulted in elevated write latency. After the affected gateway services were cleared, the degraded state went away and normal operation was restored.

1769001874 - 1769008903 Resolved

Incident: Metrics Querying Unavailable in EU (Resolved)

Impact: Between 14:30 and 14:38 UTC, some customers in prod-eu-west-2 may have experienced issues querying metrics. During this time, read requests to the metrics backend were unavailable, resulting in failed or incomplete query responses. The root cause of the issue was identified and addressed.

Resolution: The affected components were restored, and service was fully available by 14:38 UTC. We have taken additional steps to prevent this type of disruption from occurring in the future.

Next Steps: We are reviewing monitoring and safeguards around this failure mode to further improve reliability.

1768836652 - 1768836652 Resolved

Degraded Writes in AWS us-east-2

This incident has been resolved.

1768649293 - 1768785681 Resolved

Degraded Writes in AWS us-east-2

This incident has been resolved.

1768636303 - 1768640694 Resolved

Partial Mimir Write Outage

This incident has been resolved.

Read and write requests returned 5xx errors and saw increased latency during two periods: 23:56:15 to 00:32:45 UTC and 00:55:30 to 01:36:15 UTC.

1768523288 - 1768536016 Resolved

Connectivity issues for Azure PrivateLink endpoints.

The scope of this incident was smaller than originally anticipated.

As of 16:27 UTC, our engineering team has merged a fix for those affected, and we consider this incident resolved.

1768401037 - 1768421871 Resolved

PDC Agent Connectivity Issues in prod-eu-west-3

We continue to observe a period of recovery. At this time, we consider this issue resolved. No further updates.

1768232684 - 1768242078 Resolved

Tempo write degradation in prod-eu-west-3 - tempo-prod-08

Engineering has released a fix, and we continue to observe a period of recovery. As of 15:12 UTC, we consider this incident resolved.

1768208623 - 1768231563 Resolved

Write Degradation in Grafana Cloud Logs (prod-us-east-3)

Between 20:23 UTC and 20:53 UTC, Grafana Cloud Logs in prod-us-east-3 experienced a write degradation, which may have resulted in delayed or failed log ingestion for some customers.

The issue has been fully resolved, and the cell is currently operating normally. We are continuing to investigate the root cause and will provide additional details if relevant.

1768000082 - 1768000082 Resolved