Incident History

Tempo Query Failures in AWS Clusters

The fix has been rolled out and issue has been reoslved.

1728731784 - 1728735242 Resolved

Degraded Reads in Loki

At approximately ~11:30 UTC some customers in the prod-us-east-0 region experienced either extremely slow Loki queries, or Loki queries not returning at all. This lasted until ~ 15:30 UTC when the fix was rolled out.

1728663348 - 1728663348 Resolved

Grafana OnCall degraded perfomance

This incident has been resolved.

1728652530 - 1728656364 Resolved

k6: Operational issue

This incident has been resolved.

1728331384 - 1728333357 Resolved

Slow access to OnCall UI and delayed notifications for prod-us-central-0 cluster

Engineering has released a fix and as of Thu Oct 3rd, around 23:00 UTC, customers should no longer experience delays while accessing OnCall UI. At this time, we are considering this issue resolved. No further updates.

1727904170 - 1728037151 Resolved

Synthetic monitoring failures, London probe location

This incident has been resolved.

1727839442 - 1727880299 Resolved

Grafana Cloud Partial Outage

No further instances of this issue have been detected following our mitigation. Customers should no longer experience any downtime. At this time, we are considering this issue resolved. No further updates.

1727811941 - 1727827424 Resolved

Synthetic Monitoring: London and Mumbai probes temporary disruption of service

A transient stability issue in our infrastructure caused public probes to report MultiHTTP and Scripted checks as failures in:

The error has been addressed and all probes should now be operating normally.

1727781277 - 1727781277 Resolved

Grafana Cloud Partial Outage

Between 0016Z and 0120Z and 0433Z and 0530Z a cloud networking component facilitating cross-region communication to and from this region experienced an outage. Users experienced errors modifying access policies in addition to elevated error rates for those who had recently modified an access policy. This also resulted in false positive synthetic monitoring alerts for probes in this region.

We're continuing to work with our cloud service provider to determine a root cause for the outage of this component.

1727414032 - 1727428068 Resolved

Grafana Cloud Partial Outage

Engineering deployed something which broke a component managed by one our cloud service providers. We rolled that back and the component now works. Customers should no longer experience nodes not fully booting. The cloud service provider is still trying to figure out why it broke. At this time, we are considering this issue resolved. No further updates.

1727189109 - 1727193872 Resolved
⮜ Previous Next ⮞