Incident History

k6 browser tests aborted by system

This incident has been resolved.

1731517136 - 1731525172 Resolved

Grafana Cloud Prometheus - Unhealthy Ingesters

We continue to observe a sustained period of recovery. At this time, we are considering this issue resolved. No further updates.

1731009251 - 1731009810 Resolved

Confluent Cloud Latency and High Error Rates

Confluent Cloud has resolved the issue with their Metrics API, which was causing gaps in our metric data. As a result, our service is now fully restored, and data flow is back to normal. Thank you for your patience.

1730744600 - 1730798959 Resolved

New and recently unpaused/unarchived Grafana Cloud instances unable to start.

The incident has been resolved. We applied an update to all Grafana Cloud instances, which inadvertently restarted instances regardless of whether they were active or not. This placed heavy load on our control plane and slowed startup times.

1730286710 - 1730302135 Resolved

Degraded read performance for Hosted Metrics prod-us-east-0 cluster.

At this time, we are considering this issue resolved since the read path has remained stable after the mitigation.

1730237338 - 1730241801 Resolved

Synthetic Monitoring Outage

On October 28, 2024, from 10:35 UTC to 11:25 UTC, our engineering team detected a major outage affecting Synthetic Monitoring in the GCP US Central region.

During this time, users may have noticed gaps in their synthetic logs and metrics.

We’re happy to confirm that the issue has been resolved as of 11:25 UTC. Thank you for your patience.

1730149895 - 1730149895 Resolved

Grafana Cloud data sources with PDC enabled and using a localhost address are getting 403 status responses.

The incident has been resolved.

1730119782 - 1730213269 Resolved

MS SQL Data source unavailable for production instances on Daily or Instant RRC

Engineering has released a fix, and as of 19:20 UTC customers should no longer experience any issues with the MS SQL plugin. At this time, we are considering this issue resolved. No further updates.

1729177280 - 1729192901 Resolved

Tempo Query Failures in AWS Clusters

The fix has been rolled out and the issue has been resolved.

1728731784 - 1728735242 Resolved

Degraded Reads in Loki

At approximately 11:30 UTC, some customers in the prod-us-east-0 region experienced Loki queries that were extremely slow or did not return at all. This lasted until approximately 15:30 UTC, when the fix was rolled out.

1728663348 - 1728663348 Resolved