Incident History

Tempo Ingester Restarting

From 18:46:16 to 18:46:26 UTC, we were alerted to an issue that caused Tempo ingesters in the US-East region to restart.

During this time, users may have noticed 404 or 500 errors in their agent logs, and a small number of Tempo traces may have been discarded while the ingesters were unavailable.

Our engineers identified the cause and implemented a fix to resolve the issue. Please contact our support team if you notice any discrepancies or have questions.

1727127322 - 1727127322 Resolved

Temporary disruption of service for MultiHTTP and Scripted Synthetic Monitoring checks in all regions.

A transient error in our infrastructure caused all public probes to report MultiHTTP and Scripted checks as failures for roughly 5 minutes, from 9:55 UTC to 10:00 UTC. The error has been addressed and all probes should now be operating normally.

1726654312 - 1726654312 Resolved

Synthetic monitoring DNS failures, Frankfurt probe location

From 13:49 to 13:54 UTC, a deployment to the Frankfurt probe location caused DNS resolution timeouts that affected DNS, HTTP, MultiHTTP, and Scripted checks, with failure rates of 20-50% during this time. After a rollback, completed by 13:54 UTC, synthetic checks returned to normal.

1725922477 - 1725922477 Resolved

Latency and connectivity issues, synthetic monitoring, Seoul probe location

From 21:30 to 22:45 UTC, the Seoul public probe location experienced connectivity problems. We observed failure rates of 10-15% across PING, DNS, HTTP, MultiHTTP, and k6 scripted checks due to connection timeouts. The issue has cleared, but we continue to monitor.

1725672253 - 1725672253 Resolved

Issues accessing SLOs, ops-labels, and log export configs

This incident has been resolved.

1724776842 - 1724783904 Resolved

Issue with on-call permissions

This incident has been resolved.

1724700329 - 1724708723 Resolved

Synthetic Monitoring: New York probe service degradation

This incident has been resolved.

1724663172 - 1724669069 Resolved

Some Grafana Cloud Instances Intermittently Unavailable

This incident has been resolved.

1724382113 - 1724404578 Resolved

Investigating Issues with Instances being unable to Start or Restart

This incident has been resolved.

1724326703 - 1724342862 Resolved

Degraded read performance for Hosted Logs prod-us-central-5 cluster.

The incident has been resolved and the affected cell is back to normal. We identified the root cause of the high latencies and took countermeasures to mitigate the performance problems.

1724233950 - 1724240392 Resolved