Tempo Write Path Failing
This incident has been resolved.
Because of the Kubernetes bug reported in https://github.com/kubernetes/kubernetes/issues/127370, Kubernetes service endpoints were not updated when pods stopped or started if more than 1,000 pods matched the service. This caused a temporary outage in Mimir gossiping services, which in turn caused failures to ingest and query metrics for a short time. This issue has been resolved.
We continue to observe a period of recovery. At this time, we are considering this issue resolved. No further updates.
Rollback has been completed as of 17:20 UTC. At this time, we are considering this issue resolved. No further updates.