Prometheus writes, Logs, and Synthetic Monitoring in prod-eu-west-3 are degraded


Incident resolved in 27h43m41s

Resolved

This incident has been resolved.

1774443128

Investigating

This is also now impacting Logs and Synthetic Monitoring in prod-eu-west-3.

For Synthetic Monitoring, users might observe errors pushing check execution metrics, which can eventually lead to missing data. Users might also observe errors evaluating Synthetic Monitoring provisioned alert rules, which can lead to missed alerts.

For Logs, there is no immediate impact on alerts. However, remote writes to Mimir are delayed, which means users may see gaps in their recording rules.

1774424585

Investigating

We are moving this back to 'Investigating' as we are now observing a substantial drop in successful ingestion, an increase in write path errors, and elevated rule evaluation latency and errors. Reads are mostly fine. Our Engineering team is actively investigating, and we will provide further updates as the investigation progresses.

1774422276

Update

We have not observed any recent errors, but we will continue to monitor while we work with our CSP.

1774387391

Update

A fix has been implemented and we are monitoring the results.

1774343957

Investigating

We have been experiencing degraded writes for mimir-prod-22 in prod-eu-west-3 since 08:45Z.

1774343307