Stoat Status Machine

Email provider under heavy load

  • Identified
    Our domain has been successfully removed from the Spamhaus DBL blocklist. Email deliverability has been restored, and messages should no longer be rejected.
    
    However, due to a backlog of queued messages, our email service provider is still processing a higher-than-normal volume of mail. As a result, some emails may experience delays of approximately 20 minutes.
    
    We will continue to monitor the situation and provide further updates as needed.
  • Identified
    We are aware of mail deliverability issues. We are currently listed on the Spamhaus DBL and are actively working to resolve the issue.
    
    Some emails may also be delayed, likely due to the sheer volume of requests going to our service provider. We have mitigated this by increasing the verification email expiry.

There are simply too many people, but we're trying our best. 🙂

  • Identified
    Just saw a burst of users coming online, attempting to re-scale.
    Update: Scaled, monitoring performance, we may have to go
    Update: Trying to push throughput even further
    Update: We're at an architectural limit, going to temporarily disable some events (incl. typing indicators and user updates) to help ease congestion while an actual fix is being put together
  • Identified
    I suspect we are hitting limits with our message pubsub; a solution is being put together.
    Update: Scaled vertically for now.
  • Monitoring
    Production services are now scaled up.
    There is a possibility of hitting further bottlenecks, but we should be okay for the moment.
    Will continue to monitor and improve the deployment pattern.
  • Identified
    Ordered more servers, waiting for fulfillment. Service is generally stable right now, but more load is expected during peak hours either today or tomorrow.
  • Identified
    Single-node cluster deployment was successful, now scaling it up.
  • Identified
    Deployment is taking a little longer than expected, but I would expect this to take less than an hour to resolve.
  • Identified
    We are currently working on scaling up our services.

Unexpected status code 500

  • Resolved
    Issue appears to be resolved. We are continuing to monitor for further issues.
  • Monitoring
    We are monitoring the issue.
    
    Track external service provider incident here: https://www.cloudflarestatus.com/incidents/8gmgl950y3h7
  • Identified
    The issue appears to be caused by an external networking services provider.
  • Investigating
    We are aware of occasional connectivity issues and are pinpointing the cause.

Server maintenance affecting voice chat, media proxy, Discover

  • Resolved
    Maintenance is complete, all services are back online.
  • Identified
    Services will be temporarily unavailable during this maintenance period.
    Expect resolution within the hour.

Email verification delays

  • Resolved
    The external provider has fixed the problem. We will continue to monitor the situation, but emails should now be sent in a timely manner.
  • Identified
    The issue appears to stem from an external service provider. We are continuously monitoring the situation.
  • Investigating
    Some users are experiencing delays in receiving verification emails, for example when creating an account.