Sentry's RabbitMQ architecture?

Just curious how you have been (or were) running the RabbitMQ cluster.

I looked at the scale and code of Sentry, and I could not identify how Sentry achieved high-availability with RabbitMQ. Maybe you could make a blog or something describing the layout? I could not find any good resource online for a High Availability setup for RabbitMQ at scale, so I think it would be beneficial to the community as a proven design pattern.

No pressure in case this is a business secret. :slight_smile:

Nothing special; in production we just use federated queues.
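For anyone unfamiliar with queue federation: it is configured on the broker, not in the application. A minimal sketch (hostnames, credentials, and policy names here are placeholders, not Sentry's actual values) might look like:

```
# Hypothetical federation setup on host1, pulling from host2.
# Define an upstream broker:
rabbitmqctl set_parameter federation-upstream my-upstream \
  '{"uri":"amqp://user:password@host2/vhost"}'

# Apply a policy federating all queues to that upstream:
rabbitmqctl set_policy federate-queues ".*" \
  '{"federation-upstream-set":"all"}' --apply-to queues
```

This requires the `rabbitmq_federation` plugin to be enabled on the brokers.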

Celery supports initializing a fallback broker inside the BROKER_URL if you specify the broker URL as a list of strings.

Sentry’s internal monitoring breaks with a list, however, so we need to initialize it with a single broker URL string.

Thankfully, Celery also supports multiple brokers in a single string as long as they’re separated by semicolons.

We give the broker url as "amqp://user:password@host1/vhost;amqp://user:password@host2/vhost".
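In Celery config terms, the two forms described above would look roughly like this (hosts and credentials are placeholders):

```python
# List form: Celery fails over to the next broker if the first is unreachable.
BROKER_URL = [
    "amqp://user:password@host1/vhost",
    "amqp://user:password@host2/vhost",
]

# Single-string form: the same brokers joined with a semicolon,
# which keeps code that expects one string (like Sentry's monitoring)
# from blowing up at startup.
BROKER_URL = "amqp://user:password@host1/vhost;amqp://user:password@host2/vhost"
```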

This causes Celery to be happy and start up. It also causes Sentry’s “Monitoring” to think that we have:

scheme: amqp
user: user
password: password
host: host1
url: vhost;amqp://user:password@host2/vhost
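Sentry’s actual parsing code isn’t shown here, but Python’s standard `urlparse` produces the same misreading, since it has no idea the semicolon separates two URLs:

```python
from urllib.parse import urlparse

# urlparse treats the whole string as one URL: the host ends at the
# first "/", so everything after it (including the second broker)
# lands in .path.
u = urlparse("amqp://user:password@host1/vhost;amqp://user:password@host2/vhost")
print(u.scheme)    # amqp
print(u.username)  # user
print(u.hostname)  # host1
print(u.path)      # /vhost;amqp://user:password@host2/vhost
```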

This causes us pain whenever we open the administrative UI, but generally the application works well.

I’m curious to know how RabbitMQ was set up so that Sentry can continue to discover the nodes even when a broker crashes.

Do you have additional load-balancer / high-availability tooling that keeps the single broker URL pointing to an active broker at all times?

Yeah, we don’t use anything built into Celery for this. We do routing through haproxy or envoy.

Our application servers all have a local haproxy or envoy that routes their outbound connections. In this case, we just round-robin between the brokers from there.

Ultimately our broker url is something over
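The local-proxy setup described above could be sketched like this in haproxy (hostnames and ports are hypothetical, not Sentry's actual values):

```
# Local haproxy on each app server: the app connects to 127.0.0.1:5672
# and haproxy round-robins across the healthy brokers behind it.
listen rabbitmq
    bind 127.0.0.1:5672
    mode tcp
    balance roundrobin
    option tcp-check
    server broker1 rabbit1.internal:5672 check
    server broker2 rabbit2.internal:5672 check
```

With this in place the application only ever sees a single broker URL pointing at localhost, and failover is the proxy's problem.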


Ah gotcha… Yeah that seems about right. Thanks.