Autoscaling and data location

Which Sentry components must be externalized to single instances so that the remaining components can scale up and down as needed?

We have been running Sentry for years on AWS Elastic Beanstalk. Our PostgreSQL database is an external AWS RDS instance, Redis is an external AWS ElastiCache instance, and the rest of Sentry is Docker containers running on an EC2 instance that Elastic Beanstalk can scale up or down depending on the volume of requests.

We are working out the process to upgrade Sentry from version 9.1.2 to 21.2.0 and must contend with the new Sentry architecture, where data exists in more places. I saw a similar discussion in this forum about not losing historical data after a migration, but my question is which components of the [architecture](https://develop.sentry.dev/architecture/) can’t be duplicated during a scale-up.

With our event rate of a few hundred events per minute, we might be able to get by with just one of all the containers called out in the docker-compose.yml file, but should we have to scale up, which containers will cause problems if more than one of them is running?

We can probably live with small delays from cached data being in the wrong container when there are multiples, but we want to avoid data loss and need to avoid problems that affect the overall stability of Sentry.

Thanks for any guidance or insight you can give us.

I think the biggest hurdle would be scaling ClickHouse to multiple nodes. We are in the process of adding support for this, but I think @lynnagara can provide more context there.

Other than that, as long as everything is sharing the same core data stores, such as Redis and Kafka, this should be fine.

You may need/want to add more, and dedicated, workers to handle the load better: https://develop.sentry.dev/self-hosted/troubleshooting/#workers
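As a rough sketch of what a dedicated worker could look like in docker-compose.yml, assuming your file defines the `&sentry_defaults` anchor the way the stock self-hosted file does: the `worker-events` service name is made up for illustration, and the queue name is only an example, so check the linked page for the queues that apply to your version.

```yaml
# Fragment for the services: section of docker-compose.yml.
# Assumes the `&sentry_defaults` anchor is already defined earlier in the file.
services:
  # General-purpose worker that consumes from all queues.
  worker:
    <<: *sentry_defaults
    command: run worker

  # Hypothetical extra worker restricted to a single queue via -Q.
  # The queue name is an example only; verify it against your Sentry version.
  worker-events:
    <<: *sentry_defaults
    command: run worker -Q events.process_event
```

If you only need more of the same general-purpose worker, something like `docker-compose up -d --scale worker=3` may be enough, as long as every copy points at the same Redis and Kafka.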

Also take a look at the sentry-kubernetes/charts repository on GitHub ("Easily deploy Sentry on your Kubernetes Cluster"), which may inform some of your decisions.
