Where is the bottleneck in Sentry on-premise?

The Sentry project is really fun, and it has helped me grow a lot as a developer. I have succeeded in separating all the images of the on-premise version you provide and running each one as a separate instance. Next, I want to scale out the parts that could become bottlenecks. Is there any component you would recommend starting with? For example, in the previous version, 9.1.2, I think the workers were heavily loaded. Which containers carry the most load in version 20? After generating some traffic against version 20, I checked the containers with “docker logs”: the worker containers did not seem to be doing anything special, but relay, kafka, snuba, clickhouse, etc. were very busy. I’m curious about the potential bottlenecks in containers like kafka, ingest-consumer, and snuba. If I knew how to use k8s, I could handle Sentry much better, but I still have a lot to learn.

If you are interested in using k8s w/ Sentry, definitely check out https://github.com/sentry-kubernetes/

Regarding potential bottlenecks, in no particular order:

  • kafka
  • worker
  • clickhouse
  • consumers
  • snuba

These would be my guesses. As you start increasing the load, you’ll see which part fails first; once that is resolved, you’ll hit the next one.

Unfortunately, since these are quite dependent on usage patterns, there are no general rules.
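
One way to watch for the first component to saturate while you ramp up load is to sample per-container CPU and memory. Below is a minimal sketch, assuming the Docker CLI is on the PATH. `docker stats --no-stream --format` is the real CLI; the helper names `parse_stats`, `busiest`, and `snapshot` are made up for illustration. High CPU is only a rough proxy: a stage can also fall behind because of disk, network, or consumer lag.

```python
import subprocess

# Real docker CLI flags; the tab-separated layout is our own choice for easy parsing.
DOCKER_STATS_CMD = [
    "docker", "stats", "--no-stream",
    "--format", "{{.Name}}\t{{.CPUPerc}}\t{{.MemPerc}}",
]


def parse_stats(output):
    """Turn 'name<TAB>cpu%<TAB>mem%' lines into (name, cpu, mem) tuples."""
    rows = []
    for line in output.splitlines():
        parts = line.strip().split("\t")
        if len(parts) != 3:
            continue  # skip anything that is not a three-column row
        name, cpu, mem = parts
        try:
            rows.append((name, float(cpu.rstrip("%")), float(mem.rstrip("%"))))
        except ValueError:
            continue  # skip rows whose percentages are not numeric
    return rows


def busiest(rows, top=3):
    """Return the `top` containers with the highest CPU percentage."""
    return sorted(rows, key=lambda row: row[1], reverse=True)[:top]


def snapshot():
    """Take one sample from the local Docker daemon (requires docker)."""
    result = subprocess.run(
        DOCKER_STATS_CMD, capture_output=True, text=True, check=True
    )
    return parse_stats(result.stdout)
```

Calling `busiest(snapshot())` repeatedly during a load test shows which containers climb first; the one that pins its CPU or memory earliest is a reasonable first candidate for scaling out.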

This architecture diagram should also help a bit: https://develop.sentry.dev/architecture/

Hello @BYK, I have a question. Looking at the site below, I took the red line to be the flow of a Sentry event. From that I understood that Kafka is used instead of Redis and the Snuba consumer instead of the worker, but Redis and the worker both appear alongside Kafka and the Snuba consumers. Are both Redis and the workers still used in the 20.x.x versions, not just in 9.1.2? If so, what role do they play?

Yes they are. In the on-premise repo we use Redis as a message broker for Celery (workers). Workers are responsible for post-processing the events and attaching various extra context to them.
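
The broker-plus-workers pattern described above can be sketched with the standard library alone. This is not Sentry's code: `queue.Queue` stands in for Redis, plain threads stand in for Celery workers, and `post_process` and the `context` field are hypothetical names used only to illustrate "attach extra context after ingestion".

```python
import queue
import threading


def post_process(event):
    """Hypothetical post-processing step: attach extra context to an event."""
    enriched = dict(event)
    enriched["context"] = {"processed_by": "worker"}
    return enriched


def run(events, n_workers=2):
    """Push events through an in-memory 'broker' and return the processed ones."""
    broker = queue.Queue()  # stand-in for the Redis message broker
    processed = []

    def worker():
        while True:
            event = broker.get()
            if event is None:  # sentinel: no more work for this worker
                break
            processed.append(post_process(event))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for event in events:
        broker.put(event)
    for _ in threads:
        broker.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return processed
```

In a setup like this, a queue that keeps growing while the workers are busy is the sign that the worker pool, rather than the broker, is the stage to scale.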

Thank you for your answer. However, I have a few more questions. Looking at the sites below, the worker seems to be used only for the web dashboard. Given what you said, where in the event pipeline shown in the second link does the worker step play a role?
The first link says that the path to the worker is a legacy path. Is that wrong?

The questions below are also relevant.

That is correct. Before Relay, we were pushing the events to the workers for processing. We still use the workers for post-processing and other steps AFAIK though.