I want to know about Kafka Streams in Sentry

Hello. I am currently analyzing Sentry in preparation for an upgrade to Sentry 20. In the process, the following questions arose.

- What is the Kafka ingest stream?
- What is the Kafka event stream, and how does it differ from the ingest stream?

I searched the Sentry repository to check this but couldn’t find an answer. It looks like you’re using the Kafka Streams API. Can you explain its specific role?

This may give you some more ideas: https://develop.sentry.dev/architecture/

I don’t think we are using the Streams API, but I also don’t see how this is relevant for you. Our recommendation is to treat the on-premise repo as a black box (that is, inspect the internals but don’t change them), as it is subject to change as the product evolves.

The ingest stream is how Sentry “ingests” incoming events. I am not sure about the event stream. Maybe @jauer can provide a better explanation.
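To make “ingest stream” more concrete: at the Kafka level it is an ordinary topic that incoming events are published to, not a use of the Kafka Streams API. Below is a minimal sketch of peeking at such a topic with confluent-kafka; the topic name `ingest-events`, the broker address, and the group id are assumptions about a typical self-hosted setup, not something confirmed in this thread.

```python
# Hypothetical sketch: peek at messages on what we assume is the ingest topic.
# Topic name, broker address, and group id are assumptions; adjust to your deployment.
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # assumed broker address
    "group.id": "debug-peek",                # throwaway group used only for inspection
    "auto.offset.reset": "latest",
    "enable.auto.commit": False,             # do not move any real consumer offsets
})
consumer.subscribe(["ingest-events"])        # assumed ingest topic name

try:
    for _ in range(10):                      # read a handful of messages and stop
        msg = consumer.poll(timeout=5.0)
        if msg is None:
            continue
        if msg.error():
            print("error:", msg.error())
            continue
        print(msg.topic(), msg.partition(), msg.offset(), len(msg.value()), "bytes")
finally:
    consumer.close()
```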


@BYK Thank you for the answer. I’d like to understand Sentry’s event flow so that I know exactly where to improve when TPS increases, which is why I was curious about the overall working principle. Even after reading the linked page, I still had some questions, so I asked here.

Ah, with this context it is easier to answer your questions. You can rest assured that Relay and Nginx won’t be the bottlenecks you hit first. Usually you first need to scale the consumers and possibly Redis (RAM and storage; you’re not likely to need many instances). If the consumers cannot catch up, then Kafka will start causing issues with disk, memory, invalid offsets, and expired events (timeouts).
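One practical way to see whether the consumers are keeping up is to watch consumer lag per partition. Here is a rough sketch using confluent-kafka; the topic name `events`, the group id, and the broker address are assumptions and should be replaced with the values from your own deployment.

```python
# Hypothetical sketch: report consumer lag (log end offset minus committed offset).
# Topic, group id, and broker address are assumptions about a typical setup.
from confluent_kafka import Consumer, TopicPartition

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # assumed broker address
    "group.id": "snuba-consumers",           # assumed consumer group name
    "enable.auto.commit": False,
})

topic = "events"                             # assumed topic name
metadata = consumer.list_topics(topic, timeout=10.0)
partitions = [TopicPartition(topic, p) for p in metadata.topics[topic].partitions]

committed = consumer.committed(partitions, timeout=10.0)
for tp in committed:
    low, high = consumer.get_watermark_offsets(tp, timeout=10.0)
    # If nothing has been committed yet, report the full backlog since the low watermark.
    lag = high - tp.offset if tp.offset >= 0 else high - low
    print(f"partition {tp.partition}: committed={tp.offset} end={high} lag={lag}")

consumer.close()
```

If lag keeps growing, that is the signal to add consumers (or partitions) before Kafka retention starts dropping unprocessed events.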

Around that time, you are also likely to have some issues with Clickhouse (memory and storage). You may want to learn more about multi-node Clickhouse and Snuba setups, but currently we do not have docs about this that we can share.
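To keep an eye on the storage side, a quick check of per-table disk usage in Clickhouse can show where growth is concentrated. This is a sketch using the clickhouse-driver package; the host and credentials are assumptions, so point it at your own Clickhouse instance.

```python
# Hypothetical sketch: list active-part disk usage per table in ClickHouse.
# Host and credentials are assumptions; adjust for your deployment.
from clickhouse_driver import Client

client = Client(host="localhost")            # assumed ClickHouse host

rows = client.execute(
    """
    SELECT table,
           formatReadableSize(sum(bytes_on_disk)) AS size_on_disk,
           sum(rows) AS total_rows
    FROM system.parts
    WHERE active
    GROUP BY table
    ORDER BY sum(bytes_on_disk) DESC
    """
)
for table, size_on_disk, total_rows in rows:
    print(f"{table}: {size_on_disk} ({total_rows} rows)")
```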

