I recently upgraded from Sentry 9 to Sentry 20 and am now running 20.12.1. It is a completely fresh install, as I only had a handful of projects in Sentry 9 and starting over seemed easier than migrating.
It is deployed on Kubernetes using the official chart, with Redis as the broker instead of RabbitMQ, as I did not want to needlessly install a whole bunch of extra components.
At first everything was running fine, even though the interface was incredibly slow (and still is), but as time passes Sentry appears to get slower and slower for no apparent reason.
When it was freshly installed, I generated some test events and they were instantly available. Right now, when I generate a test event, it takes well over 2 hours before it shows up.
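(For reference, by "test event" I mean nothing more elaborate than something along the lines of the sketch below, sent from the Python SDK; the DSN is a placeholder, and the timestamp tag just makes the delay easy to read off on the event in the UI.)

```python
# Minimal latency probe (sketch): send one message with a client-side
# timestamp tag so the ingestion delay is visible on the event in the UI.
import datetime
import sentry_sdk

sentry_sdk.init(dsn="https://<key>@sentry.example.com/<project-id>")  # placeholder DSN

sentry_sdk.set_tag("sent_at", datetime.datetime.utcnow().isoformat() + "Z")
sentry_sdk.capture_message("ingestion latency test")
sentry_sdk.flush(timeout=5)  # make sure the event leaves the process before exiting
```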
Furthermore, I’ve noticed that certain Snuba pods in the cluster routinely go into crash loops. They eventually recover, but the cause mostly seems to be that they cannot connect to Kafka for whatever reason; the Kafka pods themselves show no errors at all.
This is a completely fresh install on a beefy cluster, and it’s simply unworkable as it stands. Sentry 9 was perfect; this upgrade introduced a massive number of extra dependencies and a lot of overhead, and it performs terribly.
Just for testing purposes, I’ve also deployed Sentry 20 to a different cluster, where exactly the same issues appear after a while.
Does anyone have similar issues and/or any idea of how to fix this? I cannot routinely wait hours for errors to appear…
Maybe less official than I thought? Though I doubt it matters much HOW it is deployed. There are clearly issues here with components not linking up properly, and I would like to dig into that… Why does it take two hours? What causes that? Is there some massive backlog, and if so, where? Which logs do I look at, how do I find out, etc.
It is absolutely not official, and while I don’t discount the possibility that your issues have nothing to do with that Helm chart, in principle your questions about how well Sentry’s components “link up” do depend on how you deploy Sentry.
So far we don’t know very much about which errors you’re actually encountering. When Snuba says it can’t reach Kafka, doesn’t that mean there is either a networking problem between the two, or that Kafka really is down?
In any case, I would first report this issue against the Helm chart itself and, if you have the time, attempt to repro it using getsentry/onpremise (which is just a docker-compose setup).
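If you want to rule the networking side in or out, a quick reachability check from a pod in the same namespace tells you whether the brokers answer at all. A minimal sketch using confluent-kafka; the `sentry-kafka:9092` address is an assumption and depends on how your chart names the Kafka service:

```python
# Quick reachability check: can we see the Kafka brokers and their topics
# from wherever this runs (e.g. a debug pod in the Sentry namespace)?
from confluent_kafka.admin import AdminClient

BOOTSTRAP = "sentry-kafka:9092"  # assumption: adjust to your chart's Kafka service name

admin = AdminClient({"bootstrap.servers": BOOTSTRAP})
try:
    md = admin.list_topics(timeout=10)  # raises if no broker answers in time
except Exception as exc:
    print(f"Kafka NOT reachable at {BOOTSTRAP}: {exc}")
else:
    print(f"Connected. Brokers: {[b.id for b in md.brokers.values()]}")
    print(f"Topics: {sorted(md.topics)}")
```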
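And for the “is there a massive backlog, and if so where” question: you can read the backlog straight off Kafka by comparing the Snuba consumer group’s committed offsets with the log-end offsets of the ingest topic. A rough sketch; the broker address, the `snuba-consumers` group, and the `events` topic are the usual on-premise defaults, but treat all three as assumptions for your deployment:

```python
# Rough consumer-lag check: compare the Snuba consumer group's committed
# offsets against the log-end offsets of the events topic.
from confluent_kafka import Consumer, TopicPartition

BOOTSTRAP = "sentry-kafka:9092"  # assumption: your chart's Kafka service
GROUP = "snuba-consumers"        # assumption: default Snuba consumer group
TOPIC = "events"                 # assumption: default ingest topic

consumer = Consumer({
    "bootstrap.servers": BOOTSTRAP,
    "group.id": GROUP,
    "enable.auto.commit": False,  # we only read offsets, never commit
})

# Discover the topic's partitions.
metadata = consumer.list_topics(TOPIC, timeout=10)
partitions = [TopicPartition(TOPIC, p) for p in metadata.topics[TOPIC].partitions]

# Ask the broker what the group has committed so far.
committed = consumer.committed(partitions, timeout=10)

total_lag = 0
for tp in committed:
    low, high = consumer.get_watermark_offsets(tp, timeout=10)
    # OFFSET_INVALID (-1001) means nothing has been committed yet.
    lag = high - tp.offset if tp.offset >= 0 else high - low
    total_lag += max(lag, 0)
    print(f"partition {tp.partition}: committed={tp.offset} end={high} lag={lag}")

print(f"total lag for group '{GROUP}' on '{TOPIC}': {total_lag}")
consumer.close()
```

A steadily growing total here would line up with the multi-hour delay you are seeing and point at the consumers rather than the ingest path.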
I’ll take this up with the chart maintainers, then. Kafka seems to be problematic across the board; I’m finding more and more people having issues running it in various contexts, so my guess is that the problem lies somewhere there.
FWIW, somebody on the Sentry Discord was able to get onpremise running with Redpanda instead of Kafka, which appears to be more stable but also requires license keys. You can try it out, though we definitely don’t support that setup.