Sentry stops registering events after a while

Hello there :slight_smile:

We recently upgraded our self-hosted Sentry installation from 20.X.X to the latest 21.6.0.dev0 (d7e490e). Since that upgrade, Sentry refuses to process any incoming events after a while.

We can send requests and they are received properly; the client even gets a 200 response. But the events do not show up in the UI. If I restart the Docker containers, it works fine again for a while, and then after about half a day it stops working again.

I am on the latest sentry/on-premise version and we did not touch any configuration except for the bare minimum.

Since I assume sentry itself works fine, I guess that the issue is caused by the update.
Is there anything we can do to fix this, without nuking our server and losing all data?

While digging through the logs, it seems that Kafka is down for some reason, but I cannot figure out why. We do not fully meet the minimum required hardware specs, but we do not get that many events, so I do not think server resources are the problem. We also have enough storage space.

We do have a reverse proxy (linuxserver/swag) pointing at sentry but that should not be a problem…

Docker stats

root@02:~$ docker container ls
CONTAINER ID   IMAGE                                  COMMAND                  CREATED      STATUS                  PORTS                                      NAMES
5e637e641123   nginx:1.16                             "nginx -g 'daemon of…"   7 days ago   Up 7 days               0.0.0.0:9000->80/tcp                       sentry_onpremise_nginx_1
af077e406731   getsentry/relay:nightly                "/bin/bash /docker-e…"   7 days ago   Up 7 days               3000/tcp                                   sentry_onpremise_relay_1
e54a6057945e   getsentry/sentry:nightly               "/etc/sentry/entrypo…"   7 days ago   Up About a minute       9000/tcp                                   sentry_onpremise_ingest-consumer_1
26c98571fd54   sentry-cleanup-onpremise-local         "/entrypoint.sh '0 0…"   7 days ago   Up 7 days               9000/tcp                                   sentry_onpremise_sentry-cleanup_1
1e5230e0d9da   getsentry/sentry:nightly               "/etc/sentry/entrypo…"   7 days ago   Up 7 days               9000/tcp                                   sentry_onpremise_cron_1
f69a121b213e   getsentry/sentry:nightly               "/etc/sentry/entrypo…"   7 days ago   Up About a minute       9000/tcp                                   sentry_onpremise_subscription-consumer-events_1
9eb5ac0e530c   getsentry/sentry:nightly               "/etc/sentry/entrypo…"   7 days ago   Up 7 days               9000/tcp                                   sentry_onpremise_worker_1
24d0386d3a10   getsentry/sentry:nightly               "/etc/sentry/entrypo…"   7 days ago   Up About a minute       9000/tcp                                   sentry_onpremise_post-process-forwarder_1
b2c8a48c11a7   getsentry/sentry:nightly               "/etc/sentry/entrypo…"   7 days ago   Up About a minute       9000/tcp                                   sentry_onpremise_subscription-consumer-transactions_1
ee38c5f94f87   getsentry/sentry:nightly               "/etc/sentry/entrypo…"   7 days ago   Up 7 days               9000/tcp                                   sentry_onpremise_web_1
623fc9261bbc   snuba-cleanup-onpremise-local          "/entrypoint.sh '*/5…"   7 days ago   Up 7 days               1218/tcp                                   sentry_onpremise_snuba-transactions-cleanup_1
4d824e30baf7   getsentry/snuba:nightly                "./docker_entrypoint…"   7 days ago   Up 7 days               1218/tcp                                   sentry_onpremise_snuba-api_1
cb8e71eff8f1   getsentry/snuba:nightly                "./docker_entrypoint…"   7 days ago   Up 6 days               1218/tcp                                   sentry_onpremise_snuba-replacer_1
192f24835b79   snuba-cleanup-onpremise-local          "/entrypoint.sh '*/5…"   7 days ago   Up 7 days               1218/tcp                                   sentry_onpremise_snuba-cleanup_1
d2dc1851ee6a   getsentry/snuba:nightly                "./docker_entrypoint…"   7 days ago   Up 6 days               1218/tcp                                   sentry_onpremise_snuba-subscription-consumer-transactions_1
28e25be7093d   confluentinc/cp-kafka:5.5.0            "/etc/confluent/dock…"   7 days ago   Up 43 hours (healthy)   9092/tcp                                   sentry_onpremise_kafka_1
f6b011dfa3f3   confluentinc/cp-zookeeper:5.5.0        "/etc/confluent/dock…"   7 days ago   Up 7 days (healthy)     2181/tcp, 2888/tcp, 3888/tcp               sentry_onpremise_zookeeper_1
2f0d281fc7c2   yandex/clickhouse-server:20.3.9.70     "/entrypoint.sh"         7 days ago   Up About a minute       8123/tcp, 9000/tcp, 9009/tcp               sentry_onpremise_clickhouse_1
223f2878c79c   postgres:9.6                           "/opt/sentry/postgre…"   7 days ago   Up 7 days (healthy)     5432/tcp                                   sentry_onpremise_postgres_1
ff4c5567c6ec   symbolicator-cleanup-onpremise-local   "/entrypoint.sh '55 …"   7 days ago   Up 7 days               3021/tcp                                   sentry_onpremise_symbolicator-cleanup_1
0275992089c0   tianon/exim4                           "docker-entrypoint.s…"   7 days ago   Up 7 days               25/tcp                                     sentry_onpremise_smtp_1
5e86e986a222   ghcr.io/linuxserver/swag               "/init"                  7 days ago   Up 7 days               0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   swag
823351c2278a   memcached:1.5-alpine                   "docker-entrypoint.s…"   7 days ago   Up 7 days (healthy)     11211/tcp                                  sentry_onpremise_memcached_1
5fb912950eef   getsentry/symbolicator:nightly         "/bin/bash /docker-e…"   7 days ago   Up 7 days               3021/tcp                                   sentry_onpremise_symbolicator_1
dde53cbe822f   redis:5.0-alpine                       "docker-entrypoint.s…"   7 days ago   Up 7 days (healthy)     6379/tcp                                   sentry_onpremise_redis_1
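
In the listing above, several consumers and ClickHouse show "Up About a minute" and Kafka shows "Up 43 hours", while most other containers have been up for 7 days. A rough way to spot such restarts automatically is to parse the status text. The sketch below assumes its input comes from `docker ps --format '{{.Names}}\t{{.Status}}'`; the function names and the 10-minute threshold are hypothetical choices for illustration, and the duration parsing only approximates Docker's human-readable format.

```python
import re

# Approximate unit lengths in seconds (months and years are rough).
_UNIT_SECONDS = {
    "second": 1, "minute": 60, "hour": 3600, "day": 86400,
    "week": 604800, "month": 2592000, "year": 31536000,
}

_STATUS_RE = re.compile(
    r"Up (?:(\d+)|About an?|Less than a) "
    r"(second|minute|hour|day|week|month|year)s?"
)

def uptime_seconds(status):
    """Convert a status such as 'Up 43 hours (healthy)' to seconds.

    Returns None when the container is not reported as 'Up ...'.
    """
    match = _STATUS_RE.match(status)
    if not match:
        return None
    count = int(match.group(1)) if match.group(1) else 1
    return count * _UNIT_SECONDS[match.group(2)]

def recently_restarted(ps_output, threshold=600):
    """Return names of running containers whose uptime is below threshold."""
    names = []
    for line in ps_output.strip().splitlines():
        name, _, status = line.partition("\t")
        seconds = uptime_seconds(status.strip())
        if seconds is not None and seconds < threshold:
            names.append(name.strip())
    return names

if __name__ == "__main__":
    sample = (
        "sentry_onpremise_ingest-consumer_1\tUp About a minute\n"
        "sentry_onpremise_kafka_1\tUp 43 hours (healthy)\n"
        "sentry_onpremise_web_1\tUp 7 days\n"
    )
    print(recently_restarted(sample))
```

Containers that keep reappearing in this list are likely crash-looping, and their logs are a good place to look first.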

Logs & Configuration files

Thanks for any help :smiley:
Patrick

Your logs suggest an issue with connectivity to Kafka. It may be due to DNS (as it complains about name resolution), but it also mentions that direct IP connections fail. I’ve heard that systems struggle with the paravirtualized network when they are under load, so maybe that’s the issue?

Yeah, I noticed those entries too, but I doubt that it is a DNS issue since those “domains” are defined by the docker-compose container. “kafka:9092” references the “kafka” container, not a manually configured domain or something along those lines.
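
To tell a name-resolution failure apart from a refused or timed-out connection, the two steps can be checked separately. The following is a minimal sketch using only the Python standard library. It assumes it is run from inside a container attached to the same compose network (otherwise the name `kafka` will not resolve), and `check_endpoint` is a hypothetical helper name, not part of Sentry or Kafka.

```python
import socket

def check_endpoint(host, port, timeout=3.0):
    """Resolve host, then try a TCP connection to each resolved address.

    Returns (ips, tcp_ok, error), where error is None on success.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # Name resolution itself failed: points at DNS, not the broker.
        return [], False, f"DNS lookup failed: {exc}"

    ips = sorted({info[4][0] for info in infos})
    last_error = None
    for ip in ips:
        try:
            with socket.create_connection((ip, port), timeout=timeout):
                return ips, True, None
        except OSError as exc:
            # The name resolved, but nothing accepted the connection.
            last_error = exc
    return ips, False, f"TCP connect failed: {last_error}"

if __name__ == "__main__":
    # "kafka" and 9092 are the service name and port mentioned in the thread.
    print(check_endpoint("kafka", 9092))
```

If the lookup succeeds but the connection fails, the broker is probably down or not listening rather than unreachable by name.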

For some reason it has not been working for the past couple of days, don’t know why though.
However I will continue to monitor the situation and reply on this thread if something comes up.

Thanks for the help <3 :slight_smile:

Hi, Small update:

After debugging some more, I was unable to find the issue…
I went with the nuclear option: exported the data and reset the server.
