Sentry was working well, then suddenly started failing with the following Kafka error: Exception: KafkaError{code=OFFSET_OUT_OF_RANGE,val=1,str="Broker: Offset out of range"}
I am running Sentry on-premise inside Docker containers. How can I fix this Kafka issue? Where should I reset the offset, and how do I make sure it does not come up again?
Looks like you had a burst of events, or the system Sentry is running on was not able to consume the messages as fast as they were produced. This answer may help you: https://stackoverflow.com/a/36472296/90297
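To make the failure mode concrete, here is a minimal sketch in plain Python of how a committed offset can fall outside the range the broker still retains. It is not Sentry or Kafka client code: the function names and the numbers are made up for illustration, and `reset_policy` only mimics the idea behind Kafka's `auto.offset.reset` consumer setting.

```python
def classify_offset(committed, low, high):
    """Compare a committed offset with the retained range [low, high].

    low  = oldest offset still stored on the broker (hypothetical value)
    high = offset the next produced message will get (hypothetical value)
    """
    if committed < low:
        # Retention deleted the messages before the consumer read them.
        return "below_range"
    if committed > high:
        # The topic was recreated or truncated after the offset was committed.
        return "above_range"
    return "in_range"


def resolve_start(committed, low, high, reset_policy="error"):
    """Pick the offset to resume from, mimicking an offset reset policy."""
    if classify_offset(committed, low, high) == "in_range":
        return committed
    if reset_policy == "earliest":
        return low   # reprocess everything that is still retained
    if reset_policy == "latest":
        return high  # skip ahead and only read new messages
    # No usable policy: this corresponds to the consumer raising an error.
    raise ValueError("offset %d out of range [%d, %d]" % (committed, low, high))


if __name__ == "__main__":
    # A consumer that stalled at offset 100 while retention moved on to 500.
    print(classify_offset(100, 500, 900))
    print(resolve_start(100, 500, 900, reset_policy="earliest"))
```

With a policy of "error" the consumer has nothing valid to resume from, which would match a consumer that raises and restarts repeatedly.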
I’ve isolated the environment and am attempting to send a single crash as a test, and it is throwing the same error.
ingest-consumer_1 | 12:22:27 [INFO] batching-kafka-consumer: Flushing 1 items (from {(u'ingest-events', 0): [52093L, 52093L]}): forced:False size:False time:True
ingest-consumer_1 | 12:22:27 [INFO] batching-kafka-consumer: Worker flush took 20ms
snuba-transactions-consumer_1 | 2020-10-02 12:22:28,961 Completed processing <Batch: 1 message, open for 1.00 seconds>.
7e3b076e5ce8_sentry_onpremise_snuba-outcomes-consumer_1 | 2020-10-02 12:22:28,984 Completed processing <Batch: 1 message, open for 1.02 seconds>.
snuba-consumer_1 | 2020-10-02 12:22:28,986 Completed processing <Batch: 1 message, open for 1.03 seconds>.
(bunch of post-process-forwarder-1 stacktrace)
…
post-process-forwarder_1 | File "/usr/local/lib/python2.7/site-packages/sentry/eventstream/kafka/backend.py", line 195, in run_post_process_forwarder
post-process-forwarder_1 | raise Exception(error)
post-process-forwarder_1 | Exception: KafkaError{code=OFFSET_OUT_OF_RANGE,val=1,str="Broker: Offset out of range"}
sentry_onpremise_post-process-forwarder_1 exited with code 0
There are no events currently being processed.
The Kafka service is not reporting any errors, only the post-process-forwarder. It seems to be stuck in a crash loop, constantly restarting, and doesn’t seem to be able to recover.
Should I delete and recreate the specific volume this service is using?
I honestly don’t know what to do here. Seems like somehow this process got out of sync with Snuba producers and consumers. I don’t think this consumer has a specific volume you can reset.
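One thing that may be worth trying is resetting the consumer group's offsets with Kafka's own `kafka-consumer-groups` tool instead of touching volumes. The sketch below only builds the command line and does not run anything. The helper name `build_reset_command` is hypothetical, and the broker address `kafka:9092`, the group `snuba-post-processor`, and the topic `events` are assumptions about a typical on-premise setup; check the real group and topic names with `kafka-consumer-groups --list` first.

```python
def build_reset_command(group, topic, to="latest", execute=False,
                        bootstrap="kafka:9092"):
    """Build an argument list for Kafka's kafka-consumer-groups tool.

    Hypothetical helper: group, topic, and bootstrap must match your setup.
    """
    if to not in ("earliest", "latest"):
        raise ValueError("to must be 'earliest' or 'latest'")
    cmd = [
        "kafka-consumer-groups",
        "--bootstrap-server", bootstrap,
        "--group", group,
        "--topic", topic,
        "--reset-offsets",
        "--to-" + to,
    ]
    # --dry-run only prints the planned offsets; --execute applies them.
    cmd.append("--execute" if execute else "--dry-run")
    return cmd


if __name__ == "__main__":
    # Assumed names; verify them against your own installation.
    print(" ".join(build_reset_command("snuba-post-processor", "events")))
```

The tool can only reset a group that has no active members, so the post-process-forwarder container would need to be stopped first. Resetting to latest skips any events that were never processed, while earliest reprocesses whatever the broker still retains.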