Kafka offset issue - snuba-subscription-consumer-events

Thank you for providing a great project. I am currently scaling out my installation. Several Kafka topics, such as events, outcomes, and ingest-events, are split into 10 partitions each, and the Snuba consumers and workers handle every partition without problems. One thing does not work, though: snuba-subscription-consumer-events keeps rebalancing and fails with an offset error.

Caught OffsetOutOfRange('KafkaError{code=OFFSET_OUT_OF_RANGE,val=1,str="Broker: Offset out of range"}'), shutting down...
Traceback (most recent call last):
  File "/usr/local/bin/snuba", line 33, in <module>
    sys.exit(load_entry_point('snuba', 'console_scripts', 'snuba')())
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/src/snuba/snuba/cli/subscriptions.py", line 224, in subscriptions
    batching_consumer.run()
  File "/usr/src/snuba/snuba/utils/streams/processing/processor.py", line 109, in run
    self._run_once()
  File "/usr/src/snuba/snuba/utils/streams/processing/processor.py", line 139, in _run_once
    self.__message = self.__consumer.poll(timeout=1.0)
  File "/usr/src/snuba/snuba/subscriptions/consumer.py", line 120, in poll
    message = self.__consumer.poll(timeout)
  File "/usr/src/snuba/snuba/utils/streams/synchronized.py", line 217, in poll
    message = self.__consumer.poll(timeout)
  File "/usr/src/snuba/snuba/utils/streams/backends/kafka.py", line 400, in poll
    raise OffsetOutOfRange(str(error))
snuba.utils.streams.backends.abstract.OffsetOutOfRange: KafkaError{code=OFFSET_OUT_OF_RANGE,val=1,str="Broker: Offset out of range"}

What does snuba-subscription-consumer-events do, and what do I need to change to get it working?

kafka-consumer-groups --bootstrap-server xx.xxx.xxx.180:9092 --group snuba-events-subscriptions-consumers --describe

Running the above command in the Kafka container shows that the current offset of the snuba-events-subscriptions-consumers group is not advancing: the group only keeps rebalancing, and lag keeps accumulating.
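For anyone who wants to track that lag over time, here is a minimal Python sketch that parses the text printed by the describe command. The sample output is made up for illustration, and the column layout is an assumption: it differs between Kafka versions, so the sketch looks columns up by header name instead of by position. The function name parse_consumer_group_lag is hypothetical.

```python
# Hypothetical sample of `kafka-consumer-groups --describe` output.
# Real output varies by Kafka version; these numbers are invented.
SAMPLE = """
GROUP                                TOPIC   PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID  HOST  CLIENT-ID
snuba-events-subscriptions-consumers events  0          1200            1900            700  -            -     -
snuba-events-subscriptions-consumers events  1          -               2500            -    -            -     -
"""


def parse_consumer_group_lag(describe_output):
    """Return {(topic, partition): lag or None} from --describe text output."""
    lines = [ln for ln in describe_output.splitlines() if ln.strip()]
    # Find the header row and resolve column positions by name.
    header_idx = next(
        i for i, ln in enumerate(lines) if "PARTITION" in ln and "LAG" in ln
    )
    columns = lines[header_idx].split()
    topic_col = columns.index("TOPIC")
    part_col = columns.index("PARTITION")
    lag_col = columns.index("LAG")

    lags = {}
    for ln in lines[header_idx + 1:]:
        fields = ln.split()
        if len(fields) <= max(topic_col, part_col, lag_col):
            continue  # skip rows that do not have all expected columns
        raw_lag = fields[lag_col]
        # Assumption: "-" marks a partition with no committed offset.
        lag = None if raw_lag == "-" else int(raw_lag)
        lags[(fields[topic_col], int(fields[part_col]))] = lag
    return lags


if __name__ == "__main__":
    for (topic, partition), lag in sorted(parse_consumer_group_lag(SAMPLE).items()):
        print(topic, partition, lag)
```

If the lag on a partition only ever grows between runs, the consumer for that partition is not making progress.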

You’ll need steps 4 and/or 5 from this issue: Post Process Forwarder - KafkaError "Offset Out of Range" (Issue #478, getsentry/onpremise)

For snuba-post-processor, I had tried adjusting the offset, but that was not a fundamental solution. The command you recommended did work, however, and it resolved the offset misalignment on the events topic.

“command: run post-process-forwarder --commit-batch-size 1 --initial-offset-reset earliest”

In my opinion, adjusting the offset for snuba-subscription-consumer-events is likewise only a temporary fix. Is there a more fundamental solution?

Or is there no way to tell why this is happening?

Also, this problem doesn’t seem to directly affect the Sentry service. Is that correct?

This most likely happens because your consumers fell behind and couldn’t keep up with the volume, so the offsets they had committed were no longer available on the broker. There are several possible solutions:

  • Increase the number of consumers
  • Increase the CPU resources for the consumers
  • Increase Kafka retention period (means more disk and memory usage)
  • Reduce the load on the system
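To illustrate why falling behind produces this error, here is a minimal Python sketch under stated assumptions. It assumes you already have, per partition, the committed offset of the group plus the lowest and highest offsets the broker still holds (the low and high watermarks). The function names and the numbers are hypothetical. With the confluent-kafka client, the watermarks could probably be read via Consumer.get_watermark_offsets, but that part is left out to keep the sketch self-contained.

```python
def classify_offset(committed, low_watermark, high_watermark):
    """Classify a committed offset against what the broker still retains.

    Messages below low_watermark have been deleted by retention, so a
    consumer resuming from there gets OFFSET_OUT_OF_RANGE.
    """
    if committed < low_watermark:
        return "out_of_range_behind"  # fell behind; retention removed the data
    if committed > high_watermark:
        return "out_of_range_ahead"   # offset points past the end of the log
    return "ok"


def lag(committed, high_watermark):
    """Number of messages the consumer still has to process."""
    return max(high_watermark - committed, 0)


if __name__ == "__main__":
    # Hypothetical numbers: the broker retains offsets 5000..9000.
    print(classify_offset(4200, 5000, 9000))  # consumer stopped before 5000
    print(classify_offset(8700, 5000, 9000))  # consumer is within range
    print(lag(8700, 9000))
```

Each option in the list attacks one side of this comparison: more consumers or more CPU make the committed offset advance faster, a longer retention period keeps the low watermark from moving past it, and less load slows the growth of the high watermark.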

As for what the subscription consumers do: I think they are mostly used for performance and metric alerts.