ClickHouse - How to recover data from the Sentry DB

Hi There,

I run Sentry on-premises in Kubernetes.
I recently ran into a disk space issue with ClickHouse, which runs inside the Kubernetes cluster. To expand the volume I ended up removing the persistent volume claim (PVC). The volume did expand, but I then noticed the old volumes had been removed, and Sentry events and performance data from before the expansion are no longer showing up in Sentry.

I’m wondering if there is a way to get the data back from the sentry database, which runs externally.

Can someone please assist?

Thanks!

@fpacifici @lynnagara is there any way to backfill Clickhouse from node store/postgres?

The only way I can think of to recover the events is if you still have them in Kafka. What is the retention policy of the Snuba-related Kafka topics? If you have the events in Kafka, you can do something like this:

kafka-consumer-groups --bootstrap-server <kafkahost:port> --group <group_id> --topic <topic_name> --reset-offsets --to-earliest --execute

This will execute the reset, moving the consumer group’s offset for the specified topic back to the earliest Kafka message you still have. You’ll need to do this for all Snuba-related topics.

This way the Snuba consumers will re-read those events and start inserting them into ClickHouse.
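
If you need to rewind several topics, a small loop saves typing. This is a minimal sketch: the broker address, group ID and topic names below are placeholders and will differ per deployment (list yours first with kafka-topics --list):

```shell
# Placeholder values -- substitute your own broker, group and topics.
BOOTSTRAP="kafka:9092"
GROUP="snuba-consumers"
# Verify the topic list against your deployment with:
#   kafka-topics --bootstrap-server "$BOOTSTRAP" --list
TOPICS="events transactions outcomes"

for topic in $TOPICS; do
  echo "resetting $GROUP on $topic to earliest"
  # Only invoke the Kafka CLI if it is actually present on this host
  if command -v kafka-consumer-groups >/dev/null 2>&1; then
    kafka-consumer-groups --bootstrap-server "$BOOTSTRAP" \
      --group "$GROUP" --topic "$topic" \
      --reset-offsets --to-earliest --execute
  fi
done
```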

@chhetripradeep @BYK thanks for your responses.

I reset the offsets on all of the Kafka consumer groups below to the previous month, which is within the retention policy, using the --to-datetime parameter of the kafka-consumer-groups command.

snuba-post-processor
snuba-consumers
ingest-consumer
transactions_group
snuba-replacers
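
For reference, a --to-datetime reset looks like the following. This is a sketch with placeholder broker, group, topic and timestamp; the Kafka CLI expects the datetime in ISO-8601 form with millisecond precision (YYYY-MM-DDTHH:mm:ss.sss):

```shell
# Placeholder timestamp -- pick a point inside your retention window.
RESET_TO="2021-09-01T00:00:00.000"
if command -v kafka-consumer-groups >/dev/null 2>&1; then
  # Moves the group to the first offset at or after RESET_TO
  kafka-consumer-groups --bootstrap-server kafka:9092 \
    --group snuba-consumers --topic events \
    --reset-offsets --to-datetime "$RESET_TO" --execute
else
  echo "kafka-consumer-groups not on PATH; command shown for reference"
fi
```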

The offsets were reset successfully on the events topic. I noticed the LAG (the difference between CURRENT-OFFSET and LOG-END-OFFSET) increased after executing the reset, which was expected, and I then restarted the ClickHouse pods/statefulset. However, the performance data from before the date the volume was expanded is still not showing up, so it doesn’t look like it worked.
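
One note on the step above: per the earlier reply, it is the Snuba consumers that replay events into ClickHouse, so restarting the ClickHouse pods alone does not trigger re-ingestion; the consumer deployments are the ones to restart and watch. To check whether a reset actually took effect, describing the group shows CURRENT-OFFSET, LOG-END-OFFSET and LAG per partition (broker address and group name are placeholders):

```shell
# Placeholders -- substitute your broker address and consumer group.
GROUP="snuba-consumers"
if command -v kafka-consumer-groups >/dev/null 2>&1; then
  # Prints CURRENT-OFFSET, LOG-END-OFFSET and LAG for each partition
  kafka-consumer-groups --bootstrap-server kafka:9092 \
    --group "$GROUP" --describe
else
  echo "kafka-consumer-groups not on PATH; run this where the Kafka CLI lives"
fi
```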

Any other assistance would be appreciated.

Then my guess is there’s an issue with the Kafka topic creation or partition setup. If you can afford data loss, you can try deleting and recreating the Kafka and ZooKeeper volumes (the nuclear option).
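
For completeness, here is the nuclear option sketched with kubectl. Every resource name below is an assumption (statefulset names, PVC names and labels vary by chart and deployment; check yours first with kubectl get statefulsets,pvc), and deleting these PVCs destroys all unprocessed Kafka data:

```shell
# Hypothetical resource names -- verify with: kubectl get statefulsets,pvc
KAFKA_STS="sentry-kafka"
ZK_STS="sentry-zookeeper"
if command -v kubectl >/dev/null 2>&1; then
  # Stop the pods so the volumes are released
  kubectl scale statefulset "$KAFKA_STS" "$ZK_STS" --replicas=0
  # Delete the backing volumes (destroys all in-flight Kafka data);
  # the label selector is an assumption -- check your PVC labels
  kubectl delete pvc -l app="$KAFKA_STS"
  kubectl delete pvc -l app="$ZK_STS"
  # Scale back up; fresh volumes are provisioned for the new pods
  kubectl scale statefulset "$KAFKA_STS" "$ZK_STS" --replicas=1
else
  echo "kubectl not on PATH; commands shown for reference"
fi
```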

@BYK Can you please tell me what data would be lost if the Kafka and ZooKeeper volumes are recreated?

We already can’t see the performance data prior to the start of October.

Thanks!

You’d lose any in-flight data that has not yet been processed. That means you won’t lose anything you can already see in the UI, but any events still waiting to be processed will be gone.