Sentry high availability deploy

There is almost no information on the Internet about deploying Sentry for high availability. I have written up my process of deploying Sentry to Kubernetes for your reference:

https://jiankunking.com/sentry-high-availability-deploy.html

Nice. Will there be an English version?

It's a nice write-up, based on my reading of it via Google Translate. That said, at this point (after all that effort), why not just use https://sentry.io?

Also, there’s an unofficial k8s chart here: https://github.com/sentry-kubernetes/charts/

There are two main reasons for not using sentry.io:

  1. We want to connect Sentry with our own systems: projects, applications, permissions, etc.

  2. Cost considerations.

There probably won't be an English version; after all, my English is not good.

I’d love to learn more about this as sentry.io supports SSO/SAML and internal integrations so it should not be a blocker for adoption.

If you need HA, I'd be very surprised if running and scaling your own instance of Sentry costs less than using the hosted service. Even if they cost the same, you always get the newest version of Sentry and dedicated support. Am I missing something here?

Not to force you, but your English seems quite good, judging by this conversation. An English version might attract a wider audience, and I'd be happy to proof-read it once you have a draft :slight_smile:


Just jumping in: I've been running Sentry on Kubernetes for about a year and I'm still trying to master it, but overall it looks good so far. We had setbacks, and it really took some time to figure things out. I really hope there will be a topic here where engineers from the community and other users can share their best practices. The official architecture docs give me a general idea, but I would really appreciate a deep-dive blog post on the ins and outs, so I can tell how one component affects the others and how to tune for better performance.

Some questions like:
event → relay → kafka → ingest-consumer → kafka (snuba topics) → snuba consumer → clickhouse
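When one hop in a pipeline like this falls behind, it shows up as consumer lag on the Kafka topic feeding it, so Kafka's own CLI is a quick way to locate the bottleneck. This is only a sketch: the consumer-group names below (`ingest-consumer`, `snuba-consumers`) are assumptions and vary by install, so list the groups first and substitute your own.

```shell
# List the consumer groups that actually exist in your cluster first;
# the group names below are examples and differ per deployment.
kafka-consumer-groups --bootstrap-server kafka:9092 --list

# Lag on the ingest side (relay -> ingest-consumer):
kafka-consumer-groups --bootstrap-server kafka:9092 \
  --describe --group ingest-consumer

# Lag on the snuba side (snuba consumer -> clickhouse):
kafka-consumer-groups --bootstrap-server kafka:9092 \
  --describe --group snuba-consumers
```

A growing `LAG` column on a group tells you which consumer stage to scale or debug, rather than guessing from symptoms downstream.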

  1. So it looks like events are stored in ClickHouse, right? Then what is stored in PostgreSQL, project configs or something like that? We observe growth in PostgreSQL storage, and the cleanup cronjob doesn't help; it's kind of weird for it to take that much space if events aren't in Postgres.
  2. What is stored in Redis? I know there are counters, and maybe Snuba query caches, but anything valuable? Is it safe to purge the dump if needed? We have had incidents where Kafka or ingest was down and Redis got overloaded, so I guess Redis plays some role in the ingest workflow, but how?
  3. Similar to the above; I've opened another topic before:
    Sentry worker stop working (rabbitmq connection issue?)
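On question 2, one plausible reading (an assumption on my part, not something from Sentry's docs) is that Redis acts as a write-back buffer: high-frequency counter increments land in Redis, and workers periodically flush the accumulated deltas to Postgres in batches. If the flushing workers or the ingest path die, the buffer keeps growing, which would match the "Redis overloaded when Kafka/ingest is down" symptom. A minimal sketch of that pattern, with a plain dict standing in for Redis:

```python
from collections import defaultdict

class WriteBackBuffer:
    """Buffered-counter pattern: absorb high-frequency increments in a
    fast store (a dict here, standing in for Redis) and flush the
    accumulated deltas to a slow store (e.g. Postgres) in batches."""

    def __init__(self):
        self.pending = defaultdict(int)  # key -> accumulated delta

    def incr(self, key, n=1):
        # Hot path: O(1) in-memory, no database round trip per event.
        self.pending[key] += n

    def flush(self, apply_to_db):
        # Slow path: one batched write per key, then clear the buffer.
        # If flush never runs (workers down), `pending` grows unbounded.
        deltas, self.pending = dict(self.pending), defaultdict(int)
        for key, delta in deltas.items():
            apply_to_db(key, delta)
        return deltas

# Usage: 1000 increments become a single write to the slow store.
db = defaultdict(int)  # stands in for a Postgres row's counter column
buf = WriteBackBuffer()
for _ in range(1000):
    buf.incr("group:42:times_seen")
buf.flush(lambda key, delta: db.__setitem__(key, db[key] + delta))
print(db["group:42:times_seen"])  # 1000
```

Under this reading, purging Redis would lose only the not-yet-flushed deltas (approximate counters), not the events themselves, but that is exactly the kind of detail a deep-dive post should confirm.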