Sentry 10 High Availability

Hi,

We’re considering using Sentry 10 to collect events from all our applications.
We have a strong need for high availability (mainly for event ingestion) and we’re wondering how to achieve that.

Obviously, the install script deploys all parts on a single host.
So do we need to install using home-made scripts if we want to distribute the components over multiple hosts?

Are there any official recommendations for HA?

  • Is it possible to deploy some parts only?
  • Is it possible to configure Redis / Zookeeper / Kafka as clusters?
  • Which parts can be scaled up for high-load / high-availability?

Has anybody already set up a Sentry 10 in that way?

Yes, it is called sentry.io :smiley:

The services in the docker-compose file are the minimum requirements, with the exception of Symbolicator, which you can leave out if you don’t intend to process native crash reports.

Yes.

All of them? :slight_smile: I think you need to be more specific, but I honestly want to ask why you are not considering our SaaS offering at this point, as I’m quite sure it would be cheaper and more reliable overall in the end.

Thank you very much for your answer.

You’re right, your SaaS option would likely be more reliable and cheaper.
But we work for a company that processes payment data (card numbers…), so we must enforce strong security rules (especially for PCI-DSS compliance).

That’s why we’re interested in the on-premise option, which would allow us to run the Sentry service ourselves and to benefit from its integration capabilities (Git repository, Jira…).

I doubt that our CISO would approve a solution where we authorize a cloud service to interact with our Git, Jira…

Is it possible to deploy some parts only?

My question was unclear; it was about the install script.

As I understand it, this script can only deploy all parts on a single host (it is Docker Compose based).
What’s the official way to install a “more customized & distributed” Sentry?
For example:

  • Deploy Postgres as Master/Slave
  • Deploy 2 or 3 Redis nodes
  • Deploy a 3-node Kafka cluster
  • Deploy 2 web servers

Do we need to code all the install scripts by ourselves?
(db migration…)

Since I have this on my plate too, by Q2 at the latest, we could cooperate in a GH repo or something and provide others with building blocks without putting the burden on the Sentry team, while also sabotaging their livelihood. :wink:

I guess all the people going for on-premise have data-on-premise as their main driving force.

We are PCI-DSS compliant though: Data Security, Privacy, and Compliance Overview

That sure is up to you but many other folks, including Jira itself (well, Atlassian) use Sentry :slight_smile:

Yes, the setup we support and advertise is geared towards simplicity rather than scale. There is no “official” way for a customized Sentry as it would need to change based on your needs. I’d recommend starting small and simple, watching the stats, and then tuning and scaling accordingly. The key bottlenecks would be Kafka, ClickHouse, workers, and Redis, I’d say. Postgres should be less of a problem after moving the events off of it (which we already did).
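For the distributed layout asked about above (separate Postgres, Redis, and Kafka hosts), the Sentry side largely comes down to pointing its settings at those hosts. The following is only a rough sketch of what that could look like. The setting names (DATABASES, SENTRY_OPTIONS["redis.clusters"], KAFKA_CLUSTERS) are assumptions modeled on the example sentry.conf.py from the on-premise repository and should be checked against your version; the helper build_sentry_settings and all hostnames are hypothetical.

```python
def build_sentry_settings(postgres_host, redis_hosts, kafka_brokers):
    """Return settings fragments that point Sentry at externally managed services.

    Assumption: in a real sentry.conf.py these would be module-level
    assignments (DATABASES = ..., SENTRY_OPTIONS["redis.clusters"] = ...),
    not a returned dict. Key names are not verified for every Sentry version.
    """
    return {
        "DATABASES": {
            "default": {
                "ENGINE": "sentry.db.postgres",
                "NAME": "sentry",        # hypothetical database name
                "USER": "sentry",        # hypothetical user
                "PASSWORD": "",
                "HOST": postgres_host,   # the primary of the Postgres pair
                "PORT": "5432",
            }
        },
        "SENTRY_OPTIONS": {
            "redis.clusters": {
                "default": {
                    # One numbered entry per Redis node.
                    "hosts": {
                        index: {"host": host, "port": 6379}
                        for index, host in enumerate(redis_hosts)
                    }
                }
            }
        },
        "KAFKA_CLUSTERS": {
            # Comma-separated broker list, as Kafka clients expect.
            "default": {"bootstrap.servers": ",".join(kafka_brokers)}
        },
    }


if __name__ == "__main__":
    settings = build_sentry_settings(
        "pg-primary.internal",
        ["redis-1.internal", "redis-2.internal"],
        ["kafka-1.internal:9092", "kafka-2.internal:9092", "kafka-3.internal:9092"],
    )
    print(settings["KAFKA_CLUSTERS"]["default"]["bootstrap.servers"])
```

Whether listing several Redis hosts gives you failover, rather than just spreading keys across nodes, is something to verify before relying on it for HA.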

Do we need to code all the install scripts by ourselves?

If you are starting fresh, I’d argue you need very little of it. You’d need the sentry upgrade and Snuba initialization commands, but the rest wouldn’t be needed. You can use the docker-compose.yml file as a blueprint for how the services connect together.
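As a rough illustration of those two one-off steps, here is a small sketch that only assembles the commands and prints them unless told to run them. The service names (snuba-api, web) and the Snuba subcommand (bootstrap --force) are assumptions based on the on-premise docker-compose.yml and install script of that era; fresh_install_steps and run_steps are hypothetical helpers, so adapt them to however you actually launch the containers.

```python
import subprocess


def fresh_install_steps(compose=("docker-compose",)):
    """Return the one-off initialization commands as argv lists, in order."""
    run = list(compose) + ["run", "--rm"]
    return [
        # Snuba initialization (assumed subcommand: bootstrap --force).
        run + ["snuba-api", "bootstrap", "--force"],
        # Sentry's own database migrations and initial setup.
        run + ["web", "upgrade"],
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in steps:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_steps(fresh_install_steps())  # dry run: prints the commands only
```

Keeping the steps as data makes it easy to swap the launcher prefix for whatever orchestration you end up using.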

Hi @BYK

When you say…

We are PCI-DSS compliant though: https://sentry.io/security/#pci-dss

I think you mean that my payments to Sentry for Sentry SaaS are PCI compliant and processed by Stripe.

What about sending events from my application to Sentry SaaS that relate to card payments? Is Sentry SaaS PCI compliant from this perspective?

Sentry does not typically receive credit card data, making it compliant with Payment Card Industry Data Security Standards (PCI DSS) in most situations. Sentry also automatically scrubs data that looks like credit card information via its Data Scrubber feature, which is enabled by default.

This is the part that is relevant to you I think (emphasis mine).

I’m happy to get you in touch with more informed folks as this is about the edge of my knowledge :slight_smile:

Hi Burak,

Thank you very much for your response.

I would really appreciate it if you could put us in touch with the colleagues you mentioned for specific questions related to this topic.

Hi @BYK,

We did some tests to see what the Server Data Scrubber can filter.
We sent Sentry events containing various Stripe test card numbers and all of them were ingested into Sentry as is; did we miss something?

The Data Scrubber option is enabled on our test project (including the default scrubbers).

Our log lines look like:

Did you see my pan? (4242424242424242)
Did you see my pan? (4242 4242 4242 4242)
...

Hello @fmartinou, we do not scrub from the event message/error message yet, only from context data and Additional Data. We are in the middle of revamping this feature to allow for more customization in what to strip and are looking for alpha testers. Please refer to https://github.com/getsentry/relay/issues/453 for further information.
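Until messages are scrubbed server-side, one possible workaround is to mask card numbers in the application before the event is sent. The sketch below is not Sentry's Data Scrubber; it is a minimal client-side filter, assuming that card numbers appear as 13 to 19 digits optionally separated by spaces or hyphens, and using the Luhn checksum to reduce false positives. The function names, the "[Filtered]" mask, and the event fields it touches (message, logentry, exception values) are choices made for this illustration.

```python
import re

# Candidate card numbers: 13-19 digits, optionally separated by a space or hyphen.
PAN_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scrub_pans(text, mask="[Filtered]"):
    """Replace every Luhn-valid card-number-like run in text with mask."""
    def _replace(match):
        digits = re.sub(r"[ -]", "", match.group(0))
        return mask if luhn_valid(digits) else match.group(0)

    return PAN_CANDIDATE.sub(_replace, text)


def before_send(event, hint):
    """Scrub the message and exception values of an event dict in place."""
    if isinstance(event.get("message"), str):
        event["message"] = scrub_pans(event["message"])
    logentry = event.get("logentry") or {}
    for key in ("message", "formatted"):
        if isinstance(logentry.get(key), str):
            logentry[key] = scrub_pans(logentry[key])
    for exc in (event.get("exception") or {}).get("values") or []:
        if isinstance(exc.get("value"), str):
            exc["value"] = scrub_pans(exc["value"])
    return event


if __name__ == "__main__":
    print(scrub_pans("Did you see my pan? (4242 4242 4242 4242)"))
```

With the Python SDK, a function like this could be passed as the before_send option of sentry_sdk.init. It does not cover every place a card number could end up (breadcrumbs, tags, and so on), so treat it as a starting point only.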

I got it! Thank you for your answer!

You may refer to this article:
https://jiankunking.com/sentry-high-availability-deploy.html

If you can use Kubernetes, check out this unofficial helm chart: github.com/sentry-kubernetes/charts
It’s fairly well maintained, but it is not an official, Sentry-certified deployment method.