Exporting stats to an external time series db

We run a single on-premise instance of Sentry (20.8.0) for applications from different teams. We limit the number of events per minute, but sometimes one application sends so many error events that we risk losing events from other applications.

Therefore I would like to push the stats about rejected, received and blacklisted events per project to our time series databases so I can configure alerting on these stats.
The HTTP API from Sentry offers this information, but for a large number of projects it is inconvenient and slow because it results in a lot of HTTP requests.

Is there a faster way to get these stats? I did not find any of them in Kafka or Redis.

How do you monitor your sentry on premise instance?

There is an internal metrics system you can wire up which we use for a variety of things:

https://develop.sentry.dev/services/metrics/

Great! I got the sentry.metrics.statsd.StatsdMetricsBackend to push the metrics through the telegraf-statsd-plugin to our time series databases.
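
For anyone following along, the Sentry side of this is roughly the following in sentry.conf.py. This is a minimal sketch: the host and port are assumptions and have to match wherever your Telegraf statsd input listens.

```python
# sentry.conf.py (fragment)
# Send Sentry's internal metrics to a statsd listener.
SENTRY_METRICS_BACKEND = "sentry.metrics.statsd.StatsdMetricsBackend"
SENTRY_METRICS_OPTIONS = {
    "host": "localhost",  # assumption: Telegraf runs on the same host
    "port": 8125,         # assumption: the port of Telegraf's statsd input
}
```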

Is there an overview of the existing metrics and what they mean?
Where can I find the stats for rate-limited, filtered and accepted events per project?

@marbon87 The stats on event ingestion are generated in the Relay service. You can find preliminary documentation on the metrics and how to configure them on this page:
https://getsentry.github.io/relay/configuration/metrics/

To configure this, add the following lines to your onpremise relay/onpremise.yml:

metrics:
  statsd: 127.0.0.1:8126
  prefix: mycompany.relay

You’ll particularly be interested in the following metrics described on the above page:

  • event.accepted
  • event.rejected
  • events.outcomes (logs more reasons for rejected events)

The documentation on outcomes isn’t great at the moment, but the reported tag values are pretty self-explanatory: filtered, rate_limited, invalid and abuse.

Thanks for the hint. What I am missing here are tags that show the organization and project key.

Sorry, I should’ve read your question more carefully. We’re not tagging the organization or project on those metrics. In order to get stats per project or organization, you have a few options:

  1. Consume the organization stats and project stats endpoints. You can see an example for this when you navigate to “Stats” in the main page.
  2. These stats are powered by our Snuba service, which uses Clickhouse as a data store. Code for querying Snuba can be found at https://github.com/getsentry/sentry/blob/36a4217e55008ce59787261869322977e91179ce/src/sentry/tsdb/snuba.py.
  3. Snuba uses the outcomes dataset, which is populated from the outcomes Kafka topic. The messages are JSON payloads containing organization, project, DSN key, outcome, and reason.
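
As a rough illustration of option 1, here is a sketch of one request against the project stats endpoint. The URL layout, the `stat` and `resolution` parameters and the `[[timestamp, count], ...]` response shape are written from memory of the public API docs and should be checked against your version; `fetch_stat`, `stats_url` and `total` are hypothetical helper names. One such request is needed per project and per stat, which is why this approach gets slow with many projects.

```python
import json
import urllib.request


def stats_url(base_url, org_slug, project_slug, stat, resolution="1h"):
    """Build the project stats URL (path and parameters are assumptions)."""
    return (f"{base_url.rstrip('/')}/api/0/projects/{org_slug}/{project_slug}"
            f"/stats/?stat={stat}&resolution={resolution}")


def total(points):
    """Sum the counts of a [[timestamp, count], ...] series."""
    return sum(count for _ts, count in points)


def fetch_stat(base_url, token, org_slug, project_slug, stat):
    """Fetch one stat series (e.g. 'received', 'rejected', 'blacklisted')."""
    req = urllib.request.Request(
        stats_url(base_url, org_slug, project_slug, stat),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```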

Considering you would like to process a large number of projects, my suggestion would be to write a Kafka consumer for the outcomes topic. For example, have a look at Snuba’s own consumer.
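
A minimal sketch of such a consumer, under several assumptions: the JSON field names (`org_id`, `project_id`, `outcome`, `reason`) and the integer-to-label mapping are from memory and should be verified against real messages on your topic; the broker address, consumer group, metric prefix and statsd address are placeholders; and the tag syntax in the metric name assumes Telegraf's statsd input with InfluxDB-style tags. It uses the third-party confluent-kafka package. This is not Snuba's implementation, just one way to get per-project counters into statsd.

```python
import json
import socket
from collections import Counter

# Assumed mapping of the integer "outcome" field to a label.
OUTCOME_NAMES = {0: "accepted", 1: "filtered", 2: "rate_limited",
                 3: "invalid", 4: "abuse"}


def aggregate_outcomes(raw_messages):
    """Count messages per (org_id, project_id, outcome, reason)."""
    counts = Counter()
    for raw in raw_messages:
        try:
            msg = json.loads(raw)
        except (ValueError, TypeError):
            continue  # skip payloads that are not valid JSON
        if not isinstance(msg, dict):
            continue
        code = msg.get("outcome")
        outcome = OUTCOME_NAMES.get(code, str(code))
        reason = msg.get("reason") or "none"
        counts[(msg.get("org_id"), msg.get("project_id"), outcome, reason)] += 1
    return counts


def to_statsd_lines(counts, prefix="mycompany.sentry.outcomes"):
    """Render counters as statsd lines with InfluxDB-style tags."""
    return [
        f"{prefix},org={org},project={project},outcome={outcome},reason={reason}:{n}|c"
        for (org, project, outcome, reason), n in counts.items()
    ]


def run_consumer(bootstrap_servers="localhost:9092",
                 statsd_addr=("127.0.0.1", 8125), batch_size=500):
    """Read the outcomes topic in batches and forward counters over UDP."""
    from confluent_kafka import Consumer  # pip install confluent-kafka

    consumer = Consumer({
        "bootstrap.servers": bootstrap_servers,
        "group.id": "outcomes-stats-exporter",  # hypothetical group name
        "auto.offset.reset": "latest",
    })
    consumer.subscribe(["outcomes"])
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        while True:
            batch = consumer.consume(num_messages=batch_size, timeout=1.0)
            payloads = [m.value() for m in batch if m.error() is None]
            for line in to_statsd_lines(aggregate_outcomes(payloads)):
                sock.sendto(line.encode("utf-8"), statsd_addr)
    finally:
        consumer.close()
        sock.close()
```

Calling `run_consumer()` starts the loop. Aggregating each batch before sending keeps the number of UDP packets proportional to the number of distinct project/outcome combinations rather than to the number of events. Reasons containing commas or spaces would need escaping before being used as tag values.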