General workflow advice about error thresholds

I’m trying to move my team from “technically has Sentry configured” to “actually uses Sentry meaningfully”.

In an ideal world, errors would never happen and we would get alerted on every single one, but that’s not where we are. I’m trying to understand how to productively use Sentry in a world where many errors can be fine if they are not occurring frequently.

So getting notified on a brand new issue makes sense to me. But let’s say some issue comes in, and we look at it, and we say, ok, that’s a timeout talking to some random service we use, that’s fine if it happens once an hour, not great if it’s happening once a minute.

What do I actually do to act on that fact? I don’t see a way to annotate an issue in Sentry (through the web interface) with any sort of “alert threshold”. In fact I can’t annotate it in any way detectable by the alert rules.

I guess I can come up with a fixed set of static tags like threshold:10_per_minute, threshold:10_per_hour, etc, and set up a bunch of duplicative alert rules saying “if an issue with this tag has more than this amount in this span, then alert me”. And then in the code itself (hoping that the error comes from my own code and not library code) annotate the error with the appropriate tag. That would work but seems like a lot of hassle.
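For concreteness, here is a rough sketch of what that tagging could look like in Python. It assumes the Python SDK's `before_send` hook, which receives the event and a hint dict and would let me tag errors raised inside library code too. The mapping, the tag values, and the `add_threshold_tag` name are all made up for illustration, and I'm assuming `event["tags"]` is a plain dict at that point.

```python
# Hypothetical mapping from exception type to one of the static threshold tags.
THRESHOLD_BY_EXCEPTION = {
    TimeoutError: "10_per_hour",
    ConnectionError: "10_per_minute",
}


def add_threshold_tag(event, hint):
    """Attach a `threshold` tag to events caused by known exception types.

    Written to be passed as a before_send hook, so it always returns the event.
    """
    exc_info = hint.get("exc_info")
    if exc_info is None:
        # Not an exception event (e.g. a plain message); leave it untouched.
        return event

    exc_type = exc_info[0]
    for known_type, threshold in THRESHOLD_BY_EXCEPTION.items():
        if issubclass(exc_type, known_type):
            event.setdefault("tags", {})["threshold"] = threshold
            break
    return event


# Assumed wiring (not executed here):
#   import sentry_sdk
#   sentry_sdk.init(dsn="...", before_send=add_threshold_tag)
```

Each tag value would still need its own alert rule filtering on it, which is the duplicative part.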

And even once I’ve done that, all I’ve done is make sure alerts work right — there’s no view that will be “show me issues that are over their threshold”.

I feel like I’m missing some obvious way to use this product. Am I?

Ohhhhh. I didn’t realize that “ignore” lets you say “ignore until this occurs again N times in X period”. That seems perfect.