Due to popular demand, we’ve been working on server-side filtering of data. The simple version is discarding events from sources like browser extensions, but the longer-term goal is to provide much more power here. These filters apply before rate limits, and they’re part of a larger shift on our end toward letting you remove as much unwanted data as possible.
Right now we’ve got a few baked-in filters (in addition to our already existing IP filters):
Legacy browsers [JavaScript only]
Web crawlers
Browser extensions [JavaScript only]
Localhost errors (e.g. from development environments)
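As a rough illustration (not Sentry’s actual implementation), a crawler filter of this kind typically matches the event’s User-Agent header against a list of known bot patterns. The pattern list below is our own assumption:

```python
import re

# Hypothetical list of User-Agent fragments; a real crawler filter
# would ship a much more complete pattern set.
CRAWLER_PATTERNS = re.compile(
    r"(googlebot|bingbot|slurp|crawler|spider|curl|wget)",
    re.IGNORECASE,
)

def is_web_crawler(user_agent):
    """Return True if the User-Agent looks like a known crawler."""
    return bool(user_agent and CRAWLER_PATTERNS.search(user_agent))
```

So an event whose request carried `Mozilla/5.0 (compatible; Googlebot/2.1)` would match and be dropped, while ordinary browser traffic would pass through untouched.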
If you’re running open source Sentry, you’ll also be able to create your own filters. We’re not willing to commit to a stable API at this point, but it’s unlikely to change, or at least not to change much.
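The plugin interface isn’t spelled out here (and, as noted, it isn’t a stable API), but a custom server-side filter could look roughly like this sketch — the class shape, `is_enabled`, and the `test()` hook are all our assumptions for illustration:

```python
# Hypothetical sketch of a custom server-side filter for open source
# Sentry. The base shape and the test() hook are assumptions about the
# (explicitly unstable) internal API, not a documented contract.
class InternalTrafficFilter:
    id = "internal-traffic"
    name = "Internal traffic"

    def is_enabled(self, project):
        # A real filter would read a per-project setting here.
        return True

    def test(self, data):
        """Return True to drop the event, False to keep it."""
        ip = (data.get("user") or {}).get("ip_address", "")
        return ip.startswith("10.")
```

The important part is the convention: the filter sees the raw event payload and returns a boolean, so anything you can compute from the payload (IPs, user agents, URLs) can become a filter.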
Here’s what it looks like today, within your project settings:
We don’t actually filter request URIs that are bound to localhost right now; today it’s purely based on user.ip. We could probably add that, as it doesn’t seem like an issue.
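A minimal sketch of that kind of user.ip check, assuming the event carries the address as a string:

```python
import ipaddress

def is_localhost_event(ip_string):
    """Return True if the event's user IP is a loopback address."""
    try:
        return ipaddress.ip_address(ip_string).is_loopback
    except ValueError:
        # Missing or malformed IPs are kept rather than dropped.
        return False
```

Using `is_loopback` covers both `127.0.0.0/8` and IPv6 `::1`, rather than string-matching on `127.0.0.1` alone.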
This has come up enough already that we should probably do it.
Is there any possibility of a production service making a request to 127.0.0.1 (itself or another service on a different port) and triggering an error that would be suppressed via this filter? That would be my only concern.
We’d like to whitelist crawlers. We have the Inbound Filter for bad crawlers turned on, but we find that some crawlers generate a lot of errors on the site without being caught by it. At the same time, we’d like to know if our system is failing for useful crawlers like Google/Bing/etc.
A company like ours lives or dies based on our search rankings, so it’s important for us to know if search crawlers are failing. We’ve turned off the Sentry web crawler filter and have implemented our own log4j filter for now. More visibility in the UI as to which crawlers match would help too.
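One way to keep that visibility without dropping the signal entirely is to allowlist the crawlers you care about and only filter the rest. The bot names below are a sketch, not an exhaustive list, and this is our own logic rather than anything Sentry ships:

```python
# Hypothetical allowlist: errors seen by these crawlers should keep
# being reported; everything else that looks like a bot is filtered.
GOOD_CRAWLERS = ("googlebot", "bingbot", "yandexbot", "duckduckbot")

def should_filter_crawler(user_agent):
    """Drop crawler traffic unless it comes from an allowlisted bot."""
    ua = (user_agent or "").lower()
    if any(bot in ua for bot in GOOD_CRAWLERS):
        return False  # keep errors hit by search engines we rank on
    return "bot" in ua or "crawler" in ua or "spider" in ua
```

With this split, a broken page served to Googlebot still raises an issue, while generic scraper noise stays out of the stream.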
What happened to this feature? I am viewing the settings for my project, and I see tabs for General, Notifications, etc., but I don’t see “Inbound Data Filters” anywhere. Was this feature deprecated? I am interested in excluding web crawlers, or at least having errors from web crawlers in a separate search.