Yesterday, I was in Sentry looking at a specific issue, and it had “1.2k” events.
Today, I’m looking at the same issue, and it only has 76 events.
Overnight (literally), almost all of the events from this issue disappeared. Interestingly, the events that are missing are the RECENT ones: we had a production incident, so most events are from the last day, yet the older events (the 76) are still there.
What are all the possible reasons that this could happen?
I hope to get an exhaustive list. We’ve already looked at the obvious (to us) things — checked our Sentry configuration, looked at app logs, Postgres logs, and Redis logs, checked VM disk usage, etc. Our best guess/speculation is that maybe Redis collects new events and then flushes them to Postgres, and that it failed (for some reason), but we can find no evidence that happened (unless we’re looking at the wrong logs).
I’ve attached a screenshot that demonstrates (hopefully, at least a little bit) that I’m not mistaken about seeing the event count decrease.
Another thing to notice about this screenshot: the oldest event(s) disappeared too. Notice that the issue's age changes from “2 months old” to “a month old”. So, it’s mostly the newer events that are missing, but at least one older one too.