So I have a working Sentry setup with the following containers and their respective links:
sentry-redis: not linked
sentry-postgres: not linked
my-sentry: linked to sentry-redis and sentry-postgres
sentry-cron: linked to sentry-redis and sentry-postgres
sentry-worker-1: linked to sentry-redis and sentry-postgres
I understand that my-sentry acts as the server, sentry-cron acts as the Celery beat scheduler and sentry-worker-1 acts as a worker.
I believe the server, my-sentry, is the one receiving the requests and creating tasks for sentry-cron to dispatch to the Sentry worker(s).
But looking at how the containers are linked together, I can see that no direct connection occurs between the server and the cron, nor between the worker(s) and the cron. The only link between them is indirect, through Postgres or Redis.
Could it be that Sentry uses Redis as a message transport?
Could anyone expand a bit on the matter please, or even point me to some docs or code where I could see how that works?
Thanks!
Edit: I have seen this page in the docs: https://docs.sentry.io/server/queue/ which says Redis is the default broker. Still, it would be nice if anyone could confirm that my understanding is correct!
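To make the question concrete, here is a minimal sketch of the topology described above, assuming the broker is the only thing the three processes share. It is not Sentry code: InMemoryBroker stands in for Redis, and the queue name "events" and the task names "store_event" and "cleanup" are hypothetical.

```python
from collections import deque


class InMemoryBroker:
    """Stand-in for Redis: named FIFO queues of pending task messages."""

    def __init__(self):
        self.queues = {}

    def push(self, queue, message):
        self.queues.setdefault(queue, deque()).append(message)

    def pop(self, queue):
        pending = self.queues.get(queue)
        return pending.popleft() if pending else None


def web_receive_event(broker, event):
    # The web process only enqueues; it never talks to a worker directly.
    broker.push("events", {"task": "store_event", "payload": event})


def cron_tick(broker):
    # The scheduler also only enqueues (hypothetical periodic task).
    broker.push("events", {"task": "cleanup", "payload": None})


def worker_drain(broker, name):
    # A worker pulls messages until the queue is empty.
    handled = []
    while True:
        message = broker.pop("events")
        if message is None:
            break
        handled.append((name, message["task"]))
    return handled


broker = InMemoryBroker()
web_receive_event(broker, {"message": "boom"})
cron_tick(broker)
handled = worker_drain(broker, "sentry-worker-1")
print(handled)
```

In this sketch the web process, the cron and the worker never reference each other, only the broker, which would match the container links listed above.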
Then is it correct that when an event comes in, say from Raven, it is handled by the StoreView class in src/sentry/web/api.py?
And that its process method sends a signal called event_accepted with ip, data and project?
Is this signal what gets sent to the queue? I fail to see who catches it and how it ends up in Redis (or whatever backend the queue uses).
You’re mostly correct, except that the signal is not what inserts the event into the queue. The signal is just an event that we can hook into for other side effects.
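The distinction can be sketched roughly as follows. This is an illustration only, not Sentry's implementation: the Signal class, task_queue, side_effects and record_side_effect are hypothetical, and only the names event_accepted and process come from the question. The order of the two steps inside process is arbitrary here.

```python
class Signal:
    """Tiny observer: calls every connected receiver synchronously."""

    def __init__(self):
        self.receivers = []

    def connect(self, receiver):
        self.receivers.append(receiver)

    def send(self, **kwargs):
        return [receiver(**kwargs) for receiver in self.receivers]


event_accepted = Signal()  # name taken from the thread; behaviour is a sketch
task_queue = []            # stand-in for the broker-backed queue
side_effects = []          # what hooked-in receivers record


def record_side_effect(ip, data, project):
    # Hypothetical receiver: an optional hook, unrelated to queueing.
    side_effects.append((project, ip))


event_accepted.connect(record_side_effect)


def process(ip, data, project):
    # Notify hooks. Nothing in this step touches the queue.
    event_accepted.send(ip=ip, data=data, project=project)
    # A separate, explicit call is what hands the event to the queue.
    task_queue.append({"project": project, "data": data})


process(ip="203.0.113.7", data={"message": "boom"}, project="demo")
```

Disconnecting every receiver from the signal would leave task_queue unchanged, which is the point: the enqueue step does not depend on anyone listening.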
Oh yes, of course! I saw it and didn’t even realize it was the one.
Thanks a ton, I will look at what it does.
My initial concern was mostly how to deploy the infrastructure, so I wanted to understand who is connected to what and in what way. I also wanted to know which services can be duplicated for higher availability, and how the workers expect to be fed their tasks.
Now I think I understand that Redis is the link. We need enough memory for the tasks to fit. We can have several cron instances for higher availability, and the main scaling will happen by starting more workers.
I guess we could even run several servers for incoming Raven events, and several servers for the dashboard, as long as they are all connected to the same Postgres. Or maybe there are some possible concurrent calls that would break things?
Not that I need any of this scaling but I like to understand how Sentry works.
Yeah, workers and web servers can be scaled entirely horizontally. cron shouldn’t be. Ideally you run one cluster-wide. You can technically run two, but it’ll be duplicating work in some cases.
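A toy model of why that is, assuming each cron instance enqueues its periodic tasks independently of the others. The function run_cluster and the task name "cleanup" are hypothetical, and the round-robin hand-out is a simplification of workers competing for messages on a real broker.

```python
from collections import Counter


def run_cluster(num_crons, num_workers, ticks):
    """Return (runs per scheduled task, tasks handled per worker)."""
    queue = []
    for tick in range(ticks):
        for _ in range(num_crons):
            # Each scheduler enqueues on its own, unaware of the others.
            queue.append(("cleanup", tick))

    runs = Counter()
    per_worker = Counter()
    for index, task in enumerate(queue):
        # Simplification: hand tasks out round-robin across workers.
        per_worker[index % num_workers] += 1
        runs[task] += 1
    return runs, per_worker


single_runs, _ = run_cluster(num_crons=1, num_workers=3, ticks=3)
double_runs, double_load = run_cluster(num_crons=2, num_workers=3, ticks=3)
print(single_runs)
print(double_runs)
```

With one cron every scheduled task runs once regardless of the number of workers. With two crons every task is enqueued twice, so adding workers spreads the load but does not remove the duplication.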