Hi, first of all, thanks for the amazing product that is Sentry. We have been using the 9.1.2 self-hosted version on our Swarm cluster for a while now, and it’s been very helpful in improving the way we detect and handle errors!
I wanted to tackle the upgrade to v10 today, but was a bit thrown off by the new way of doing things. Am I right in thinking that the install.sh script only works if it is run on the machine the current Sentry instance is running on, and only if that instance was deployed with docker-compose and nothing else?
If that’s the case, I reckon I’m not the first and won’t be the last to wonder how to install and upgrade Sentry v10 on a Swarm or k8s cluster. I initially thought I’d be able to just build my image with our custom configs in a GitLab-CI pipeline, and then simply update the image on our cluster, letting Sentry do the rest when it comes to migrations and all other stuff (like what GitLab is doing in its Omnibus package for instance), but I guess I was mistaken?
Nonetheless, Sentry really is a good product, and we are more than happy to tackle things a bit more manually to make it work. It seems a few unofficial Helm charts have already been published, but we are actually looking for a solution for a Swarm stack. Moreover, it would be great if we were able to migrate our existing 9.1.2 instance without losing any data or configuration.
Is someone willing to help a bit with this task? This topic could probably be helpful for others who are facing the same issue. Even just a global overview of the workflow we should implement, like steps to run from CI, how to deploy, handle migrations and so on, would be tremendously helpful.
Yup, this is the case unless you are using a remote Docker daemon. But the script mostly assumes local access.
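For anyone who wants to try the remote-daemon route, here is a minimal sketch in Python. It relies on the DOCKER_HOST environment variable, which the docker CLI and docker-compose both honor. The host name, the helper names (remote_docker_env, run_install) and the script path are assumptions for illustration only. Keep in mind that bind mounts of local paths are resolved on the remote host, which is one reason the script may still not behave as expected.

```python
import os
import subprocess
from typing import Dict, Mapping, Optional


def remote_docker_env(docker_host: str,
                      base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with DOCKER_HOST pointing at a remote daemon.

    docker_host is a daemon URL such as "ssh://user@host" or "tcp://host:2376".
    """
    env = dict(os.environ if base is None else base)
    env["DOCKER_HOST"] = docker_host
    return env


def run_install(docker_host: str, script: str = "./install.sh") -> int:
    """Run the install script against the remote daemon and return its exit code.

    The script path is an assumption; adjust it to where your checkout lives.
    """
    result = subprocess.run([script], env=remote_docker_env(docker_host))
    return result.returncode
```

Calling run_install("ssh://deploy@sentry-host.example") would then run the script locally while every docker command it issues targets the remote machine.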
Yup, you are neither the first nor the last person.
This is also correct. All of the migration and other setup work is handled by the install script, not the Sentry image. I was entertaining the idea of Sentry running migrations automatically, but couldn’t find an efficient and safe way to check for pending migrations at every startup, so I shelved the idea.
Happy to say that this is possible. It’s just not a solution we provide out of the box.
I don’t have direct experience with Swarm or k8s but the steps are not that complicated. The very first thing I’d do is to replicate the cluster of services from the docker-compose.yml file and bring them up with the correct configuration to make sure they can talk to each other. It’s probably wise to try this with a fresh installation first. This would likely entail porting your custom config files over to the new ones (or vice-versa, porting the changes for v10 to your config).
This would also include running the snuba bootstrap and snuba migrate commands so they can create the necessary Kafka topics and ClickHouse tables. The details of these commands are in the install.sh file.
After that, upgrade the Postgres instance to 9.6 with the data migration. install.sh uses the tianon/postgres-upgrade:9.5-to-9.6 image for this but you can do it however you prefer.
Finally, you’d need to run sentry upgrade on the sentry image with all services connected and volumes mounted so the event data can be safely migrated to ClickHouse. If you have a lot of old events or very large events, you may need to tune some Kafka settings and rerun this step, so make sure you have a backup before proceeding.
After this step, you should have a perfectly running, migrated Sentry 10 instance and any future upgrades should be easier.
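The order of the steps above can be written down as a small plan, which may be handy as a starting point for a CI job. This is only a sketch: the Step type, the target names ("snuba", "sentry") and the runner functions are hypothetical, and the commands carry only the names mentioned in this post. The exact flags and volume mounts are in install.sh and are deliberately not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Step:
    description: str
    target: str              # service or image the command runs in (assumed names)
    command: Sequence[str]   # command as named in the post; flags live in install.sh


# Ordered as described above, after the services themselves are up and connected.
PLAN: List[Step] = [
    Step("Create Kafka topics and ClickHouse tables", "snuba", ("snuba", "bootstrap")),
    Step("Apply Snuba migrations", "snuba", ("snuba", "migrate")),
    Step("Migrate Postgres data from 9.5 to 9.6",
         "tianon/postgres-upgrade:9.5-to-9.6", ()),
    Step("Run Sentry migrations with all services and volumes attached",
         "sentry", ("sentry", "upgrade")),
]


def run_plan(plan: Sequence[Step], runner: Callable[[Step], bool]) -> List[str]:
    """Run steps in order and stop at the first failure.

    Returns the descriptions of the steps that completed.
    """
    done: List[str] = []
    for step in plan:
        if not runner(step):
            raise RuntimeError(f"Step failed: {step.description}")
        done.append(step.description)
    return done


def dry_run(step: Step) -> bool:
    """Print what would run instead of executing anything."""
    command = " ".join(step.command) or "(image entrypoint)"
    print(f"[{step.target}] {command}")
    return True


if __name__ == "__main__":
    run_plan(PLAN, dry_run)
```

Swapping dry_run for a runner that launches a one-off container on your cluster is the part that depends on your Swarm or k8s setup. Stopping at the first failure matters here because the last step migrates event data, so it should only run once everything before it has succeeded and a backup exists.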
Hope this helps. Apologies for the late response and happy to answer any questions you may have.
Has anyone successfully gotten v10 to work on docker swarm? I for the life of me cannot get it to work. The backend services just do not seem to talk to each other, and I don’t understand what any of the installation and bootstrapping stuff does, or why it is necessary. None of this seems to follow good docker practices, unless I am misunderstanding something. I don’t understand why this requires custom images. It should just be using docker secrets and configurations.
The only “custom” images we have are the cron-based cleanup services. This is because docker-compose does not support native scheduled runs, unlike Kubernetes for instance.
This sounds like a networking or configuration issue.
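One way to narrow this down is to check plain TCP reachability between the services from inside a container attached to the same overlay network. The sketch below is an assumption-laden example: the host names are guesses at what your stack calls each service, and the ports are the usual defaults for each piece of software, so adjust both to your deployment.

```python
import socket
from typing import Dict, Mapping, Tuple

# Assumed service host names, paired with the default port of each service.
SERVICES: Dict[str, Tuple[str, int]] = {
    "postgres": ("postgres", 5432),
    "redis": ("redis", 6379),
    "memcached": ("memcached", 11211),
    "zookeeper": ("zookeeper", 2181),
    "kafka": ("kafka", 9092),
    "clickhouse": ("clickhouse", 9000),
}


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


def check_services(services: Mapping[str, Tuple[str, int]],
                   timeout: float = 2.0) -> Dict[str, bool]:
    """Map each service name to whether its host:port accepted a connection."""
    return {name: can_connect(host, port, timeout)
            for name, (host, port) in services.items()}
```

Running check_services(SERVICES) from inside the web or worker container and printing the result shows which services cannot be resolved or reached. A False for a name that should exist usually points at the network attachment or the service name rather than at Sentry itself.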
It serves two purposes:
1. To get Sentry and its satellite services up and running, there are some migrations you need to run, both at the beginning and after every upgrade. The script automates these so people don’t have to run them manually, which is quite error-prone (as past experience has shown). The automation also allows us to ensure that our onpremise installation always works, even with its rapidly changing components.
2. It ensures that some minimum requirements are met and handles upgrades from various older versions seamlessly.
Any reason why you are trying Swarm, which AFAIK is deprecated, instead of Kubernetes for instance? Or, well, obviously, https://sentry.io itself?