Sentry worker dying

I am currently setting up Sentry in Kubernetes and trying to run the sentry run worker process in a pod with a memory limit of 3 GB and a CPU limit of 1 core (the relevant part of the pod spec is at the end of this post). This is the error message I am getting:

09:17:00 [ERROR] multiprocessing: Process 'Worker-77' pid:90 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-76' pid:89 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-75' pid:88 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-74' pid:87 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-73' pid:86 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-72' pid:85 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-71' pid:84 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-70' pid:83 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-69' pid:82 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-68' pid:81 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-67' pid:80 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-66' pid:79 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-65' pid:78 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-64' pid:77 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-63' pid:76 exited with 'signal 9 (SIGKILL)'
09:17:00 [ERROR] multiprocessing: Process 'Worker-62' pid:75 exited with 'signal 9 (SIGKILL)'
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/celery/worker/__init__.py", line 206, in start
    self.blueprint.start(self)
  File "/usr/local/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/usr/local/lib/python2.7/site-packages/celery/bootsteps.py", line 374, in start
    return self.obj.start()
  File "/usr/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 280, in start
    blueprint.start(self)
  File "/usr/local/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/usr/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 884, in start
    c.loop(*c.loop_args())
  File "/usr/local/lib/python2.7/site-packages/celery/worker/loops.py", line 48, in asynloop
    raise WorkerLostError('Could not start worker processes')
WorkerLostError: Could not start worker processes
09:17:00 [ERROR] celery.worker: Unrecoverable error: WorkerLostError('Could not start worker processes',)
09:17:06 [INFO] sentry.bgtasks: bgtask.stop (task_name=u'sentry.bgtasks.clean_dsymcache:clean_dsymcache')
09:17:06 [INFO] sentry.bgtasks: bgtask.stop (task_name=u'sentry.bgtasks.clean_releasefilecache:clean_releasefilecache')

Sentry version is 20.7.2.
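
For completeness, the limits on the worker container are set with a standard Kubernetes resources block, roughly like this (values as described above):

resources:
  limits:
    cpu: "1"
    memory: "3G"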

It looks like something external, such as the OOM killer, is terminating your workers with SIGKILL.

Yes, the OOM killer is killing the workers because they use 4 GB of RAM, which seems a tad excessive… but that shouldn't be the default state.

We did not set up the OOM killer, so we can't fix that "broken default state". It is not part of any Docker container; it is the kernel's OOM killer on the host. From a quick Google search, I think you want to pass --oom-kill-disable to docker run somehow.
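
For reference, docker run does expose such a flag; a minimal sketch only (it is normally combined with an explicit memory limit, and the image tag assumes the version mentioned above is published on Docker Hub):

docker run --memory=3g --oom-kill-disable getsentry/sentry:20.7.2 run worker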

Also, please stop replying to multiple old threads about your problem; it is sufficient to open one thread and wait a bit for a response.

The upper bound on memory was deliberately set because I have quite strict resource restrictions, and sadly I can't do anything about the OOM killer. What fascinates me is that, by default, the worker process takes so many resources and simply dies when limited. This process alone already has almost double the resources advised in the onpremise repo…

It seems that your Celery worker attempts to spawn a large number of subprocesses, or at least a number that is way too high for a machine with 3 GB of RAM. sentry run worker infers this process count from the number of CPUs it detects, so perhaps try something like sentry run worker -c 1.
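
In a Kubernetes Deployment, that could look roughly like the fragment below (assuming the image's default entrypoint is used, as in the onpremise docker-compose setup; the container name and image tag are illustrative, the relevant part is the explicit -c 1):

containers:
  - name: worker
    image: getsentry/sentry:20.7.2
    # cap Celery concurrency explicitly instead of relying on CPU detection
    args: ["run", "worker", "-c", "1"]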

We do say the minimum requirement is 2.4 GB, but I think this might have been measured on a single-core machine or something like that. If the above suggestion works, it might be worth revisiting that figure or adding a disclaimer that RAM usage depends on CPU count.


The above suggestion works perfectly fine! No crashes whatsoever over the weekend, and RAM usage is down to ~150 MB. Thanks a lot!