Celery error: connection already closed


#1

This morning we ran into trouble with the Celery worker:

Background workers haven't checked in recently. It seems that you have a backlog of 8532 tasks. Either your workers aren't running or you need more capacity.

$ celery --version
3.1.18 (Cipater)
$ sentry --version
sentry, version 8.20.0

Redis 3.2.11
PostgreSQL 9.5

Sentry runs in a Docker container; the configuration is here:
(Sentry slack integration error)

Main Celery error:
InterfaceError: connection already closed

After restarting the Sentry worker process, everything went back to normal.

How I start Sentry:
$ sentry run worker
$ sentry run cron
$ sentry run web

What’s wrong?

Update 1 (Example of full Traceback):

11:55:21 [ERROR] multiprocessing: Process Worker-18120
11:55:21 [ERROR] multiprocessing: Process 'Worker-18120' pid:31878 exited with 'exitcode 1'
 Traceback (most recent call last):
   File "/usr/lib64/python2.7/site-packages/billiard/process.py", line 292, in _bootstrap
     self.run()
   File "/usr/lib64/python2.7/site-packages/billiard/pool.py", line 292, in run
     self.after_fork()
   File "/usr/lib64/python2.7/site-packages/billiard/pool.py", line 395, in after_fork
     self.initializer(*self.initargs)
   File "/usr/lib/python2.7/site-packages/celery/concurrency/prefork.py", line 82, in process_initializer
     signals.worker_process_init.send(sender=None)
   File "/usr/lib/python2.7/site-packages/celery/utils/dispatch/signal.py", line 166, in send
     response = receiver(signal=self, sender=sender, **named)
   File "/usr/lib/python2.7/site-packages/celery/fixups/django.py", line 208, in on_worker_process_init
     _maybe_close_fd(c.connection)
   File "/usr/lib/python2.7/site-packages/celery/fixups/django.py", line 30, in _maybe_close_fd
     os.close(fh.fileno())
 InterfaceError: connection already closed
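
Reading the last two frames, it looks like the freshly forked worker tries to close the file descriptor of a database connection it inherited, and the driver refuses because that connection is already closed. Below is a minimal, self-contained sketch of that failure mode. `FakeConnection`, the local `InterfaceError`, and both helper functions are hypothetical stand-ins written for illustration, not Celery or database driver code, and the tolerant variant is only one possible way to handle the case.

```python
import os


class InterfaceError(Exception):
    """Hypothetical stand-in for the database driver's InterfaceError."""


class FakeConnection(object):
    """Hypothetical stand-in for a DB connection inherited by a forked worker."""

    def __init__(self, closed=False):
        self.closed = closed

    def fileno(self):
        # Assumption: like the driver in the traceback, refuse to hand out
        # a descriptor once the connection is closed.
        if self.closed:
            raise InterfaceError("connection already closed")
        # Open a throwaway descriptor so os.close() has something real to close.
        return os.open(os.devnull, os.O_RDONLY)


def close_fd(fh):
    """Same call as the last frame of the traceback: close the raw descriptor."""
    os.close(fh.fileno())


def close_fd_tolerant(fh, interface_errors=(InterfaceError,)):
    """Sketch of a guard: an already closed connection needs no cleanup."""
    try:
        close_fd(fh)
    except interface_errors:
        pass


if __name__ == "__main__":
    stale = FakeConnection(closed=True)
    try:
        close_fd(stale)
    except InterfaceError as exc:
        print("unguarded: %s" % exc)
    close_fd_tolerant(stale)  # no exception, so worker start-up could continue
    print("guarded: ok")
```

If this reading is right, the worker process dies during initialization rather than while running a task, which would fit the backlog growing until the worker is restarted.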

#2

We hit this again today.
This time our fix was to restart the Sentry cron process.

I still have no idea what's wrong with Celery.
Is it the connection to Redis in Docker Swarm?
The connection to Postgres?

Maybe we need to swap Redis for RabbitMQ?


#3

We noticed the same behavior in our own project.
It looks like a kombu error.
We are digging into that now.
:space_invader: