Sentry worker stuck on a task

Hello,

I’ve been running the self-hosted version of Sentry without problems for some time now (~6 months), but today it suddenly seems to be stuck on a task and I’m not really sure why.

The dashboard is telling me that “Background workers haven’t checked in recently. It seems that you have a backlog of 85 tasks. Either your workers aren’t running or you need more capacity.”, and indeed, when I open the worker’s log I see the same exception every two seconds:

[WARNING] sentry.lang.javascript.processor: Disabling sources to gc.kis.v2.scr.kaspersky-labs.com for 300s
Traceback (most recent call last):
  File "/home/studiogdo/.venvs/sentry/local/lib/python2.7/site-packages/sentry/lang/javascript/processor.py", line 372, in fetch_file
    stream=True,
  File "/home/studiogdo/.venvs/sentry/local/lib/python2.7/site-packages/requests/sessions.py", line 501, in get
    return self.request('GET', url, **kwargs)
  File "/home/studiogdo/.venvs/sentry/local/lib/python2.7/site-packages/requests/sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/studiogdo/.venvs/sentry/local/lib/python2.7/site-packages/raven/breadcrumbs.py", line 297, in send
    resp = real_send(self, request, *args, **kwargs)
  File "/home/studiogdo/.venvs/sentry/local/lib/python2.7/site-packages/requests/sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "/home/studiogdo/.venvs/sentry/local/lib/python2.7/site-packages/sentry/http.py", line 108, in send
    return super(BlacklistAdapter, self).send(request, *args, **kwargs)
  File "/home/studiogdo/.venvs/sentry/local/lib/python2.7/site-packages/requests/adapters.py", line 479, in send
    raise ConnectTimeout(e, request=request)
ConnectTimeout: HTTPSConnectionPool(host='gc.kis.v2.scr.kaspersky-labs.com', port=443): Max retries exceeded with url: /572352B1-2AEF-1C4D-9D4A-ECC772880311/main.js (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7fbe202ac790>, 'Connection to gc.kis.v2.scr.kaspersky-labs.com timed out. (connect timeout=2)'))

From what I understand, it should stop trying to fetch from that source for a while, but in this case it keeps retrying forever and prevents the remaining tasks from being processed.
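
For context, my understanding of the intended behaviour (based on the “Disabling sources to … for 300s” warning) is a cache-based back-off: after a failed fetch, the domain gets flagged for 300 seconds and further fetches to it should be skipped until the flag expires. Here is a rough sketch of that pattern, purely my own illustration using an in-memory dict rather than Sentry’s actual cache:

import time
from urlparse import urlparse  # Python 2, matching the stack above

import requests

# In-memory stand-in for the real cache; purely illustrative.
_disabled_until = {}

DISABLE_SECONDS = 300   # matches the "for 300s" in the warning
CONNECT_TIMEOUT = 2     # matches "connect timeout=2" in the traceback


def fetch_file(url):
    domain = urlparse(url).netloc

    # If the domain was recently flagged, skip the network call entirely.
    if _disabled_until.get(domain, 0) > time.time():
        raise Exception('sources for %s temporarily disabled' % domain)

    try:
        return requests.get(url, timeout=CONNECT_TIMEOUT, stream=True)
    except requests.exceptions.ConnectTimeout:
        # Flag the domain so fetches to it are skipped for DISABLE_SECONDS.
        _disabled_until[domain] = time.time() + DISABLE_SECONDS
        raise

What I see instead is the warning being logged over and over, as if the flag never takes effect or each retry of the same task ignores it.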

  • Sentry 8.11.0
  • Python 2.7.6
  • raven-js 3.6.0

pip freeze:

amqp==1.4.9
anyjson==0.3.3
BeautifulSoup==3.2.1
billiard==3.3.0.23
boto3==1.4.2
botocore==1.4.82
celery==3.1.18
cffi==1.9.1
click==6.6
contextlib2==0.5.4
cryptography==1.6
cssselect==1.0.0
cssutils==0.9.10
Django==1.6.11
django-bitfield==1.7.1
django-crispy-forms==1.4.0
django-debug-toolbar==1.3.2
django-jsonfield==0.9.13
django-paging==0.2.5
django-picklefield==0.3.2
django-recaptcha==1.0.5
django-social-auth==0.7.28
django-sudo==2.1.0
django-templatetag-sugar==1.0
djangorestframework==2.3.14
djrill==2.0
docutils==0.12
email-reply-parser==0.2.0
enum34==1.1.6
exam==0.10.6
futures==3.0.5
hiredis==0.1.6
honcho==0.7.1
httplib2==0.9.2
idna==2.1
ipaddr==2.1.11
ipaddress==1.0.17
jmespath==0.9.0
kombu==3.0.35
libsourcemap==0.5.0
lxml==3.6.4
mock==1.0.1
msgpack-python==0.4.7
ndg-httpsclient==0.4.2
oauth2==1.9.0.post1
percy==0.4.1
petname==1.7
Pillow==3.2.0
progressbar==2.3
progressbar2==3.10.1
psycopg2==2.6.2
py==1.4.31
pyasn1==0.1.9
pycparser==2.17
pyOpenSSL==16.2.0
pytest==2.6.4
pytest-django==2.9.1
pytest-html==1.9.0
python-dateutil==2.6.0
python-memcached==1.58
python-openid==2.2.5
python-u2flib-server==4.0.1
python-utils==2.0.0
pytz==2016.7
PyYAML==3.11
qrcode==5.3
raven==5.32.0
rb==1.6
redis==2.10.5
requests==2.12.3
s3transfer==0.1.9
selenium==3.0.0b3
sentry==8.11.0
setproctitle==1.1.10
simplejson==3.8.2
six==1.10.0
South==1.0.1
sqlparse==0.2.2
statsd==3.1
structlog==16.1.0
symsynd==1.3.0
toronado==0.0.11
ua-parser==0.7.2
urllib3==1.16
uWSGI==2.0.14

I tried restarting the worker, but it didn’t change anything.
Do you have any idea what could be causing this issue? Is there a way to skip this task in the meantime?
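
In the meantime, one workaround I’m considering is blocking the offending host on the Sentry side. If I’m reading sentry/http.py correctly, the BlacklistAdapter from the traceback refuses URLs whose resolved IP is listed in SENTRY_DISALLOWED_IPS, so something along these lines in sentry.conf.py might make the fetch fail fast instead of hanging on the connect timeout (untested, and the IP below is only a placeholder for whatever the host actually resolves to):

# sentry.conf.py -- tentative, untested workaround.
# Idea: have BlacklistAdapter reject the Kaspersky host outright so the
# worker never waits on the 2-second connect timeout for it.
SENTRY_DISALLOWED_IPS = (
    '203.0.113.10',  # placeholder -- replace with the real IP(s) of
                     # gc.kis.v2.scr.kaspersky-labs.com
)

A cruder alternative would be pointing that hostname at 127.0.0.1 in /etc/hosts on the Sentry machine so the connection is refused immediately rather than timing out.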

Thank you.

Well, after two hours the issue resolved itself without my changing anything.
All of the backlogged tasks started processing at the same time, and everything is fine now.

I still have no idea why the worker got stuck on the same task, but I’d be happy to provide more information if need be.