No events / issue details after upgrade from 9.1.2 to 20.9/latest

I’m struggling with the following situation:
We currently use Sentry 9.0.0 and want to upgrade to 20.9. I managed to upgrade from 9.0.0 to 9.1.2; that upgrade went through quickly and easily, with no errors, and 9.1.2 is running without any flaws.
I set up a Sentry 20.9.0 instance in our bare-metal Kubernetes cluster (converted the docker-compose.yml with “kompose” and did a lot of modifications to get it up and running) as a new installation. This seems to work perfectly: I can fire a sample issue, and we can log exceptions from our applications in the new instance.
My steps when I try to upgrade from 9.1.2 to 20.9.0:
- wipe the Postgres database (external Postgres database server) and restore the 9.1.2 database
- wipe all volumes (PVCs on our Kubernetes cluster)
- deploy the Sentry 20.9.0 pods (same as the docker-compose-created containers)
- once all pods are running, ensure that the Kafka pod is up and running; if it is:
- run `snuba bootstrap --force` inside the snuba-api pod, followed by `snuba migrations migrate` (not sure if the migrations migrate step is needed)
- the output of `snuba bootstrap --force` tells me that the Kafka topics were created and some migrations have been run
- then I start `sentry upgrade --noinput` inside the sentry-web pod:

    Faking migration for contenttypes.0001_initial
    Faking migration for contenttypes.0002_remove_content_type_name
    Faking migration for auth.0001_initial
    Faking migration for auth.0002_alter_permission_name_max_length
    Faking migration for auth.0003_alter_user_email_max_length
    Faking migration for auth.0004_alter_user_username_opts
    Faking migration for auth.0005_alter_user_last_login_null
    Faking migration for auth.0006_require_contenttypes_0002
    Faking migration for sites.0001_initial
    Faking migration for admin.0001_initial
    Faking migration for sessions.0001_initial
    Operations to perform:
      Apply all migrations: admin, auth, contenttypes, jira_ac, nodestore, sentry, sessions, sites, social_auth
    Running migrations:
      Applying admin.0002_logentry_remove_auto_add... OK
      Applying auth.0007_alter_validators_add_error_messages... OK
      Applying auth.0008_alter_user_username_max_length... OK
      Applying sentry.0002_912_to_recent... OK
      Applying sentry.0003_auto_20191022_0122... OK
      Applying sentry.0004_bitfieldtestmodel_blankjsonfieldtestmodel_callabledefaultmodel_jsonfieldtestmodel_jsonfieldwithdefau... OK
      Applying sentry.0005_fix_content_types... OK
      Applying sentry.0006_sentryapp_date_published... OK
      Applying sentry.0007_auto_20191029_0131... OK
      Applying sentry.0008_auto_20191030_0016... OK
      Applying sentry.0009_auto_20191101_1608... OK
      Applying sentry.0010_auto_20191104_1641... OK
      Applying sentry.0011_remove_pagerdutyservice_service_id_from_state... OK
      Applying sentry.0012_remove_pagerdutyservice_service_id... OK
      Applying sentry.0013_auto_20191111_1829... OK
      Applying sentry.0014_delete_sentryappwebhookerror... OK
      Applying sentry.0015_delete_sentryappwebhookerror_db... OK
      Applying sentry.0016_delete_alert_rule_deprecated_fields... OK
      Applying sentry.0017_incident_aggregation... OK
      Applying sentry.0018_discoversavedquery_version... OK
      Applying sentry.0019_auto_20191114_2040... OK
      Applying sentry.0020_auto_20191125_1420... OK
      Applying sentry.0021_auto_20191203_1803... OK
      Applying sentry.0021_auto_20191202_1716... OK
      Applying sentry.0022_merge... OK
      Applying sentry.0023_hide_environment_none_20191126... OK
      Applying sentry.0024_auto_20191230_2052...Events to process: 291181
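For reference, the bootstrap/migrate/upgrade steps above can be sketched as kubectl commands. This is only a sketch: the namespace (`sentry`) and deployment names (`snuba-api`, `sentry-web`) are from my setup and may differ in yours.

```shell
# Sketch only: namespace and deployment names are assumptions from my deployment.
# 1) Bootstrap Kafka topics and run Snuba migrations inside the snuba-api pod:
kubectl -n sentry exec -it deploy/snuba-api -- snuba bootstrap --force
kubectl -n sentry exec -it deploy/snuba-api -- snuba migrations migrate

# 2) Run the Django/Sentry migrations inside the sentry-web pod:
kubectl -n sentry exec -it deploy/sentry-web -- sentry upgrade --noinput
```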
During the upgrade I checked the output of clickhouse-server.log (just a small portion I cut out):

    2020.10.12 06:28:29.472705 [ 49 ] {} SystemLog (system.trace_log): Flushing system log
    2020.10.12 06:28:29.473670 [ 49 ] {} DiskLocal: Reserving 1.00 MiB on disk default, having unreserved 764.17 GiB.
    2020.10.12 06:28:29.684439 [ 49 ] {} system.trace_log: Renaming temporary part tmp_insert_202010_379_379_0 to 202010_379_379_0.
    2020.10.12 06:28:29.812088 [ 87 ] {} default.sentry_local: Renaming temporary part tmp_merge_20191230-90_1006_1021_3 to 20191230-90_1006_1021_3.
    2020.10.12 06:28:29.815816 [ 87 ] {} default.sentry_local (MergerMutator): Merged 6 parts: from 20191230-90_1006_1016_2 to 20191230-90_1021_1021_0
    2020.10.12 06:28:29.965497 [ 73 ] {} HTTPHandler-factory: HTTP Request for HTTPHandler-factory. Method: POST, Address:, User-Agent: none, Content Type: , Transfer-Encoding: chunked
    2020.10.12 06:28:29.965614 [ 73 ] {} HTTPHandler: Request URI: /?load_balancing=in_order&insert_distributed_sync=1&query=INSERT+INTO+default.sentry_local+FORMAT+JSONEachRow
    2020.10.12 06:28:29.969915 [ 73 ] {f8922bc5-db6b-4ddd-a5dd-45ca123c51f4} executeQuery: (from INSERT INTO default.sentry_local FORMAT JSONEachRow
    2020.10.12 06:28:29.970097 [ 73 ] {f8922bc5-db6b-4ddd-a5dd-45ca123c51f4} AccessRightsContext (default): Access granted: INSERT(event_id, project_id, group_id, timestamp, deleted, retention_days, platform, message, primary_hash, received, search_message, title, location, user_id, username, email, ip_address, geo_country_code, geo_region, geo_city, sdk_name, sdk_version, type, version, offset, partition, message_timestamp, os_build, os_kernel_version, device_name, device_brand, device_locale, device_uuid, device_model_id, device_arch, device_battery_level, device_orientation, device_simulator, device_online, device_charging, level, logger, server_name, transaction, environment, sentry:release, sentry:dist, sentry:user, site, url, app_device, device, device_family, runtime, runtime_name, browser, browser_name, os, os_name, os_rooted, tags.key, tags.value, _tags_flattened, contexts.key, contexts.value, http_method, http_referer, exception_stacks.type, exception_stacks.value, exception_stacks.mechanism_type, exception_stacks.mechanism_handled, exception_frames.abs_path, exception_frames.filename, exception_frames.package, exception_frames.module, exception_frames.function, exception_frames.in_app, exception_frames.colno, exception_frames.lineno, exception_frames.stack_level, culprit, sdk_integrations, modules.name, modules.version) ON default.sentry_local
    2020.10.12 06:28:29.979567 [ 73 ] {f8922bc5-db6b-4ddd-a5dd-45ca123c51f4} DiskLocal: Reserving 1.14 MiB on disk default, having unreserved 764.17 GiB

snuba consumer:

    2020-10-12 06:33:41,113 Completed processing <Batch: 26 messages, open for 2.64 seconds>.
    2020-10-12 06:33:48,709 Completed processing <Batch: 22 messages, open for 7.59 seconds>.
    2020-10-12 06:33:51,143 Completed processing <Batch: 55 messages, open for 2.43 seconds>.
    2020-10-12 06:33:53,863 Completed processing <Batch: 19 messages, open for 2.72 seconds>.
    2020-10-12 06:33:57,664 Completed processing <Batch: 26 messages, open for 3.79 seconds>.
    2020-10-12 06:34:00,271 Completed processing <Batch: 26 messages, open for 2.51 seconds>.
    2020-10-12 06:34:02,413 Completed processing <Batch: 16 messages, open for 2.14 seconds>.
    2020-10-12 06:34:04,756 Completed processing <Batch: 22 messages, open for 2.34 seconds>.
    2020-10-12 06:34:08,826 Completed processing <Batch: 21 messages, open for 4.06 seconds>.
    2020-10-12 06:34:13,036 Completed processing <Batch: 32 messages, open for 4.21 seconds>.
    2020-10-12 06:34:16,709 Completed processing <Batch: 36 messages, open for 3.67 seconds>.
    2020-10-12 06:34:20,354 Completed processing <Batch: 35 messages, open for 3.64 seconds>.
    2020-10-12 06:34:22,662 Completed processing <Batch: 28 messages, open for 2.31 seconds>.

ingest-customer:

    06:25:30 [INFO] batching-kafka-consumer: Flushing 1 items (from {(u'ingest-events', 0): [45L, 45L]}): forced:False size:False time:True
    06:25:31 [INFO] batching-kafka-consumer: Worker flush took 95ms
    06:25:53 [INFO] batching-kafka-consumer: Flushing 1 items (from {(u'ingest-events', 0): [46L, 46L]}): forced:False size:False time:True
    06:25:53 [INFO] batching-kafka-consumer: Worker flush took 56ms
    06:26:24 [INFO] batching-kafka-consumer: Flushing 1 items (from {(u'ingest-events', 0): [47L, 47L]}): forced:False size:False time:True
    06:26:24 [INFO] batching-kafka-consumer: Worker flush took 10ms
    06:27:23 [INFO] batching-kafka-consumer: Flushing 1 items (from {(u'ingest-events', 0): [48L, 48L]}): forced:False size:False time:True
    06:27:23 [INFO] batching-kafka-consumer: Worker flush took 73ms

and (this one looks suspicious to me because of the “missing key id” error message) relay, switched to debug mode TRACE:

    2020-10-12T06:26:23Z [relay_server::actors::project] DEBUG: project 99a1f406149c47a5b8fd97c82c95ea1d state request amended
    2020-10-12T06:26:23Z [relay_server::actors::project] DEBUG: project state 99a1f406149c47a5b8fd97c82c95ea1d updated
    2020-10-12T06:26:23Z [relay_server::actors::project] DEBUG: project 99a1f406149c47a5b8fd97c82c95ea1d state requested
    2020-10-12T06:26:23Z [relay_server::actors::project] DEBUG: project state 99a1f406149c47a5b8fd97c82c95ea1d updated
    2020-10-12T06:26:23Z [relay_server::actors::events] TRACE: processing json event
    2020-10-12T06:26:23Z [relay_server::actors::events] ERROR: project state for key 99a1f406149c47a5b8fd97c82c95ea1d is missing key id
    2020-10-12T06:26:23Z [relay_server::actors::events] TRACE: sending envelope to kafka
    2020-10-12T06:26:23Z [relay_server::actors::store] TRACE: Sending event item of envelope to kafka

When the upgrade is finished (it takes about 4 hours), I see all my imported projects and issues. But if I try to open the details of an issue, I receive this error:

relay log:

    2020-10-12T13:09:32Z [relay_server::actors::project] DEBUG: project 99a1f406149c47a5b8fd97c82c95ea1d state requested
    2020-10-12T13:09:32Z [relay_server::actors::events] TRACE: queued event
    2020-10-12T13:09:32Z [relay_server::actors::project] DEBUG: project 99a1f406149c47a5b8fd97c82c95ea1d state request amended
    2020-10-12T13:09:32Z [relay_server::actors::project_upstream] DEBUG: updating project states for 1/1 projects (attempt 1)
    2020-10-12T13:09:32Z [relay_server::actors::project_upstream] DEBUG: sending request of size 1
    2020-10-12T13:09:32Z [relay_server::actors::project] DEBUG: project 99a1f406149c47a5b8fd97c82c95ea1d state request amended
    2020-10-12T13:09:32Z [relay_server::actors::events] TRACE: queued event
    2020-10-12T13:09:32Z [relay_server::actors::project] DEBUG: project 99a1f406149c47a5b8fd97c82c95ea1d state request amended
    2020-10-12T13:09:32Z [relay_server::actors::project] DEBUG: project state 99a1f406149c47a5b8fd97c82c95ea1d updated
    2020-10-12T13:09:32Z [relay_server::actors::events] DEBUG: dropped event: event submission rejected with reason: ProjectId
    2020-10-12T13:09:32Z [relay_server::actors::outcome::processing] TRACE: handling outcome
    2020-10-12T13:09:32Z [relay_server::actors::outcome::processing] TRACE: Tracking kafka outcome: TrackRawOutcome { timestamp: "2020-10-12T13:09:32.470154Z", org_id: None, project_id: ProjectId(1), key_id: None, outcome: 3, reason: Some("project_id"), event_id: Some(EventId(6adafc12-9df3-47aa-9ea5-8e8be7c35c3f)), remote_addr: Some(""), source: None }
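To the best of my understanding of Relay's outcome enum (treat this mapping as an assumption, not authoritative), `outcome: 3` with `reason: Some("project_id")` means the event was rejected as invalid because the project id could not be resolved. A tiny sketch for decoding such a `TrackRawOutcome` line:

```python
from typing import Optional

# Outcome codes as I understand Sentry/Relay's outcome enum; this mapping is an assumption.
OUTCOMES = {0: "accepted", 1: "filtered", 2: "rate_limited", 3: "invalid", 4: "abuse"}

def describe_outcome(outcome: int, reason: Optional[str]) -> str:
    """Turn a TrackRawOutcome (outcome, reason) pair into a readable label."""
    name = OUTCOMES.get(outcome, "unknown")
    return f"{name} ({reason})" if reason else name

# The outcome from the log above:
print(describe_outcome(3, "project_id"))  # invalid (project_id)
```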


any help would be appreciated… :slightly_smiling_face:


I have the same problem after upgrading from 9.1.2 to 20.9, but I deploy Sentry in Docker from onpremise.

Do you have a debug view in your app to trigger an error in sentry? If so, can you trigger the error and see if you are getting any exceptions?

I faced a similar problem while upgrading from 9.0. I added a debug view like this:

def sentry_debug_view(request):
    # Raising an undefined name produces a NameError that the Sentry SDK reports.
    raise this_is_a_test_exception_for_sentry  # noqa

When I triggered this view from the browser, there were some exceptions due to the CSRF token, which were resolved as described here: Issue with CSRF token after sentry upgrade
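For completeness, a debug view like this is usually wired up in the project's urls.py. The path and import below are placeholders, not from the original post:

```python
# urls.py: the URL path and the "myapp" module name are placeholders
# for wherever sentry_debug_view lives in your project.
from django.urls import path

from myapp.views import sentry_debug_view  # hypothetical module path

urlpatterns = [
    path("sentry-debug/", sentry_debug_view),
]
```

Opening /sentry-debug/ in the browser should then produce an event in Sentry if the SDK is configured correctly.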

Without some error logs from the Sentry web instances themselves, it is hard to speculate on this issue TBH. Can you share logs from them too?

stdout from container sentry_onpremise_web_1:

    19:24:41 [INFO] sentry.superuser: superuser.request (user_id=1 url=u'' method=u'GET' ip_address=u'')
    19:24:42 [INFO] sentry.superuser: superuser.request (user_id=1 url=u'' method=u'GET' ip_address=u'')
    19:24:42 [WARNING] django.request: Not Found: /api/0/issues/2875/events/latest/ (status_code=404 request=<WSGIRequest: GET u'/api/0/issues/2875/events/latest/'>)

and stdout from container sentry_onpremise_nginx_1:

    - - [12/Oct/2020:19:24:42 +0000] "GET /api/0/issues/2875/ HTTP/1.1" 200 2179 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:81.0) Gecko/20100101 Firefox/81.0" "-"
    - - [12/Oct/2020:19:24:42 +0000] "GET /api/0/issues/2875/events/latest/ HTTP/1.1" 404 28 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:81.0) Gecko/20100101 Firefox/81.0" "-"

I have now invoked an error and seen the exception:

but the time in the exception isn’t correct; the time in Docker is correct.

@mitmafiy maybe you should open another topic for your problem; since I’ve provided a lot of my logs, I don’t want to get anything mixed up here…

@BYK - here is the log of the sentry-web pod/container when I try to open the details of an issue:

    10:23:44 [INFO] sentry.superuser: superuser.request (user_id=10 url=u'
    10:23:54 [INFO] sentry.superuser: superuser.request (user_id=10 url=u'' method=u'GET' ip_address=u'')
    10:23:54 [INFO] sentry.superuser: superuser.request (user_id=10 url=u'' method=u'GET' ip_address=u'')
    10:23:58 [INFO] sentry.superuser: superuser.request (user_id=10 url=u'' method=u'GET' ip_address=u'')
    10:23:58 [WARNING] django.request: Not Found: /api/0/issues/9993/events/latest/ (status_code=404 request=<WSGIRequest: GET u'/api/0/issues/9993/events/latest/'>)

Today I set an internal DNS entry so I don’t have to open the URL “nginx.sentry.svc…” (that is the Kubernetes DNS name of the nginx Service) and to avoid any problems with CSRF.

@sraisl ok, ok, sorry)
I had hoped it would be faster for us; the problems seem to be the same.

I just tried to create a new project and created a sample issue in there. It seems that newly created issues can be opened.

(but I also get a 404 on the /commiters URL)

Looks like something went wrong during the migration, but it is hard to pinpoint what. Do you still have your old backups and install/upgrade logs?

Thanks for keeping an eye on this. We have our 9.0.0 setup still running on a separate VM, so nothing is lost. I would wipe the new setup and start the migration all over. Could you tell me which logs I have to watch during the migration?

The most important one would be the output of the sentry upgrade command. That said, during that upgrade almost all other services need to be up and running (Kafka, Zookeeper, ClickHouse, Snuba, post-process-forwarder, Redis, and Postgres), and any issue with those would cause missing or corrupted events. You’ll need to go to 9.1.2 first, as usual.

If you have time, I’d wait until the 15th (Thursday) for the latest 20.10.0 release and use that.


Still no luck for me. I tried a clean setup with 20.10.1; there were no errors in the Sentry web/API log during the migration/upgrade from the 9.1.2 database.
But I got a lot of errors in the Postgres log during the upgrade:

    ERROR:  null value in column "first_seen" violates not-null constraint
    DETAIL:  Failing row contains (278, dd460301-68f3-48e1-864c-56f246392129, 7jKvPh4lQtT1Btys9IFNj0o8m4o8lQ9UiDMW2qNpIsU, null, null, t).
    STATEMENT:  INSERT INTO "sentry_relay" ("relay_id", "public_key", "first_seen", "last_seen", "is_internal") VALUES ('dd460301-68f3-48e1-864c-56f246392129', '7jKvPh4lQtT1Btys9IFNj0o8m4o8lQ9UiDMW2qNpIsU ....

    STATEMENT:  SELECT "sentry_exporteddata"."id", "sentry_exporteddata"."organization_id", "sentry_exporteddata"."user_id", "sentry_exporteddata"."file_id", "sentry_exporteddata"."date_added", "sentry_ex
    ERROR:  relation "sentry_exporteddata" does not exist at character 332
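To check whether the restored dump itself is the problem, the two failing objects from the errors above could be inspected directly in Postgres. This is only a sketch; `$DATABASE_URL` is a placeholder for your connection string:

```shell
# Placeholder connection string; table/column names are taken from the errors above.
# Are there sentry_relay rows without a first_seen timestamp?
psql "$DATABASE_URL" -c "SELECT relay_id, first_seen, last_seen FROM sentry_relay WHERE first_seen IS NULL;"

# Does the table the failing SELECT referenced exist at all after the migrations?
psql "$DATABASE_URL" -c '\d sentry_exporteddata'
```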

Any ideas on this one? Is something wrong with my database dump?
I’ll check the Kafka instance a little later; since the volume size of the Kafka pod increased a lot (~12 GB), it seems that the events have been committed to Kafka.
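Before dumping messages, a couple of quick checks inside the Kafka pod can confirm the topics exist and whether the Snuba consumer is lagging. The consumer group name below is an assumption (I believe Snuba's default is `snuba-consumers`); adjust it to your setup:

```shell
# List topics to confirm "snuba bootstrap" created them (events, ingest-events, outcomes, ...):
kafka-topics --bootstrap-server localhost:9092 --list

# Show per-partition offsets and lag; the group name is an assumption from Snuba's defaults:
kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group snuba-consumers
```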

edit: I checked Kafka just now; it seems the events got stored. What I tried (in the Kafka pod):
kafka-console-consumer --topic events --from-beginning --bootstrap-server localhost:9092

and the output was a lot of messages like this (I cut out the last few lines):

[2,"insert",{"organization_id":1,"platform":null,"project_id":32,"event_id":"cf90b7c9e81b4e758991a7e4913f1d1a","message":"Error while loading address from pipe.line. ID=d1e0805e-7b54-4220-b0f1-62714d717887, technicalLocation=8802888, sapNumber=000014912030 Error while loading address from pipe.line. ID=d1e0805e-7b54-4220-b0f1-62714d717887, technicalLoc...","group_id":2879,"data":{"extra":{"Sentry-Threadname":"scheduling-2"},"stacktrace":{"frames":[{"function":"invoke","symbol_addr":null,"abs_path":"","errors":null,"pre_context":null,"post_context":null,"vars":null,"package":null,"instruction_addr":null,"symbol":null,"image_addr":null,"module":"java.lang.reflect.Method","in_app":false,"platform":null,"lineno":567,"colno":null,"trust":null,"filename":"","data":null,"context_line":null,"raw_function":null},{"function":"invoke","symbol_addr":null,"abs_path":"","errors":null,"pre_context":null,"post_context":null,"vars":null,"package":null,"instruction_addr":null,"symbol":null,"image_addr":null,"module":"jdk.internal.reflect.DelegatingMethodAccessorImpl","in_app":false,"platform":null,"lineno":43,"colno":null,"trust":null,"filename":"","data":null,"context_line":null,"raw_function":null},{"function":"invoke","symbol_addr":null,"abs_path":null,"errors":null,"pre_context":null,"post_context":null,"vars":null,"package":null,"instruction_addr":null,"symbol":null,"image_addr":null,"module":"jdk.internal.reflect.GeneratedMethodAccessor195","in_app":false,"platform":null,"raw_function":null,"colno":null,"trust":null,"filename":null,"data":null,"context_line":null,"lineno":null},{"function":"dispatchNotifications","symbol_addr":null,"abs_path":"","errors":null,"pre_context":null,"post_context":null,"vars":null,"package":null,"instruction_addr":null,"symbol":null,"image_addr":null,"module":"","in_app":true,"platform":null,"lineno":27,"colno":null,"trust":null,"filename":"","data":null,"context_line":null,"raw_function":null},{"function":"dispatchNotifications","symbol_addr":null,
"abs_path":"","errors":null,"pre_context":null,"post_context":null,"vars":null,"package":null,"instruction_addr":null,"symbol":null,"image_addr":null,"module":"","in_app":true,"platform":null,"lineno":57,"colno":null,"trust":null,"filename":"","data":null,"context_line":null,"raw_function":null},{"function":"forEach","symbol_addr":null,"abs_path":"","errors":null,"pre_context":null,"post_context":null,"vars":null,"package":null,"instruction_addr":null,"symbol":null,"image_addr":null,"module":"java.util.ArrayList","in_app":false,"platform":null,"lineno":1540,"colno":null,"trust":null,"filename":"","data":null,"context_line":null,"raw_function":null},{"function":"lambda$dispatchNotifications$0","symbol_addr":null,"abs_path":"","errors":null,"pre_context":null,"post_context":null,"vars":null,"package":null,"instruction_addr":null,"symbol":null,"image_addr":null,"module":"","in_app":true,"platform":null,"lineno":60,"colno":null,"trust":null,"filename":"","data":null,"context_line":null,"raw_function":null},{"function":"enrichAndDispatchNotification","symbol_addr":null,"abs_path":"","errors":null,"pre_context":null,"post_context":null,"vars":null,"package":null,"instruction_addr":null,"symbol":null,"image_addr":null,"module":"","in_app":true,"platform":null,"lineno":86,"colno":null,"trust":null,"filename":"","data":null,"context_line":null,"raw_function":null}]},"_meta":{"key_id":{"":{"val":32,"err":[["invalid_data",{"reason":"expected a string"}]]}}},"tags":[["environment","prod"],["logger",""],["sentry:release","28.0"],["server_name",""],["level","error"]],"timestamp":1571733297.0,"received":1571733297.0,"project":32,"version":"6","logentry":{"message":"Error while loading address from pipe.line. ID={}, technicalLocation={}, sapNumber={}","params":["d1e0805e-7b54-4220-b0f1-62714d717887","8802888","000014912030"],"formatted":"Error while loading address from pipe.line. 
ID=d1e0805e-7b54-4220-b0f1-62714d717887, technicalLocation=8802888, sapNumber=000014912030"},"fingerprint":["{{ default }}"],"key_id":null,"metadata":{"title":"Error while loading address from pipe.line. ID=d1e0805e-7b54-4220-b0f1-62714d717887, technicalLoc..."},"type":"default","sdk":{"version":"1.7.22-2a267","name":"sentry-java","packages":null,"integrations":["logback"]}},"primary_hash":"07b9c7c3dc2d017b8fecc82b4c96ca83","retention_days":365,"datetime":"2019-10-22T08:34:57.000000Z"},{"is_new_group_environment":false,"is_new":false,"skip_consume":true,"is_regression":false}]

Processed a total of 1268 messages
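Each message on the events topic, like the one dumped above, is a JSON array of the shape `[version, "insert", payload, state]`. A quick sanity check on the payload fields can be scripted; the message below is a synthetic, heavily trimmed stand-in with field names and values copied from the dump above:

```python
import json

# Synthetic, heavily trimmed message in the same [version, "insert", payload, ...] shape
# as the dump above; field names/values are copied from it.
raw = ('[2,"insert",{"project_id":32,'
       '"event_id":"cf90b7c9e81b4e758991a7e4913f1d1a",'
       '"group_id":2879,'
       '"datetime":"2019-10-22T08:34:57.000000Z"}]')

version, action, payload = json.loads(raw)[:3]
print(version, action, payload["project_id"], payload["group_id"])  # prints: 2 insert 32 2879
```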

@BYK this seems to be the same problem as After upgrade from 9.1.2 to 10.0/20.9/latest no events details - should I share any more info or logs?

If you think they are the same, I’d try one of the nightly builds to see if it fixes the problem; as you can see there, we have merged a fix.

Yes! Finally :smiley: Thanks a lot!
I tried the nightly-py3 image today and everything is working as expected!
I guess this will become “final” in the next release in November?


Glad it was resolved! Big thanks to @lynnagara who made the fix and @mitmafiy who helped us identify the issue in the first place :slight_smile:

Yup, November 16th to be exact.