Hi.
We’re running version 20.7.2 and have run into a situation where the Postgres database grows without bound.
The cleanup cron job is set up to run every 24 hours and SENTRY_EVENT_RETENTION_DAYS=30, but the nodestore_node table keeps growing by roughly 9 GB a day.
Here is the size of the nodestore_node table compared with the other tables, along with the volume sizes of the other services.
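For context, the cleanup job is just the stock command; our cron entry looks roughly like the standard onpremise sentry-cleanup service (the wrapper and schedule below are my approximation, not a verbatim copy of our crontab):

0 0 * * * gosu sentry sentry cleanup --days $SENTRY_EVENT_RETENTION_DAYS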
 table_schema |       table_name        | row_estimate | total  | index   | toast      | table
--------------+-------------------------+--------------+--------+---------+------------+---------
 public       | nodestore_node          |  4.35109e+07 | 213 GB | 7849 MB | 202 GB     | 4107 MB
 public       | sentry_releasefile      |       825410 | 370 MB | 239 MB  | 8192 bytes | 131 MB
 public       | sentry_grouphash        |        12398 | 357 MB | 225 MB  |            | 132 MB
 public       | sentry_file             |       884136 | 332 MB | 201 MB  | 8192 bytes | 131 MB
 public       | sentry_fileblobindex    |       933114 | 158 MB | 102 MB  |            | 55 MB
 public       | sentry_commitfilechange |       119278 | 41 MB  | 24 MB   |            | 17 MB
 public       | sentry_fileblob         |        76091 | 32 MB  | 21 MB   | 8192 bytes | 11 MB
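The sizes above come from a table-size query along the lines of the well-known one from the PostgreSQL wiki; this is a reconstruction, not the exact statement we ran:

-- per-table totals, split into index, TOAST and heap portions
SELECT n.nspname AS table_schema,
       c.relname AS table_name,
       c.reltuples AS row_estimate,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS "total",
       pg_size_pretty(pg_indexes_size(c.oid)) AS "index",
       pg_size_pretty(pg_total_relation_size(c.reltoastrelid)) AS "toast", -- NULL when the table has no TOAST relation
       pg_size_pretty(pg_relation_size(c.oid)) AS "table"
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r' AND n.nspname = 'public'
ORDER BY pg_total_relation_size(c.oid) DESC;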
sentry-postgres 1 231.5GB
sentry-zookeeper 1 406kB
sentry-kafka 1 45.41GB
sentry-redis 1 3.187MB
sentry-clickhouse 1 27.58GB
sentry-data 6 40.58GB
The last autovacuum of nodestore_node was on 2020-10-23 (before the upgrade to 20.7.2).
    relname     | last_vacuum |        last_autovacuum        |         last_analyze          |       last_autoanalyze
----------------+-------------+-------------------------------+-------------------------------+-------------------------------
 nodestore_node |             | 2020-10-23 20:27:16.052523+00 | 2020-11-02 16:55:38.190337+00 | 2020-11-18 08:08:10.848068+00
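Those timestamps come from pg_stat_user_tables, roughly this query:

SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'nodestore_node';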
The last time we ran VACUUM FULL it freed up about 70 GB, when the whole PG volume was around 170 GB. But since then, as you can see, it has grown even bigger.
Could you please tell us how to determine whether this is normal behaviour, and what we can do if it isn’t? I’ve read that vacuuming the database periodically isn’t really a good option.
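For completeness, this is roughly what we ran, along with a dead-tuple query that shows how much bloat autovacuum still has to deal with (a sketch, not a transcript of the exact commands):

-- rewrites the table and returns space to the OS, but holds an ACCESS EXCLUSIVE lock while it runs
VACUUM (FULL, VERBOSE) nodestore_node;

-- how many dead rows are still waiting to be cleaned up
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'nodestore_node';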
P.S. In the Nginx logs we see entries like this:
{ "@timestamp": "2020-11-18T12:00:43+00:00", "remote_addr": "IP_ADDRESS", "body_bytes_sent": "41", "server_name": "SERVER_ADDRESS", "dest_port": "PORT", "host": "SERVER_ADDRESS", "http_x_header": "", "query_string": "", "status": "200", "uri_path": "/api/12/envelope/", "request_method": "POST", "http_referrer": "", "content_type": "application/json", "http_x_forwarded_for": "", "upstream_response_time": "0.002", "upstream_connect_time": "0.001", "upstream_header_time": "0.002", "request_time": "0.006", "request_id": "2b295c63627d3e77d963a978460ab47e", "http_user_agent": "" }
These requests come in about 20 times a second (with different project IDs). They used to be denied (though we had no problems receiving events/issues) before we upgraded to 20.7.2 and added the Relay service. Could this be related to our problem?