Postgres is growing rapidly - Manual cleanup

Hello,

I’m using Sentry on-premises and it’s working well.

But Sentry is constantly filling up my disk. I checked the location that is taking up all the space, and it’s /var/lib/docker/volumes/sentry-postgres/_data/base/*
It’s already over 50 GB and growing all the time. The files in there are owned by the user “systemd-coredump”, but I guess these aren’t actually core dumps, just the normal data generated by Sentry?

Is there any way to clean up the space, or to set up Sentry so that it doesn’t generate that much data in the first place? Or is it simply normal for Sentry to take up this much space, and do I just have to keep buying more and more disk?

Thank you very much


I have the same issue. Please let me know if you found a solution in the meantime!

@chrisvdb Please refer to

or any search on “postgres” or “nodestore”.

However, if the files are actually created by systemd-coredump (as described by @lukas), I believe your database may be crashing regularly?

Thank you for the information… appreciated!

Just checked my installation, and it’s the sentry-postgres docker volume that’s continuously growing.
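(To confirm from inside Postgres which database is actually taking the space, a quick sketch of a query you could run in that container would be:)

-- List databases by on-disk size, largest first.
SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;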

I’m wondering if I could do the following:

  • install Sentry
  • configure Sentry
  • backup the volume
  • run Sentry until volume > x
  • stop Sentry, restore the volume, restart Sentry
  • repeat

That will constantly reset most state within Sentry; I’m not sure how useful the product remains if, e.g., all your issue-triaging progress resets. I would look into which files within that volume actually take up space, or which tables.

Yes, fully agreed that this would limit the usefulness… but for our use case it wouldn’t be much of an issue. We normally address issues within hours, or at most a few days, of them popping up in Sentry. So we don’t need the long issue history.

Agreed that emptying a table, instead of completely resetting/wiping the whole database, would definitely be preferable. But I would need guidance on that from the Sentry team…

For us to help you we need to know which postgres table is primarily responsible for the excessive disk usage. I can’t give definitive advice on how to figure that out, but a quick google search gives me e.g. this: https://wiki.postgresql.org/wiki/Disk_Usage
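For example, a query along those lines (only a sketch, to be run with psql against the Sentry database) would be:

-- One common way to list the largest relations by total size (table + indexes + TOAST).
SELECT relname AS table_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;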

I’ll try to see how I can connect to the database… and explore the schema. Might take a few days, but will report back here.

I have the same issue… I’m at 50 GB a day now.
The only suggestions I see in the threads above are:

  1. to turn off organizations:performance-view
  2. to maybe switch our backends. Is there an option to switch our backends to something more scalable, or at least move the bulk of the data off Postgres?
    Context: I’m using the helm chart here

I’ve also changed my retention to 30 days, and the cleanup job is now running every day.
I also tried VACUUM FULL to try to reclaim space. None of this seems to have helped the issue.
If there are any suggestions here, they would be appreciated.

OK, I have succeeded in connecting to the Postgres database.

One table accounts for >99% of the database size: nodestore_node. It currently holds 8,581,322 rows, is 30 GB in size, and is continuously growing.
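(For anyone who wants to reproduce the measurement, a sketch of how these numbers can be checked, assuming the standard public.nodestore_node table:)

-- Row count and total on-disk size (table + indexes + TOAST) of the nodestore table.
SELECT count(*) AS row_count FROM public.nodestore_node;
SELECT pg_size_pretty(pg_total_relation_size('public.nodestore_node')) AS total_size;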

@untitaker, what more information would be useful?

Sorry for not responding for so long.

Since Sentry was not working correctly, I had to delete the installation and reinstall it on a clean machine. This time I disabled transactions (performance monitoring) by setting the traces sample rate to 0, and it seems that this was the cause of the large amount of data that was being stored.

But I still don’t know why the files are created by systemd-coredump, because my DB wasn’t crashing.
I’m still a little bit surprised about the amount of data stored by transactions/performance. I can remember that Sentry saved about 150k transactions, which took about 60 GB of storage. Is that normal behaviour?

Just checked, and the oldest items in the table have a 2021-02-16 timestamp despite the retention period being 1 week.
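(Checked with something like the following, against the same table:)

-- Oldest entry still present in the nodestore table.
SELECT min("timestamp") AS oldest_entry FROM public.nodestore_node;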

A few questions:

  • any idea why this would be the case?
  • can I safely delete all items older than 1 week, or would that result in an unstable system?
  • should the retention period affect performance traces as well?

@chrisvdb I’m not aware of any Sentry option that would allow you to tweak event retention to 1 week. Yes, you can delete any items from there; it will cause a little bit of instability, but not a lot.

I meant the auto-resolve setting for issues…

For others facing the same issue, I have done the following:

DELETE FROM public.nodestore_node WHERE "timestamp" < NOW() - INTERVAL '10 day';
VACUUM FULL public.nodestore_node;

Disk space usage went from 30 GB to ~3 GB. Sentry seems to be functional still…
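If the single large DELETE holds locks for too long on a busy instance, a batched variant might be gentler. This is only a sketch, assuming the table’s primary key column is named id (not confirmed in this thread); repeat it until it affects 0 rows:

-- Delete old nodestore rows in chunks to keep each transaction short.
DELETE FROM public.nodestore_node
WHERE id IN (
    SELECT id
    FROM public.nodestore_node
    WHERE "timestamp" < NOW() - INTERVAL '10 day'
    LIMIT 10000
);
-- Plain VACUUM (without FULL) makes the freed space reusable without the
-- exclusive lock VACUUM FULL takes, but it does not shrink the files on disk.
VACUUM public.nodestore_node;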

Yes, auto-resolve can probably be set to 1 week, but we still allow you to view 90 days’ worth of event data, and also to un-resolve an issue if it starts occurring again.

I’m currently still facing the same issues…
Disk usage is growing at more than ~50 GB a day.
So far, these are the suggestions I have tried:

  1. Turning off organizations:performance-view (didn’t see much of a decrease, or any at all)
  2. Changing the retention to 30 days. The cleanup job ran every day. (also didn’t see much of a decrease)
  3. DELETE FROM public.nodestore_node WHERE "timestamp" < NOW() - INTERVAL '30 day';
    VACUUM FULL public.nodestore_node; (this did reclaim space, but disk usage is still insanely high, so it only bought me time)
  4. Switching our backends as per the other thread suggested above. Is there an option to switch our backends to something more scalable, or at least move the bulk of the data off Postgres?
    Context: I’m using the helm chart here (haven’t tried this, but it doesn’t seem to be documented or supported by the helm chart)

Is there any way to decrease the amount of data, or to store it somewhere else?

Are there any suggestions?

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.