Database cleanup

Hi All,

I am new to Sentry. I have installed Sentry in my office and it is working fine.

The problem is the Sentry database (Postgres). It is growing daily and now occupies 34 GB.

I have added a cron entry that runs cleanup with a 15-day retention:

/opt/sentry/bin/sentry cleanup --days=15

Even so, the database is not shrinking.

I can still see issues that are a month old, but keeping 15 days of issues is enough for me.

Please suggest how I can free up disk space.

Thanks in Advance.

Postgres will not reclaim disk space without an offline operation, and to be honest, you’re going to need pretty significant storage if you have any reasonable throughput. Generally the database shouldn’t grow much if your usage isn’t growing and you’re running the cleanup continuously.
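
Before deciding how to reclaim space, it can help to see which tables actually hold the data. The query below is a sketch that uses the standard pg_statio_user_tables view and the built-in size functions; it assumes you are connected to the Sentry database, and the limit of 10 is arbitrary.

```sql
-- List the ten largest tables in the current database.
-- pg_total_relation_size includes the table's indexes and TOAST data.
SELECT relname AS table_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;
```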

Hi,
Thanks for your valuable reply.

How can I reclaim the disk space with an offline operation? Kindly send me the steps.

It will be helpful for me.

Thanks in Advance.

With Postgres, you can try running a VACUUM FULL. It takes an exclusive lock on each table while that table is being rewritten, which renders the database effectively unusable for potentially many hours. It still doesn’t guarantee freeing up space if you legitimately have lots of data.
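
A minimal sketch of what that looks like in psql, assuming you are connected to the Sentry database and Sentry itself is stopped. nodestore_node is used here only as an example of a table that tends to be large. Keep in mind that VACUUM FULL writes a new copy of each table before dropping the old one, so it needs free disk space roughly equal to the size of the table being rewritten.

```sql
-- Rewrite a single table, returning the freed space to the operating system.
-- The table is locked exclusively for the duration of the rewrite.
VACUUM (FULL, VERBOSE, ANALYZE) nodestore_node;

-- Or rewrite every table in the current database, one after another.
VACUUM (FULL, VERBOSE, ANALYZE);
```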

I will also note: please read the Postgres documentation about this. VACUUM FULL is not something that you want to run regularly, and needing it is usually an indication of a problem. Normal autovacuuming should be enough to keep things chill, unless you weren’t running cleanup previously and have only just run it.
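
One way to check whether autovacuum is keeping up is to look at the dead-row counts per table. This is a sketch against the standard pg_stat_user_tables statistics view, again assuming a connection to the Sentry database; the counts are estimates, and the limit of 10 is arbitrary.

```sql
-- Tables with the most dead rows, and when they were last vacuumed.
-- A large n_dead_tup with an old or empty last_autovacuum suggests
-- autovacuum is not keeping up with the deletes.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       last_vacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```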

I use vacuumdb -U postgres -d postgres -v -f --analyze :joy:

Has anyone had experience partitioning tables in Sentry?
It may be much simpler to drop old tables than to run DELETE and then VACUUM FULL.

Some of our tables support using Django’s DATABASE_ROUTERS configuration to vertically partition tables off, but we do not support horizontal partitioning for any of the tables. Over time this will become less and less of a problem as high cardinality/high volume event data is fully removed from Postgres and put into other data services like Snuba.

Hi,
I am trying to use Postgres table partitioning in the Sentry database. To do this, I selected several tables, re-created them with a new field ‘day’, and then created partitions using this field as the partitioning key:

CREATE TABLE sentry_commit (
    id bigint NOT NULL,
    organization_id integer NOT NULL,
    repository_id integer NOT NULL,
    key character varying(64) NOT NULL,
    date_added timestamp with time zone NOT NULL,
    author_id bigint,
    message text,
    day date not null default now(),
    primary key (id, day),
    unique (repository_id, key, day),
    CONSTRAINT sentry_commit_organization_id_check CHECK ((organization_id >= 0)),
    CONSTRAINT sentry_commit_repository_id_check CHECK ((repository_id >= 0))
) PARTITION BY RANGE (day);

CREATE TABLE IF NOT EXISTS sentry_commit_20201006
    PARTITION OF sentry_commit (id DEFAULT nextval('sentry_commit_id_seq'::regclass))
    FOR VALUES FROM ('2020-10-06') TO ('2020-10-07');

I have a problem with the inserts that happen while the Sentry application is running. In the logs I see this error:

STATEMENT: INSERT INTO "sentry_commit" ("organization_id", "repository_id", "key", "date_added", "author_id", "message") VALUES (1, 2, '7fcb5e811e45e764a59509a2262d14d70b11fc0a', '2020-10-06 12:15:27+00:00', 20, '(#95411) Add payment methods ') RETURNING "sentry_commit"."id"
ERROR: null value in column "id" violates not-null constraint
DETAIL: Failing row contains (null, 1, 2, 1076315d3f59ddff64b85da5d374e562b6a121fe, 2020-10-06 12:15:28+00, 20, Merge branch 'f/95411' into 'master'.

I understand why the error occurs, since the insert supplies no value for id and the column ends up null. But how does the same insert work with the unmodified tables? They also have NOT NULL constraints.
The error occurs with the following tables:

  • sentry_eventmapping
  • sentry_eventtag
  • sentry_eventuser
  • sentry_activity
  • sentry_commit
  • sentry_filtervalue
  • sentry_message

With other tables, such as nodestore_node and sentry_messagefiltervalue, partitioning works. It would be useful to know how inserts differ between these tables, and how I can fix the partition creation.

Thank you.
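
A likely explanation for the error above, offered as a sketch and not as supported advice: in the unmodified tables the id column normally carries its own default (a serial-style nextval on a sequence), so an INSERT that omits id still gets a value. In the re-created parent table, id is declared as plain bigint NOT NULL, and the default is set only on the partition. PostgreSQL does not apply a partition's column defaults when a row is inserted through the partitioned parent, so id stays null. Tables where the application supplies the id itself would presumably not be affected, which may be why some tables appear to work. The statements below assume the sequence sentry_commit_id_seq from the partition definition still exists.

```sql
-- Put the default on the partitioned parent, since inserts routed
-- through the parent use the parent's defaults, not the partition's.
-- Assumption: the sequence sentry_commit_id_seq already exists.
ALTER TABLE sentry_commit
    ALTER COLUMN id SET DEFAULT nextval('sentry_commit_id_seq'::regclass);

-- Verify that the parent's id column now reports a default.
SELECT column_name, column_default
FROM information_schema.columns
WHERE table_name = 'sentry_commit'
  AND column_name = 'id';
```

The same change would be needed on every re-created parent table in the list, each with its own sequence name.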

We don’t support any kind of partitioning in the way you’re attempting it. While it may work, it’s really going to be luck of the draw. Because of that I don’t feel comfortable providing advice as it is very likely to break in the future.

If you need to break up data I would suggest just splitting off individual tables, rather than trying to horizontally partition a single table.

Additionally, I’m guessing you’re on a fairly old version of Sentry. Sentry 10 uses ClickHouse, which already handles partitioning of nodes, and that drastically reduces the burden on Postgres for events.