Snuba cleanup doesn't work as expected #6254

Closed
AnubisFUp opened this issue Aug 29, 2024 · 3 comments

Comments

@AnubisFUp

Hello!

After upgrading Sentry from version 22.* to 24.8.0.dev0, the ClickHouse storage size doesn't shrink even after all events are deleted.

I'm not experienced at managing Sentry, and I've run into a problem with ClickHouse storage because I don't clearly understand what it stores.
I know that the nodestore_node table in Postgres stores raw events, and I see cleanup cron jobs for it in docker-compose.yml, but I don't see anything similar for ClickHouse.
I have read this issue and assume that ClickHouse should also clean up data that is older than my SENTRY_EVENT_RETENTION_DAYS.

Environment

  • Self-hosted Sentry 24.8.0.dev0
  • SENTRY_EVENT_RETENTION_DAYS=14 in .env file
  • Before the upgrade, the clickhouse-data Docker volume was about 60 GB
  • After sentry cleanup it still keeps growing and is now about 70 GB

Steps to Reproduce

What was done:
Before the upgrade to 24.8.0.dev0

  1. manual: sentry cleanup --days 0
  2. manual: psql -d postgres -c "VACUUM FULL nodestore_node"
    After these steps the ClickHouse data volume (not the logs) still did not shrink.

After the upgrade to 24.8.0.dev0

  1. manual: sentry cleanup --days 0
  2. manual: psql -d postgres -c "VACUUM FULL nodestore_node"
    After these steps the ClickHouse data volume (not the logs) still did not shrink.
    After discovering snuba cleanup, I ran it:
  3. snuba cleanup --storage transactions --dry-run "False" --clickhouse-host clickhouse --clickhouse-port "9000"
    This didn't help either. See Actual Result.

Expected Result

As mentioned in getsentry/self-hosted#1172 (comment),
I assumed the ClickHouse data would be cleaned up as well, since I had run sentry cleanup --days 0 beforehand.
In the web interface, every issue only shows events from the single day since I ran sentry cleanup, but the ClickHouse data still hasn't shrunk :(

Actual Result

2024-08-29 06:51:59,390 Initializing Snuba...
2024-08-29 06:52:01,827 Snuba initialization took 2.4378621354699135s
2024-08-29 06:52:01,846 Dropped 0 partitions on clickhouse:9000

61G /var/lib/docker/volumes/sentry-clickhouse/_data/data
9,6G /var/lib/docker/volumes/sentry-clickhouse/_data/store

@volokluev
Member

volokluev commented Sep 11, 2024

snuba cleanup only cleans up stale parts, it's likely you don't have any. What are you actually trying to do?

Are you trying to drop all your existing data?
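To check whether a table has any old partitions left for a cleanup to drop, one option is to list its active partitions straight from ClickHouse. The snippet below is only a sketch: the container name is taken from later in this thread, `spans_local` is used as a default table because it appears in the size listing further down, and `clickhouse-client` is assumed to be available inside the container. The helper names `partitions_sql` and `show_partitions` are hypothetical.

```shell
#!/bin/sh
# Assumption: container name as used elsewhere in this thread; override if yours differs.
CLICKHOUSE_CONTAINER="${CLICKHOUSE_CONTAINER:-sentry-self-hosted-clickhouse-1}"

# Print SQL that lists the active partitions of one table with their size on disk.
# $1: table name (defaults to spans_local, a table named in this thread).
partitions_sql() {
  table="${1:-spans_local}"
  cat <<SQL
SELECT
    partition,
    count() AS active_parts,
    formatReadableSize(sum(bytes_on_disk)) AS size_on_disk,
    max(modification_time) AS last_modified
FROM system.parts
WHERE active AND table = '${table}'
GROUP BY partition
ORDER BY partition
SQL
}

# Run the query inside the container (needs a running ClickHouse server).
show_partitions() {
  docker exec -i "$CLICKHOUSE_CONTAINER" clickhouse-client --query "$(partitions_sql "$1")"
}
```

Calling `show_partitions spans_local` prints one row per partition. If every listed partition falls inside the retention window, a result such as "Dropped 0 partitions" would be consistent with there being nothing stale to remove.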

@AnubisFUp
Author

AnubisFUp commented Sep 17, 2024

> snuba cleanup only cleans up stale parts, it's likely you don't have any. What are you actually trying to do?
>
> Are you trying to drop all your existing data?

Hi, yes, I tried to clear the database of all events, but for some reason ClickHouse does not clean up.

These are my volume sizes right now for Sentry 24.8.0.dev0:

92G     /var/lib/docker/volumes/sentry-clickhouse/_data
10M     /var/lib/docker/volumes/sentry-data/_data
6,4G    /var/lib/docker/volumes/sentry-kafka/_data
66G     /var/lib/docker/volumes/sentry-postgres/_data
4,7M    /var/lib/docker/volumes/sentry-redis/_data
3,1G    /var/lib/docker/volumes/sentry-self-hosted_sentry-clickhouse-log/_data
4,0K    /var/lib/docker/volumes/sentry-self-hosted_sentry-kafka-log/_data
24K     /var/lib/docker/volumes/sentry-self-hosted_sentry-nginx-cache/_data
4,0K    /var/lib/docker/volumes/sentry-self-hosted_sentry-secrets/_data
268K    /var/lib/docker/volumes/sentry-self-hosted_sentry-smtp/_data
396K    /var/lib/docker/volumes/sentry-self-hosted_sentry-smtp-log/_data
4,0K    /var/lib/docker/volumes/sentry-self-hosted_sentry-vroom/_data
456K    /var/lib/docker/volumes/sentry-self-hosted_sentry-zookeeper-log/_data
158M    /var/lib/docker/volumes/sentry-symbolicator/_data
1,4M    /var/lib/docker/volumes/sentry-zookeeper/_data

The ClickHouse volume is larger than the Postgres one. How can I clear it? What data does ClickHouse store, and what can I do in my case? I keep data for one week, and I doubt that data for this period can weigh 90 GB.

[UPDATE]
I inspected ClickHouse and found the following table sizes:

spans_local                                        │ 35.22 GiB
generic_metric_distributions_aggregated_local      │ 29.84 GiB
outcomes_raw_local                                 │ 9.80 GiB
...

After a little research, I found that the data in the spans_local and generic_metric_distributions_aggregated_local tables probably corresponds to the Performance -> Transactions tab in the Sentry web interface.

Should these tables take up so much space? Are there any policies to clear them periodically, or to turn them off altogether, since I don't need transaction data?
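A per-table size listing like the one above can be produced from ClickHouse's `system.parts` table. The following is a minimal sketch, not necessarily the query the author ran: it assumes the container name used later in this thread and that `clickhouse-client` is available inside the container. The helper names `table_sizes_sql` and `show_table_sizes` are hypothetical.

```shell
#!/bin/sh
# Assumption: container name as used elsewhere in this thread; override if yours differs.
CLICKHOUSE_CONTAINER="${CLICKHOUSE_CONTAINER:-sentry-self-hosted-clickhouse-1}"

# Print SQL that sums the on-disk bytes of active parts per table, largest first.
table_sizes_sql() {
  cat <<'SQL'
SELECT
    database,
    table,
    formatReadableSize(sum(bytes_on_disk)) AS size_on_disk,
    sum(rows) AS row_count
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY sum(bytes_on_disk) DESC
SQL
}

# Run the query inside the container (needs a running ClickHouse server).
show_table_sizes() {
  docker exec -i "$CLICKHOUSE_CONTAINER" clickhouse-client --query "$(table_sizes_sql)"
}
```

Running `show_table_sizes` before and after any cleanup shows which tables actually changed size, which says more than the total size of the Docker volume.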

@nikhars
Member

nikhars commented Sep 30, 2024

Hello. If you wish to reclaim the disk space, you can run the following command from the ClickHouse console.

  1. Get to the ClickHouse console of the Docker container using the following command:
docker exec -it sentry-self-hosted-clickhouse-1 clickhouse-local
de6affbe95eb :)

  2. Use the ClickHouse TRUNCATE command to empty the tables you do not need:

TRUNCATE TABLE <table_name>
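The same step can be scripted for several tables at once. This is a hedged sketch rather than an official procedure: it assumes that `clickhouse-client` inside the container can reach the running server (the comment above opens the console with `clickhouse-local`), and the helper names `truncate_statement` and `truncate_tables` are hypothetical. TRUNCATE permanently deletes every row in the table, so check the table names before running it.

```shell
#!/bin/sh
# Assumption: container name as given in the comment above; override if yours differs.
CLICKHOUSE_CONTAINER="${CLICKHOUSE_CONTAINER:-sentry-self-hosted-clickhouse-1}"

# Print a TRUNCATE statement for one table, rejecting names that contain
# anything other than letters, digits, or underscores.
truncate_statement() {
  case "$1" in
    ''|*[!A-Za-z0-9_]*)
      echo "invalid table name: $1" >&2
      return 1
      ;;
  esac
  printf 'TRUNCATE TABLE %s\n' "$1"
}

# Truncate each table given as an argument, one query at a time.
# DESTRUCTIVE: all rows in the listed tables are removed.
truncate_tables() {
  for table_name in "$@"; do
    sql="$(truncate_statement "$table_name")" || return 1
    docker exec -i "$CLICKHOUSE_CONTAINER" clickhouse-client --query "$sql"
  done
}
```

For the two largest tables reported in this thread, the call would be `truncate_tables spans_local generic_metric_distributions_aggregated_local`.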
