Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(pageserver): filter keys with gc-compaction #9004

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

skyzh
Copy link
Member

@skyzh skyzh commented Sep 16, 2024

Problem

Part of #8002

Close #8920

Legacy compaction (as well as gc-compaction) rely on the GC process to remove unused layer files, but this relies on many factors (i.e., key partition) to ensure data in a dropped table can be eventually removed.

In gc-compaction, we consider the keyspace information when doing the compaction process. If a key is not in the keyspace, we will skip that key and not include it in the final output.

However, this is not easy to implement because gc-compaction considers branch points (i.e., retain_lsns) and the retained keyspaces could change across different LSNs. Therefore, for now, we only remove aux v1 keys in the compaction process.

Summary of changes

  • Add FilterIterator to filter out keys.
  • Integrate FilterIterator with gc-compaction.
  • Add collect_gc_compaction_keyspace for a spec of keyspaces that can be retained during the gc-compaction process.

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@skyzh skyzh requested a review from a team as a code owner September 16, 2024 01:02
@skyzh skyzh requested a review from arpad-m September 16, 2024 01:02
Copy link

@orca-security-us orca-security-us bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Orca Security Scan Summary

Status Check Issues by priority
Passed Passed Infrastructure as Code high 0   medium 0   low 0   info 0 View in Orca
Passed Passed Secrets high 0   medium 0   low 0   info 0 View in Orca
Passed Passed Vulnerabilities high 0   medium 0   low 0   info 0 View in Orca

@skyzh skyzh requested a review from problame September 16, 2024 01:03
Copy link

4416 tests run: 4260 passed, 0 failed, 156 skipped (full report)


Flaky tests (3)

Postgres 17

Test coverage report is not available

The comment gets automatically updated with the latest test results
d374220 at 2024-09-16T01:54:22.896Z :recycle:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

gc-compaction: filter keys not in shard / not in keyspace
1 participant