Introduce snapshots repository: integrate snapshot writing with S3 via object_store #2274
base: feat/automatic-snapshots-every-n-records
Introduce a new configurable number of records property after which a snapshot of the partition store will be automatically taken.
.with_credentials(Arc::new(AwsSdkCredentialsProvider {
    credentials_provider: DefaultCredentialsChain::builder().build().await,
}))
This makes `object_store` much nicer to use with S3; it natively integrates with IMDS out of the box but doesn't understand things like `AWS_PROFILE` and the dotfiles commonly used to configure the AWS CLI/SDKs. It also ensures dynamic refresh when e.g. an SSO session expires and the user renews it.
    staging_path: PathBuf,
}

impl SnapshotRepository {
In a follow-up PR, I will introduce the ability to list recent snapshots available from the repository.
}

#[async_trait]
impl object_store::CredentialProvider for AwsSdkCredentialsProvider {
This may be something useful to contribute back upstream to `object_store`.
…ots-every-n-records
451cb20
to
24facc8
1d97374
to
17e437f
Is this ready for review? (I noticed that I'm a reviewer and it's still a draft.)
@AhmedSoliman - early feedback welcome but only if you want to! I still need to rebase this on the latest #2253 so maybe don't until that's done. I'll remove the reviewers until it is.
This change introduces the `SnapshotRepository`, which encapsulates publishing the raw RocksDB column family exported data + custom metadata to a [potentially] remote destination. We add a dependency on `object_store`, which supports all major cloud providers' blob stores, plus local filesystem.

By default, the snapshot repository is the local `restate-data/pp-snapshots` directory, but a new optional config key allows this to be configured to an S3 URL. We can trivially support other destinations by enabling additional features in the `object_store` crate.

Sample snapshot publishing (triggered with `restatectl snap create-snapshot -p 1`):

The relevant configuration for this is:
The store layout is currently:
The key structure is: `[<prefix>/]<partition_id>/<sort_key>/<snapshot_id>_<lsn>.tar`.
s3://my-bucket/custom/cluster/prefix
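
The key structure above can be sketched as a pair of small helpers. This is only an illustration, not the code in this PR: `snapshot_key` and `split_s3_url` are hypothetical names, the parameter types are assumptions, and `<sort_key>` is treated as an opaque string because its format is not spelled out here.

```rust
/// Builds an object key of the form
/// `[<prefix>/]<partition_id>/<sort_key>/<snapshot_id>_<lsn>.tar`.
/// All parameter types are assumptions made for illustration.
fn snapshot_key(
    prefix: Option<&str>,
    partition_id: u64,
    sort_key: &str,
    snapshot_id: &str,
    lsn: u64,
) -> String {
    let relative = format!("{partition_id}/{sort_key}/{snapshot_id}_{lsn}.tar");
    // Trim slashes so a prefix such as "custom/cluster/prefix/" does not yield "//".
    match prefix.map(|p| p.trim_matches('/')).filter(|p| !p.is_empty()) {
        Some(p) => format!("{p}/{relative}"),
        None => relative,
    }
}

/// Splits an `s3://<bucket>[/<prefix>]` URL into the bucket name and an
/// optional key prefix. Returns `None` for anything that is not an S3 URL.
fn split_s3_url(url: &str) -> Option<(&str, Option<&str>)> {
    let rest = url.strip_prefix("s3://")?;
    let (bucket, prefix) = match rest.split_once('/') {
        Some((bucket, prefix)) => (bucket, prefix.trim_matches('/')),
        None => (rest, ""),
    };
    if bucket.is_empty() {
        return None;
    }
    Some((bucket, if prefix.is_empty() { None } else { Some(prefix) }))
}
```

With the sample destination `s3://my-bucket/custom/cluster/prefix`, `split_s3_url` yields the bucket `my-bucket` and the prefix `custom/cluster/prefix`, which `snapshot_key` then prepends to the per-partition part of the key.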
Open questions
Outstanding tasks
`metadata.json` object to the object store, separate from the tar file; I think this is a must as it will allow us to determine e.g. the version of the snapshot before we download the data. This gives us an easy migration path to support different data formats in the future.
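
To make the "check the version before downloading" idea concrete, here is a minimal sketch. Everything in it is hypothetical: the `SnapshotMetadata` struct, its `version` and `data_key` fields, and the `SUPPORTED_VERSIONS` list are assumptions for illustration, not the schema this PR will use.

```rust
/// Hypothetical contents of a `metadata.json` object; the field names are
/// assumptions for illustration, not the PR's actual schema.
#[derive(Debug, Clone, PartialEq)]
struct SnapshotMetadata {
    /// Format version of the snapshot data (assumed field).
    version: u32,
    /// Object key of the tar file this metadata describes (assumed field).
    data_key: String,
}

/// Versions this (hypothetical) reader knows how to restore.
const SUPPORTED_VERSIONS: &[u32] = &[1];

/// Decides from the metadata alone whether the tar file is worth fetching.
/// Returns the key to download, or an error message for unknown versions.
fn data_key_if_supported(meta: &SnapshotMetadata) -> Result<&str, String> {
    if SUPPORTED_VERSIONS.contains(&meta.version) {
        Ok(meta.data_key.as_str())
    } else {
        Err(format!("unsupported snapshot version {}", meta.version))
    }
}
```

Because the decision only needs the small metadata object, a reader could reject an unknown format without transferring the (potentially large) tar file.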