Feature Request: Allow configurable shard range format thresholds for more flexible scaling #17688
Labels
Component: Cluster management
Type: Enhancement
Logical improvement (somewhere between a bug and feature)
Feature Description
key.GenerateShardRanges uses 2 hex digits in shard ranges when there are 256 or fewer shards.
vitess/go/vt/key/key.go
Lines 387 to 394 in 30c09f5
Make the transition threshold (from 2-digit hex to 4-digit hex formatting) configurable, along with the needed changes in vitess-operator.
This would allow a keyspace at 128 shards to expand incrementally to a non power-of-two number (e.g., 160) that fits operational needs without the overhead of jumping all the way to 256 shards (the next power-of-two number after 128).
It would also prevent the uneven distribution that two-digit hex ranges produce when the shard count is not a power of two.
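To make the request concrete, here is a minimal sketch of a range generator with a configurable threshold. It is not the code from key.go: the name `generateShardRanges`, the `twoDigitMax` parameter, and the integer-floor boundary arithmetic are all assumptions made for illustration. Passing 256 gives the behaviour described above (2 hex digits at 256 or fewer shards), and passing 128 gives the lower threshold proposed here.

```go
package main

import (
	"errors"
	"fmt"
)

// generateShardRanges is a HYPOTHETICAL variant of a shard range generator.
// twoDigitMax is the assumed configurable threshold: shard counts up to and
// including it use 2 hex digits, larger counts use 4 hex digits.
func generateShardRanges(shards, twoDigitMax int) ([]string, error) {
	if shards <= 0 || shards > 65536 {
		return nil, errors.New("shards must be between 1 and 65536")
	}

	// Default to 4 hex digits (65536 slots); fall back to 2 hex digits
	// (256 slots) only at or below the threshold.
	format, slots := "%04x", 65536
	if shards <= twoDigitMax && shards <= 256 {
		format, slots = "%02x", 256
	}

	// boundary(i) is the integer floor of i*slots/shards (assumed rounding).
	boundary := func(i int) int {
		return int(int64(i) * int64(slots) / int64(shards))
	}

	ranges := make([]string, 0, shards)
	for i := 0; i < shards; i++ {
		var start, end string
		if i > 0 {
			start = fmt.Sprintf(format, boundary(i))
		}
		if i < shards-1 {
			end = fmt.Sprintf(format, boundary(i+1))
		}
		// The first range has an empty start and the last an empty end.
		ranges = append(ranges, start+"-"+end)
	}
	return ranges, nil
}

func main() {
	for _, threshold := range []int{256, 128} {
		ranges, err := generateShardRanges(160, threshold)
		if err != nil {
			panic(err)
		}
		fmt.Printf("threshold %d: %v ... %v\n", threshold, ranges[:3], ranges[len(ranges)-1])
	}
}
```

With 160 shards, a threshold of 256 yields ranges such as `-01`, `01-03`, `03-04`, while a threshold of 128 yields `-0199`, `0199-0333`, and so on.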
Use Case(s)
We currently operate multiple sharded keyspaces where each keyspace uses a power-of-two number of shards. This helps avoid uneven distribution (or "banding") in the shard ranges. However, once a keyspace reaches 128 shards, the next power-of-two step is 256 shards—which can be excessive and wasteful if the keyspace truly only needs something like 160 shards.
We have observed that using two-digit hexadecimal ranges (%02x) with a non-power-of-two shard count can cause distribution banding.
In smaller keyspaces, such as one with 10 shards, the banding isn't that noticeable, but it gets worse as the number of shards in the keyspace increases. Switching to four-digit hexadecimal ranges (%04x) eliminates this banding, but that currently only happens when the shard count exceeds 256. If we lower the threshold to 128, we could comfortably scale from 128 to, say, 160 shards using four-digit ranges without incurring uneven data distribution.
We tested this internally and observed the benefits of using 4-digit ranges vs 2-digit ranges for 160 shards.
160 shards created with 2-digit shard ranges showing banding
160 shards created with 4-digit shard ranges showing no banding
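The banding can also be estimated arithmetically. The sketch below assumes each shard boundary is the integer floor of i * slots / shards, where slots is 256 for 2 hex digits and 65536 for 4. That rounding rule and the helper name `widthSpread` are illustrative assumptions, not taken from key.go. Under that assumption, 160 shards on 2 digits get widths of 1 or 2 slots, so some shards cover twice the key range of others, while on 4 digits the widths are 409 or 410 slots.

```go
package main

import "fmt"

// widthSpread returns the smallest and largest shard width, measured in
// slots, when `shards` shards split a key space of `slots` slots and each
// boundary is floor(i*slots/shards). Hypothetical helper for illustration.
func widthSpread(shards, slots int) (lo, hi int) {
	lo, hi = slots, 0
	prev := 0
	for i := 1; i <= shards; i++ {
		end := int(int64(i) * int64(slots) / int64(shards))
		width := end - prev
		if width < lo {
			lo = width
		}
		if width > hi {
			hi = width
		}
		prev = end
	}
	return lo, hi
}

func main() {
	cases := []struct{ shards, slots int }{
		{10, 256},    // small keyspace, 2 hex digits
		{128, 256},   // power of two, 2 hex digits
		{160, 256},   // not a power of two, 2 hex digits
		{160, 65536}, // not a power of two, 4 hex digits
	}
	for _, c := range cases {
		lo, hi := widthSpread(c.shards, c.slots)
		fmt.Printf("%3d shards over %5d slots: widths %d to %d\n", c.shards, c.slots, lo, hi)
	}
}
```

The 10-shard case gives widths of 25 or 26 slots, which is consistent with banding being hard to notice in small keyspaces.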
This Feature Request along with #15744 would provide the utmost flexibility.