Is your feature request related to a problem? Please describe
The current Kudu reader supports only one split strategy, SIMPLE_DIVIDE, which evenly divides an integer range into several sub-ranges and uses them as splits.
Document reference: connector-kudu
This split strategy has several shortcomings:
The user has to designate an integer-type column (int8, int16, int32, or int64) as the split dimension.
If the user does not know the lower and upper bounds, the reader scans the whole table to find the actual bounds.
It does not support null values in the split dimension.
This issue therefore asks for someone to optimize the current split strategy or to implement additional split strategies.
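For reference, the even division attributed to SIMPLE_DIVIDE above can be sketched roughly as follows. This is a minimal illustration, not the connector's actual code: the class name `SimpleDivideSketch`, the `Split` type, and the `divide` method are all hypothetical, and the bounds are assumed to be known and to fit in a `long` without overflow.

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleDivideSketch {

    /** Hypothetical split type: a closed integer range [lower, upper] for one reader task. */
    public static final class Split {
        public final long lower;
        public final long upper;

        public Split(long lower, long upper) {
            this.lower = lower;
            this.upper = upper;
        }

        @Override
        public String toString() {
            return "[" + lower + ", " + upper + "]";
        }
    }

    /**
     * Divides [lower, upper] into at most splitNum contiguous sub-ranges
     * whose sizes differ by at most one value.
     */
    public static List<Split> divide(long lower, long upper, int splitNum) {
        if (lower > upper || splitNum <= 0) {
            throw new IllegalArgumentException("invalid range or split count");
        }
        long total = upper - lower + 1;          // number of integer values (assumes no overflow)
        long count = Math.min(splitNum, total);  // never create empty splits
        long base = total / count;               // minimum size of each split
        long remainder = total % count;          // the first `remainder` splits get one extra value

        List<Split> splits = new ArrayList<>();
        long start = lower;
        for (long i = 0; i < count; i++) {
            long size = base + (i < remainder ? 1 : 0);
            splits.add(new Split(start, start + size - 1));
            start += size;
        }
        return splits;
    }

    public static void main(String[] args) {
        // prints [[0, 3], [4, 6], [7, 9]]
        System.out.println(divide(0, 9, 3));
    }
}
```

The sketch makes the limitations visible: it needs both bounds up front, it only works on integer values, and a row whose split column is null falls into none of the ranges.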
Describe the solution you'd like
Similar to KuduTableInputFormat in kudu-mapreduce, maybe we can let the user set serialized KuduPredicates directly in the configuration file.
KuduTable provides a List<Partition> getRangePartitions(long timeout) method, which returns all range partitions of the table. Maybe these range partitions can be used directly as splits.
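The range-partition idea could look roughly like the sketch below. It is self-contained and does not call the Kudu client: `RangeBounds` is a hypothetical holder for the already-decoded bounds of one range partition (in a real reader these would be derived from the partitions returned by getRangePartitions), a null bound stands for an unbounded side, and the filters are rendered as plain strings rather than KuduPredicate objects. It assumes Kudu's default range semantics of an inclusive lower bound and an exclusive upper bound.

```java
import java.util.ArrayList;
import java.util.List;

public class RangePartitionSplitSketch {

    /** Hypothetical decoded bounds of one range partition; null means unbounded. */
    public static final class RangeBounds {
        public final Long lowerInclusive;
        public final Long upperExclusive;

        public RangeBounds(Long lowerInclusive, Long upperExclusive) {
            this.lowerInclusive = lowerInclusive;
            this.upperExclusive = upperExclusive;
        }
    }

    /** Builds one readable filter per range partition, so each partition becomes one split. */
    public static List<String> toSplitFilters(String column, List<RangeBounds> partitions) {
        List<String> filters = new ArrayList<>();
        for (RangeBounds p : partitions) {
            List<String> parts = new ArrayList<>();
            if (p.lowerInclusive != null) {
                parts.add(column + " >= " + p.lowerInclusive);
            }
            if (p.upperExclusive != null) {
                parts.add(column + " < " + p.upperExclusive);
            }
            // A partition that is unbounded on both sides covers the whole table.
            filters.add(parts.isEmpty() ? "TRUE" : String.join(" AND ", parts));
        }
        return filters;
    }

    public static void main(String[] args) {
        List<RangeBounds> partitions = new ArrayList<>();
        partitions.add(new RangeBounds(null, 100L));
        partitions.add(new RangeBounds(100L, 200L));
        partitions.add(new RangeBounds(200L, null));
        // prints [id < 100, id >= 100 AND id < 200, id >= 200]
        System.out.println(toSplitFilters("id", partitions));
    }
}
```

With this approach the user would not need to supply bounds at all, and no full-table scan would be needed to discover them, since the split boundaries come from table metadata.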
Describe alternatives you've considered
Additional context