
KAFKA-18522: Slice records for share fetch #18804

Open
wants to merge 10 commits into base: trunk

Conversation

apoorvmittal10
Collaborator

The PR handles slicing of fetched records based on the acquire response for share fetch. Additional bytes can be fetched from the log while the acquired offsets are only a subset, typically with the max fetch records configuration. Rather than sending the additional fetched bytes to the client, we should slice the file and wire only the needed batches.

Note: If the acquired offsets fall within a batch then we need to send the entire batch within the file record. Hence, rather than checking individual batches, the PR finds the first and last acquired offsets and trims the file to all batches between (inclusive of) these two offsets.
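To make the slicing rule concrete, here is a minimal sketch (illustrative only, not the PR's code). It relies on FileRecords.slice(int, int) and the FileChannelRecordBatch accessors from the Kafka codebase, and it deliberately uses lastOffset() for readability even though, as discussed later in this review, the PR avoids that call:

import java.io.IOException;
import org.apache.kafka.common.record.FileLogInputStream.FileChannelRecordBatch;
import org.apache.kafka.common.record.FileRecords;
import org.apache.kafka.common.record.Records;

// Keeps every batch that overlaps [firstAcquiredOffset, lastAcquiredOffset], because a
// partially acquired batch must still be sent whole.
static Records sliceToAcquiredRange(FileRecords fileRecords,
                                    long firstAcquiredOffset,
                                    long lastAcquiredOffset) throws IOException {
    int startPosition = 0;
    int size = 0;
    for (FileChannelRecordBatch batch : fileRecords.batches()) {
        if (batch.lastOffset() < firstAcquiredOffset) {
            // Batch ends before the first acquired offset: skip its bytes entirely.
            startPosition += batch.sizeInBytes();
        } else if (batch.baseOffset() <= lastAcquiredOffset) {
            // Batch overlaps the acquired range: include it in full.
            size += batch.sizeInBytes();
        } else {
            // Past the last acquired offset: nothing further needs to go to the client.
            break;
        }
    }
    return fileRecords.slice(startPosition, size);
}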

@github-actions github-actions bot added triage PRs from the community core Kafka Broker KIP-932 Queues for Kafka build Gradle build or GitHub Actions labels Feb 4, 2025
@apoorvmittal10 apoorvmittal10 removed the triage PRs from the community label Feb 4, 2025

@Copilot Copilot AI left a comment


Copilot reviewed 5 out of 9 changed files in this pull request and generated 1 comment.

Files not reviewed (4)
  • gradle/spotbugs-exclude.xml: Language not supported
  • core/src/test/java/kafka/server/share/ShareFetchUtilsTest.java: Evaluated as low risk
  • core/src/test/java/kafka/server/share/SharePartitionManagerTest.java: Evaluated as low risk
  • core/src/test/java/kafka/server/share/DelayedShareFetchTest.java: Evaluated as low risk
Comments suppressed due to low confidence (3)

core/src/test/java/kafka/server/share/SharePartitionTest.java:5898

  • Consider overloading the fetchAcquiredRecords method instead of adding a boolean flag for subsetAcquired.
private List<AcquiredRecords> fetchAcquiredRecords(

core/src/test/java/kafka/server/share/SharePartitionTest.java:5914

  • Ensure that the memoryRecordsBuilder method from ShareFetchTestUtils is used consistently across all tests.
return memoryRecordsBuilder(numOfRecords, startOffset).build();

server/src/main/java/org/apache/kafka/server/share/fetch/ShareAcquiredRecords.java:45

  • Ensure that the subsetAcquired flag is properly tested to verify the slicing logic works correctly when this flag is true or false.
private final boolean subsetAcquired;
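
For context while reading the review, subsetAcquired is the flag the slicing path consults before doing any work (see maybeSliceFetchRecords later in this conversation). A rough sketch of how it could sit in ShareAcquiredRecords — the field and accessor names appear in the diff, while the constructor and other members shown here are assumptions:

import java.util.List;
import org.apache.kafka.common.message.ShareFetchResponseData.AcquiredRecords;

// Sketch only; the real class in the PR may carry additional state and factory methods.
public class ShareAcquiredRecords {
    private final List<AcquiredRecords> acquiredRecords;
    // True when the acquired offsets are only a subset of the fetched offsets, i.e. slicing
    // the fetched records can save bytes on the wire.
    private final boolean subsetAcquired;

    public ShareAcquiredRecords(List<AcquiredRecords> acquiredRecords, boolean subsetAcquired) {
        this.acquiredRecords = acquiredRecords;
        this.subsetAcquired = subsetAcquired;
    }

    public List<AcquiredRecords> acquiredRecords() {
        return acquiredRecords;
    }

    public boolean subsetAcquired() {
        return subsetAcquired;
    }
}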


@Copilot Copilot AI left a comment


Copilot reviewed 5 out of 9 changed files in this pull request and generated 1 comment.

Files not reviewed (4)
  • gradle/spotbugs-exclude.xml: Language not supported
  • core/src/main/java/kafka/server/share/ShareFetchUtils.java: Evaluated as low risk
  • core/src/main/java/kafka/server/share/SharePartition.java: Evaluated as low risk
  • core/src/test/java/kafka/server/share/DelayedShareFetchTest.java: Evaluated as low risk

Contributor

@clolov clolov left a comment


Carried out an initial pass 😊, thank you, this has been an interesting read!

@apoorvmittal10
Collaborator Author

Carried out an initial pass 😊, thank you, this has been an interesting read!

Thanks for taking a look, indeed it's an interesting change to optimize the transferred bytes.

Contributor

@junrao junrao left a comment


@apoorvmittal10 : Thanks for the PR. Left a few comments.

* of the fetched records. Otherwise, the original records are returned.
*/
static Records maybeSliceFetchRecords(Records records, ShareAcquiredRecords shareAcquiredRecords) {
if (!shareAcquiredRecords.subsetAcquired() || !(records instanceof FileRecords fileRecords)) {
Contributor


Hmm, with remote storage, it's possible for records to be MemoryRecords. It would be useful to slice those too.

Collaborator Author


I see. I do not see any specific slicing API in MemoryRecords. Do you think I should add one, or does some way already exist?

Contributor


It seems there is no slicing API in memory records. So, we will need to add one.

Collaborator Author

@apoorvmittal10 apoorvmittal10 Feb 10, 2025


Should I do it in this PR itself or in another PR/task? Otherwise this PR will get too long, as it would include the new API in MemoryRecords and the respective individual tests while integrating it here. I would prefer to do it separately.
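
Purely for illustration of what such an API could look like (no slicing API existed on MemoryRecords at the time of this thread; the helper name and approach below are assumptions, not part of this PR): the backing buffer can be trimmed by summing batch sizes, mirroring the FileRecords approach.

import java.nio.ByteBuffer;
import org.apache.kafka.common.record.MemoryRecords;
import org.apache.kafka.common.record.MutableRecordBatch;

// Hypothetical helper, not an existing Kafka API: trims a MemoryRecords view down to the
// complete batches that overlap the acquired offset range.
static MemoryRecords sliceMemoryRecords(MemoryRecords records,
                                        long firstAcquiredOffset,
                                        long lastAcquiredOffset) {
    int startDelta = 0;
    int size = 0;
    for (MutableRecordBatch batch : records.batches()) {
        if (batch.lastOffset() < firstAcquiredOffset) {
            // Batch is entirely before the acquired range: skip its bytes.
            startDelta += batch.sizeInBytes();
        } else if (batch.baseOffset() <= lastAcquiredOffset) {
            // Batch overlaps the acquired range: keep it whole.
            size += batch.sizeInBytes();
        } else {
            break;
        }
    }
    ByteBuffer buffer = records.buffer();           // returns a duplicate of the backing buffer
    int base = buffer.position();
    buffer.position(base + startDelta);
    buffer.limit(base + startDelta + size);
    return MemoryRecords.readableRecords(buffer.slice());
}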

for (FileChannelRecordBatch batch : fileRecords.batches()) {
// If the batch base offset is less than the first acquired offset, then the start position
// should be updated to skip the batch.
if (batch.baseOffset() < firstAcquiredOffset) {
Contributor


Hmm, not sure why we need to maintain previousBatch below. Could we just set startPosition when batch.lastOffset() is >= firstAcquiredOffset for the first time?

Collaborator Author

@apoorvmittal10 apoorvmittal10 Feb 7, 2025


Yes, you are correct, that would have been the easiest way. But since lastOffset() on a batch loads the headers, I have avoided that call by maintaining previousBatch.

Contributor


Thanks for the explanation. Got it.

I was thinking about whether we could make the code a bit easier to understand. Specifically, rename previousBatch to something like mayOverlapBatch. Instead of first increasing startPosition and later decreasing it, we only increase startPosition when we are sure mayOverlapBatch indeed overlaps in the next iteration.

Collaborator Author


@junrao I re-thought the solution and tried to simplify it, also getting rid of the lastOffset() method call completely. See if it makes better sense now. I have also skipped any calculation when there is only a single fetched batch; in that case no slicing should be considered, as the acquired records will always be within the fetched data.
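
For readers following this thread, a hedged sketch of the kind of simplification described here — track the last batch whose base offset does not exceed the first acquired offset, so lastOffset() (which would force a header read) is never called. The names are illustrative and this is not necessarily the PR's final code (imports as in the earlier FileRecords sketch, plus java.util.Iterator):

// Simplified slicing loop without any lastOffset() calls on FileChannelRecordBatch.
// mayOverlapBatch is the last batch seen whose baseOffset <= firstAcquiredOffset; it is the
// only batch that can contain the first acquired offset, so everything before it is skipped.
static Records sliceAvoidingLastOffset(FileRecords fileRecords,
                                       long firstAcquiredOffset,
                                       long lastAcquiredOffset) throws IOException {
    Iterator<FileChannelRecordBatch> iterator = fileRecords.batches().iterator();
    FileChannelRecordBatch mayOverlapBatch = iterator.next();
    // Single fetched batch: the acquired offsets must lie within it, so skip slicing entirely.
    if (!iterator.hasNext()) {
        return fileRecords;
    }
    int startPosition = 0;
    int size = 0;
    while (iterator.hasNext()) {
        FileChannelRecordBatch batch = iterator.next();
        if (batch.baseOffset() <= firstAcquiredOffset) {
            // The current candidate ends before this batch starts, hence before the first
            // acquired offset: skip its bytes and make this batch the new candidate.
            startPosition += mayOverlapBatch.sizeInBytes();
            mayOverlapBatch = batch;
        } else if (batch.baseOffset() <= lastAcquiredOffset) {
            // Batch starts inside the acquired range: keep it whole.
            size += batch.sizeInBytes();
        } else {
            break;
        }
    }
    // The surviving candidate is the batch that overlaps the first acquired offset, so it is
    // always included.
    size += mayOverlapBatch.sizeInBytes();
    return fileRecords.slice(startPosition, size);
}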

core/src/main/java/kafka/server/share/SharePartition.java: outdated review comment, resolved.
assertEquals(7, recordBatches.get(0).baseOffset());
assertEquals(10, recordBatches.get(0).lastOffset());

// Acquire including gaps between batches, should return 2 batches.
Contributor


Hmm, there are no gaps between batches, right?

Collaborator Author


The gap is at offsets 5 and 6, hence the check just validates that there is no issue when acquiring near the gap boundaries.

Member

@AndrewJSchofield AndrewJSchofield left a comment


This is not an area of code that I feel qualified to review authoritatively. However, I see that the changes seem compatible with my thoughts about how to handle isolation level when we tackle that feature.

@apoorvmittal10 apoorvmittal10 requested a review from junrao February 7, 2025 16:44
@apoorvmittal10
Collaborator Author

Thanks for the review @junrao, I have addressed and replied to the comments. I also have one doubt regarding MemoryRecords; please guide me if you can.

Contributor

@junrao junrao left a comment


@apoorvmittal10 : Thanks for the updated PR. A few more comments.


@apoorvmittal10
Collaborator Author

Hi @junrao, can you please re-review the simplified solution.

Labels
build Gradle build or GitHub Actions ci-approved core Kafka Broker KIP-932 Queues for Kafka