
Optimize batch-read to avoid wasted disk IO#4741

Open
dao-jun wants to merge 4 commits into apache:master from dao-jun:dev/optimize_batch_read

Conversation

@dao-jun
Member

@dao-jun dao-jun commented Apr 9, 2026

Motivation

Batch reads currently fetch the next entry before checking whether it can still fit within the maxSize budget. When the last entry in the batch exceeds the remaining space, the read result is discarded, but the disk IO has already happened. This creates wasted IO on the critical read path.

This change avoids that extra read while preserving the existing batch-read behavior, including returning the first entry even when a single entry alone exceeds the requested maxSize.
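The loop described above can be sketched as follows. This is a simplified model under stated assumptions: `entriesThatFit` and the plain list of entry sizes are illustrative, not the actual `BatchedReadEntryProcessor` API, which operates on `ByteBuf` entries read from storage. The 4-byte per-entry delimiter counted against the budget matches the framing semantics noted in the Changes section.

```java
import java.util.List;

// Sketch of the budget-aware batch-read loop (hypothetical model, not BookKeeper code).
public final class BatchReadBudget {
    static final int DELIMITER_BYTES = 4; // per-entry frame delimiter counts against the budget

    // Returns how many entries from entrySizes fit into maxSize.
    // The first entry is always returned, even when it alone exceeds maxSize.
    static int entriesThatFit(List<Integer> entrySizes, long maxSize) {
        if (entrySizes.isEmpty()) {
            return 0;
        }
        int count = 1; // first entry is read unconditionally, preserving existing behavior
        long remaining = maxSize - (entrySizes.get(0) + DELIMITER_BYTES);
        for (int i = 1; i < entrySizes.size(); i++) {
            long cost = entrySizes.get(i) + DELIMITER_BYTES;
            if (cost > remaining) {
                break; // stop BEFORE issuing the read: no wasted disk IO
            }
            remaining -= cost;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // First entry larger than maxSize: still returned.
        System.out.println(entriesThatFit(List.of(100, 10), 50)); // 1
        // Exact-fit remaining budget: both entries returned.
        System.out.println(entriesThatFit(List.of(10, 10), 28));  // 2
        // Oversized subsequent entry stops the batch early.
        System.out.println(entriesThatFit(List.of(10, 100), 50)); // 1
    }
}
```

The key difference from the old behavior is that the fit check moves before the read, so an entry that cannot fit is never fetched from disk in the first place.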

Changes

  • Add a bounded-read path (readEntryIfFits / getEntryIfFits) through the batch-read stack, from BatchedReadEntryProcessor down to Bookie, LedgerDescriptor, LedgerStorage, and EntryLogger.
  • Update batch-read logic to:
    • read the first entry as before
    • compute the remaining response budget for subsequent entries
    • stop without reading the next entry when it cannot fit
  • Implement size-aware entry reads in both DefaultEntryLogger and DirectEntryLogger by checking entry size from metadata before loading the full payload.
  • Wire the bounded-read path through all storage implementations, including:
    • InterleavedLedgerStorage
    • SortedLedgerStorage
    • DbLedgerStorage / SingleDirectoryDbLedgerStorage
  • Keep framing semantics consistent by treating the per-entry 4-byte delimiter as part of the size budget.
  • Update mock batch-read behavior to match the production semantics.
  • Add regression and boundary tests covering:
    • first entry larger than maxSize
    • exact-fit remaining budget
    • oversized subsequent entries
    • default entry logger and direct entry logger paths
    • interleaved, sorted, and DB ledger storage paths
    • mock and client-facing batch-read behavior
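The `readEntryIfFits` / `getEntryIfFits` idea in the entry loggers can be sketched like this. It is a hypothetical, self-contained model: a `ByteBuffer` stands in for the on-disk entry log, and the entry layout (4-byte length prefix followed by payload) is an assumption for illustration; the real `DefaultEntryLogger` and `DirectEntryLogger` consult entry metadata before loading the payload from disk.

```java
import java.nio.ByteBuffer;
import java.util.Optional;

// Hypothetical sketch of a size-aware entry read: peek at the entry's size
// metadata first, and skip loading the payload when it cannot fit the budget.
public final class SizeAwareRead {
    // Model layout: 4-byte length prefix followed by payload bytes.
    static Optional<byte[]> readEntryIfFits(ByteBuffer log, int offset, long maxSize) {
        int size = log.getInt(offset); // metadata-only read: just the length prefix
        if (size > maxSize) {
            return Optional.empty();   // payload is never loaded -> no wasted IO
        }
        byte[] payload = new byte[size];
        ByteBuffer dup = log.duplicate();
        dup.position(offset + Integer.BYTES);
        dup.get(payload);
        return Optional.of(payload);
    }

    public static void main(String[] args) {
        ByteBuffer log = ByteBuffer.allocate(16);
        log.putInt(5);
        log.put(new byte[] {1, 2, 3, 4, 5});
        System.out.println(readEntryIfFits(log, 0, 10).isPresent()); // true
        System.out.println(readEntryIfFits(log, 0, 4).isPresent());  // false
    }
}
```

Returning an empty result (rather than throwing) lets the caller distinguish "entry does not fit" from a read failure and end the batch cleanly.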

To uphold a high standard of quality for code contributions, Apache BookKeeper runs various precommit
checks on pull requests. A pull request can only be merged once it passes the precommit checks.


Be sure to do all the following to help us incorporate your contribution
quickly and easily:

If this PR is a BookKeeper Proposal (BP):

  • Make sure the PR title is formatted like:
    <BP-#>: Description of bookkeeper proposal
    e.g. BP-1: 64-bit ledger support
  • Attach the master issue link in the description of this PR.
  • Attach the google doc link if the BP is written in Google Doc.

Otherwise:

  • Make sure the PR title is formatted like:
    <Issue #>: Description of pull request
    e.g. Issue 123: Description ...
  • Make sure tests pass via mvn clean apache-rat:check install spotbugs:check.
  • Replace <Issue #> in the title with the actual Issue number.

@dao-jun dao-jun closed this Apr 9, 2026
@dao-jun dao-jun reopened this Apr 9, 2026
Contributor

@eolivelli eolivelli left a comment


I can't find tests for failure scenarios.

Memory leaks can happen in case of failures (e.g. while reading from storage), and the client may see the wrong error code.

@dao-jun
Member Author

dao-jun commented Apr 13, 2026

@eolivelli I added 3 tests in BatchedReadEntryProcessorTest, PTAL

@dao-jun dao-jun requested a review from eolivelli April 13, 2026 13:04