HBASE-30038: RefCnt Leak error when caching#7995
HBASE-30038: RefCnt Leak error when caching#7995dParikesit wants to merge 5 commits intoapache:masterfrom
Conversation
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
Outdated
Show resolved
Hide resolved
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
Outdated
Show resolved
Hide resolved
|
Please consider @vaijosh suggestions and also fix spotless issues. |
|
Thanks for the suggestions. I'll get back to you after I fix it. |
|
Any news on this, @dParikesit ? Please get the reviews addressed so that we can move forward here, as this is an important fix you have found. |
|
@wchevreuil thanks for the reminder. I have modified the tests to use retry-loop instead of sleep (including the test added in HBASE-28890 ). I've tested it on my machine. Can you help trigger the CI pipeline? |
| for (int i = 0; i < 15 && counter.get() == 0; i++) { | ||
| System.gc(); | ||
| alloc.allocate(128 * 1024).release(); | ||
| } |
There was a problem hiding this comment.
@dParikesit
You have not added sleep inside the loop so it will finish quicky and deafeat the purpose.
Please keep it something like below
for (int i = 0; i < 30 && counter.get() == 0; i++) {
System.gc();
alloc.allocate(128 * 1024).release();
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
break;
}
}
There was a problem hiding this comment.
Thanks for the feedback! I believe the sleep should be done after we run allocate.release() right? I have updated the commit to change it.
| for (int i = 0; i < 15 && counter.get() == 0; i++) { | ||
| System.gc(); | ||
| allocator.allocate(128 * 1024).release(); | ||
| } |
There was a problem hiding this comment.
@dParikesit
Please add Put slip inside the loop and make 30 retries instead of 15
vaijosh
left a comment
There was a problem hiding this comment.
Thanks @dParikesit.
Changes LTGM
JIRA: HBASE-30038
This bug is similar to HBASE-28890
HFileBlock.Writer.getBlockForCaching() creates a ref-counted HFileBlock, and the caller must release its own reference after the cache has taken the ownership it needs.
In HFileWriterImpl.java, the leak happened when shouldCacheBlock() returned before cacheFormatBlock.release() ran.
In HFileBlockIndex.java, the intermediate-index path cached blockForCaching but never released the reference at all.
Because these blocks can be backed by pooled off-heap ByteBuffs, repeated HFile writes could steadily drain allocator buffers and effectively leak memory, even though the blocks were only meant to live long enough to be considered for cache-on-write.