StatusCommand: Optimize git status by deferring getObjectId() calls#9304
Conversation
|
the description makes sense to me. Unfortunately this causes several test failures which I could also reproduce locally (same module). Those would have to be investigated first before this can proceed. |
|
@mbien , it's getting a bit more complicated :) I wonder if the tests are really valid and it's required that files need to be present in the resulting map if they are not changed. The Commit dialog always calls the StatusCommand for a directory, not for a single file. For now, to please the tests, and potentially some real functionality that needs to get status for a single file, I modified the solution (in the second commit) to apply only for directories (where it omits the unchanged files from the statuses map) but it keeps the file in the statuses map if called for a single file. A better approach would be to preload file status and parallelize computing hashes with fti.isModified(indexEntry, true,...), so that it doesn't block main thread. This, however, requires a lot of refactoring, so let's try a simple approach first. I asked AI to analyze how GIT does it, and it said that it does it in a similar way - first preloads status info from filesystem in parallel, computes hashes from content if needed, and then goes through the precomputed information to build the result. P.S. I'm adding commits, without squashing, to keep the history until find the best solution. Then I'll squash commits. |
52ee181 to
d72d934
Compare
|
@mbien , I dug deeper and found out that there are better ways to optimize the current algorithm, without skipping any files. I started from scratch and managed to reduce the time spent in the This reduces the whole time it takes to compute files statuses on the Netbeans repo in the Commit dialog from about 17s to about 14s. And I have even better news - I managed to speed up another part of the |
|
Here's the other PR: #9324 |
|
@OndroMih could you do a so that this is getting tested on top of your other PR? |
d72d934 to
068b48c
Compare
|
Done, @mbien . |
| @@ -200,7 +208,6 @@ | |||
| } else { | |||
| continue; | |||
| } | |||
| } | |||
There was a problem hiding this comment.
i believe this modification has the exact same amount of isEntryIgnored() calls like before, no?
if nWorking == TREE -> 1 call in both cases and if its not a tree -> 0 calls in both cases.
I reverted this section and left the supplier modification and got similar results (ran the action 5 times each)
reverted:
2184, 1942, 1870, 1820, 1821
PR:
2196, 1873, 1864, 1865, 1787
There was a problem hiding this comment.
Yes, you are right. I thought that this optimizes the number of calls to isEntryIgnored() but it's still being called later. To be honest, here I used some AI to help me understand what's the code doing. AI suggested this optimization and it looked OK to me, but now I see that it only shuffled the order of execution and didn't optimize anything.
I'll revert these changes.
There was a problem hiding this comment.
Reverted.
The only real thing that remains is the lazy evaluation of object ids which is the only thing that speeds the execution.
There was a problem hiding this comment.
AI suggested this optimization and it looked OK to me, but now I see that it only shuffled the order of execution and didn't optimize anything.
thanks for reverting. I suspected some form of LLM involvement when I saw that.
will merge later once CI is green so that this is getting in for NB 30
Speeds up GitClient.getStatus() by deferring expensive evaluation of object Ids, which often compute file content hash, to evaluate them lazily only when needed. On Netbeans repository with a lot of files, this speeds up GitClient.getStatus() execution from 4 seconds to 1 second.
068b48c to
d98f086
Compare
Speeds up GitClient.getStatus() by deferring expensive evaluation of object Ids, which often compute file content hash, to evaluate them lazily only when needed.
On NetBeans repository with a lot of files, this speeds up GitClient.getStatus() execution from 4 seconds to 1 second.
PR approval and merge checklist:
If this PR targets the delivery branch: don't merge. (full wiki article)