Skip to content

refactor(server): inline file hashing via Transform stream#27280

Open
MartB wants to merge 2 commits intoimmich-app:mainfrom
MartB:main
Open

refactor(server): inline file hashing via Transform stream#27280
MartB wants to merge 2 commits intoimmich-app:mainfrom
MartB:main

Conversation

@MartB
Copy link
Copy Markdown

@MartB MartB commented Mar 25, 2026

Description

This change refactors the original implementation to avoid an anti-pattern and improve back pressure handling This assumption was wrong see comments below.

Fixes nothing so far

How Has This Been Tested?

  • Test A
  • Test B

Screenshots (if appropriate)

Checklist:

  • I have carefully read CONTRIBUTING.md
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation if applicable
  • I have no unrelated changes in the PR.
  • I have confirmed that any new dependencies are strictly necessary.
  • I have written tests for new code (if applicable)
  • I have followed naming conventions/patterns in the surrounding code
  • All code in src/services/ uses repositories implementations for database calls, filesystem operations, etc.
  • All code in src/repositories/ is pretty basic/simple and does not have any immich specific logic (that belongs in src/services/)

Please describe to which degree, if any, an LLM was used in creating this pull request.

None

@github-actions
Copy link
Copy Markdown
Contributor

This PR has been automatically closed as the description doesn't follow our template. After you edit it to match the template, the PR will automatically be reopened.

@github-actions github-actions bot closed this Mar 25, 2026
@MartB MartB changed the title refactor(server/upload): use Transform stream for file hashing to fix backpressure refactor(server/upload): use transform stream for file hashing to fix backpressure Mar 25, 2026
@MartB
Copy link
Copy Markdown
Author

MartB commented Mar 25, 2026

@danieldietzler Thats what i get for not sticking to the template even for simple edits i guess! Is the reopen broken or did i miss something?

@danieldietzler danieldietzler requested a review from jrasm91 March 25, 2026 23:15
@jrasm91 jrasm91 changed the title refactor(server/upload): use transform stream for file hashing to fix backpressure refactor(server): use transform stream for file hashing to fix backpressure Mar 25, 2026
Copy link
Copy Markdown
Member

@mertalev mertalev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate on what the anti-pattern here is and what this PR actually improves? I just see a Transform abstraction added here that doesn't seem to do anything beneficial compared to the original. I had to look up what it does, which isn't really a point in favor of the change.

Edit: looking at it more closely, there's no backpressure benefit here either. The hashing is synchronous in a single-threaded language; backpressure doesn't apply. The Transform is just indirection.

@MartB
Copy link
Copy Markdown
Author

MartB commented Mar 26, 2026

@mertalev
I dug into this again and you're correct that backpressure was never broken here because pipeline pauses the underlying stream (and its 'data' events) automatically. I misinterpreted the node stream documentation thinking .pipeline could not pause if an .on listener was attached. The commit message is definitely misleading.

However the real reason this was spotted initially, is because it mixes automatic (pipeline) with manual (.on(“data”)) code. So we are left with "just" a code quality / hardening improvement.

@MartB MartB changed the title refactor(server): use transform stream for file hashing to fix backpressure refactor(server): inline file hashing via Transform stream Mar 26, 2026
@jrasm91 jrasm91 enabled auto-merge (squash) March 26, 2026 18:37
@uhthomas
Copy link
Copy Markdown
Collaborator

Yeah I thought about this too. Using a transform stream might be clearer maybe in the order of operations of the pipeline, but this isn't a bug fix and it doesn't make the behaviour any better or worse. It's just stylistic, which is why I didn't do it.

Copy link
Copy Markdown
Collaborator

@uhthomas uhthomas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to do this, let's flesh it out more. The transform stream should destroy the hash, not the pipeline.

@uhthomas uhthomas disabled auto-merge March 26, 2026 19:20
@uhthomas
Copy link
Copy Markdown
Collaborator

uhthomas commented Mar 26, 2026

Let's also maybe rename it? It's not just hashing, it's also tracking the size of the file.

@uhthomas
Copy link
Copy Markdown
Collaborator

The other trouble with using transforms is that there is an opportunity to break it. The event listener version is purely read-only, and the code makes that intention clear.

@mertalev
Copy link
Copy Markdown
Member

Yeah, it's kind of like calling array.map, doing a side effect and returning the input.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants