[EMPSRE-1029] Add options.storage for pre-disk content validation#3
Merged
Merged
Conversation
Adds a multer-1.x-style StorageEngine hook so consumers can take over how each file part is persisted — buffer the first 4 KB, run a magic-byte gate, then either pipe the rest to disk or destroy the part stream. No fs.createWriteStream is opened until validation passes. Without options.storage, behavior is identical to before (multiparty's autoFiles=true → built-in handleFile writes to uploadDir, 'file' event populates req.files). This is fully backward-compatible. With options.storage: - autoFields stays on so text fields still come through unchanged. - autoFiles is forced off so multiparty emits raw 'part' events. - For each part with a filename, we pre-populate the same publicFile shape consumers expect (fieldName, originalFilename, name, type, headers, size: 0) and hand it + the part stream to the storage callback. The storage engine is expected to fill publicFile.path and publicFile.size before calling cb(null), or destroy(part) and call cb(err) on rejection. - A per-request in-flight counter ensures connect-multiparty waits for every storage callback to resolve before calling next() — multiparty's 'close' event fires when part STREAMS end, but storage may still be flushing bytes to disk after that. 3 new tests cover the happy path, error-translates-to-400, and async storage that completes after multiparty's 'close'.
Document the maxFilesSize bypass
- multiparty enforces maxFilesSize via LimitStream inside handleFile.
With autoFiles=false (forced when options.storage is set) file parts
go to handlePart and the LimitStream is not inserted. Callers who
rely on the default 100 MB cap need to know that providing a storage
engine moves size enforcement into their engine. Documented in the
storage section of the docstring.
Drop parts that arrive after the request has settled
- A storage callback erroring sets done=true (via form.emit('error')
→ existing handler), but multiparty keeps parsing the drained
request and may emit more 'part' events for parts that hadn't been
parsed yet. Without a guard those kick off new storage work whose
results have nowhere to go. Same applies to events buffered past
finishParse(). Added an `if (done) { part.resume(); return; }` at
the top of the part handler.
Drop the dead !part.filename branch
- With autoFields=true / autoFiles=false multiparty routes filename-
less parts to handleField → emits 'field'; only file parts reach
the 'part' event. The defensive branch was unreachable. Removed.
Tighten cb(err) docstring
- Destroying the part stream doesn't release multiparty's parser
(parser waits for partStream.on('end'); destroy() doesn't emit
'end'). It is form.emit('error') in the wrapper that drives the
response via req.resume() + onFinished. The docstring now says
destroy is for resource hygiene; form.emit('error') is what releases
the parser.
Add a field-parts-with-storage test
- The PR docstring promises text fields still come through when
storage is set. Added a test that mixes .field('user', 'Tobi')
with .attach('resume', ...) and asserts both req.body.user and the
file's publicFile payload.
|
@dwinrick-lever LGTM. However, Claude flagged one small thing. |
…rash the process
Multiparty's handleFile attaches its own part.on('error') listener
(multiparty/index.js:696). handlePart — the path file parts take when
options.storage is set — does NOT. If the client aborts mid-upload,
multiparty's handleError calls errorEventQueue which directly does
eventEmitter.emit('error', err) on the part stream
(multiparty/index.js:644-645). An unhandled 'error' event crashes the
Node process.
The wrapper now installs a noop part.on('error') listener BEFORE
invoking the storage engine, so engines that forget to attach their
own listener are still safe. Storage engines that want to clean up
their own state on abort can still attach their own listeners on top
— Node EventEmitter allows multiple listeners. The form-level error
handler in the wrapper still drives the 400 response either way.
Also:
- Updated the three other new tests to attach part.on('error') so
they model the correct pattern for real-world storage engines.
- New regression test asserts the wrapper's listener is present
BEFORE the storage engine attaches anything of its own. Verified
by removing the fix and watching this test fail.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds a multer-1.x-style StorageEngine hook so consumers can take over how each file part is persisted — buffer the first few KB, run a magic-byte gate, then either pipe the rest to disk or destroy the part stream. No `fs.createWriteStream` is opened until validation passes.
This unblocks the postings2 side of EMPSRE-1029 (the hook-receiver side landed in lever/hook-receiver#842 via multer's native StorageEngine). postings2 uses connect-multiparty globally; with this option it can apply the same magic-byte gate that hook-receiver does, without an upstream multer migration.
Behavior
Without `options.storage` (default): identical to current behavior. multiparty's `autoFiles=true` shortcut opens the write stream and pipes to `uploadDir`; we emit `'file'` and populate `req.files`. Fully backward-compatible.
With `options.storage`:
```js
app.use(multipart({
uploadDir: '/tmp',
storage: function (req, part, publicFile, cb) {
// buffer first ~4 KB → validate → pipe to disk or destroy(part)
}
}))
```
Why the in-flight counter
multiparty's `'close'` event fires when every part STREAM has ended, but the storage engine may still be flushing bytes to disk (writeStream finish, S3 upload, etc.) after that. Without tracking pending storage callbacks, `next()` could fire before `publicFile.size` is final. The wrapper now waits for every storage `cb()` to resolve before finalizing.
Test plan