Streaming Middleware in Node.js: Transform Large HTTP Responses Without Buffering

Most Node.js middleware assumes buffered bodies — great for JSON APIs, but terrible for performance when working with large files, proxied responses, or real-time content.

In this article, you’ll learn how to build streaming middleware in Node.js that transforms data on the fly, without ever buffering the full response in memory. It’s ideal for:

  • HTML injection without latency
  • JSON rewriting in proxies
  • Compression/encryption on the fly
  • Streaming large logs/files through a filter

Step 1: Understand the Problem with Traditional Middleware

Common middleware such as body-parser, and most response-rewriting helpers for Express, assume the request or response is fully buffered in memory:


// This buffers the entire response before anything is sent
// (note: a response never emits 'data'/'end' itself; buffering
// rewriters intercept write/end instead)
app.use((req, res, next) => {
  const chunks = [];
  const originalEnd = res.end;
  res.write = (chunk) => { chunks.push(Buffer.from(chunk)); return true; };
  res.end = (chunk) => {
    if (chunk) chunks.push(Buffer.from(chunk));
    const body = Buffer.concat(chunks).toString();
    // modify body here, then send everything at once (too late for streaming)
    originalEnd.call(res, body);
  };
  next();
});

This doesn't scale. For large files or real-time proxies, we want transformations mid-stream, before the full body is received.
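
To see the mid-stream idea in isolation, here is a minimal sketch (the route and file path are placeholders) that streams a large file through a Transform with constant memory use:


const fs = require('fs');
const { pipeline, Transform } = require('stream');

app.get('/big.txt', (req, res) => {
  // one transform per request: stream instances are single-use
  const upper = new Transform({
    transform(chunk, encoding, callback) {
      callback(null, chunk.toString().toUpperCase());
    }
  });

  // pipeline wires the streams together and cleans up on error
  pipeline(fs.createReadStream('./big.txt'), upper, res, (err) => {
    if (err) res.destroy(err);
  });
});

Memory stays flat no matter how large the file is; only a handful of chunks are in flight at any moment.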

Step 2: Use on-headers to Hook Into Streaming Response

We’ll write a middleware that replaces res.write and res.end with our own streaming pipeline, and uses on-headers to adjust the response headers just before they’re sent (the transform changes the body’s length, so a stale Content-Length has to go).

Install on-headers:


npm install on-headers

Then create a middleware like this:


const onHeaders = require('on-headers');
const { Transform } = require('stream');

function streamingTransformMiddleware(rewriteFn) {
  return (req, res, next) => {
    const originalWrite = res.write.bind(res);
    const originalEnd = res.end.bind(res);

    const transformStream = new Transform({
      transform(chunk, encoding, callback) {
        const output = rewriteFn(chunk.toString());
        callback(null, output);
      }
    });

    // Forward transformed chunks to the real response
    transformStream.on('data', (chunk) => originalWrite(chunk));
    transformStream.on('end', () => originalEnd());

    // The transform changes the body length, so drop any
    // precomputed Content-Length just before headers are sent
    onHeaders(res, () => {
      res.removeHeader('Content-Length');
    });

    // Swap write/end up front, so even a body passed directly
    // to res.end() goes through the transform
    res.write = (...args) => transformStream.write(...args);
    res.end = (...args) => {
      transformStream.end(...args);
      return res;
    };

    next();
  };
}

Every call to res.write (and the final res.end) now flows through the transform chunk by chunk, and the rewritten output is forwarded to the client as soon as it’s ready.

Step 3: Use It in Your Express App

Let’s apply a simple example: rewrite every instance of “dog” to “cat” in streamed HTML:


app.use(streamingTransformMiddleware((chunk) => {
  return chunk.replace(/dog/g, 'cat');
}));
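
A quick way to watch this happen mid-stream (assuming the middleware above is registered first) is a route that writes its body in delayed chunks:


app.get('/demo', (req, res) => {
  res.setHeader('Content-Type', 'text/plain');
  res.write('a dog walks in\n');                        // client sees "a cat walks in"
  setTimeout(() => res.write('another dog follows\n'), 100);
  setTimeout(() => res.end('the dog leaves\n'), 200);
});

Each line reaches the client the moment it’s written, already rewritten.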

Now this middleware modifies every chunk as it’s streamed to the client: no full buffer, no added latency. One caveat: a match that straddles a chunk boundary (say, "do" at the end of one chunk and "g" at the start of the next) is missed, because each chunk is rewritten in isolation. A sketch of one fix follows.
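
The usual fix is to hold back the last few characters of each rewritten chunk and prepend them to the next one. boundarySafeRewrite and maxPatternLength below are names made up for this sketch, and it assumes the replacement text can’t itself match the pattern (true for dog → cat):


const { Transform } = require('stream');

function boundarySafeRewrite(rewriteFn, maxPatternLength = 16) {
  const hold = maxPatternLength - 1; // chars to hold back (assumes maxPatternLength >= 2)
  let tail = '';
  return new Transform({
    transform(chunk, encoding, callback) {
      // note: for multi-byte encodings, decode with string_decoder
      // instead of toString() to avoid splitting characters
      const rewritten = rewriteFn(tail + chunk.toString());
      tail = rewritten.slice(-hold);             // may end mid-pattern
      callback(null, rewritten.slice(0, -hold)); // safe to emit now
    },
    flush(callback) {
      callback(null, tail); // input is done; release the held text
    }
  });
}

Swapping this in for the plain Transform inside streamingTransformMiddleware keeps the rest of the wiring unchanged.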

You can also pipe incoming proxy streams (e.g., via http-proxy) directly through this transform for on-the-fly rewriting.
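
A rough sketch of that setup, assuming the http-proxy package and a placeholder upstream URL; note the Accept-Encoding override, since rewriting text inside a gzip-compressed upstream body would corrupt it:


const httpProxy = require('http-proxy');

const proxy = httpProxy.createProxyServer({});

app.use(streamingTransformMiddleware((chunk) => chunk.replace(/dog/g, 'cat')));

app.use((req, res) => {
  // ask the upstream for uncompressed content so the text
  // rewrite doesn't run over gzip bytes
  req.headers['accept-encoding'] = 'identity';
  proxy.web(req, res, { target: 'http://upstream.example.com' });
});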

Step 4: Bonus – Add Compression in the Same Stream

Need gzip on top? Just add another transform layer using zlib:


const zlib = require('zlib');

// Inside the middleware, replace the direct data/end wiring from
// Step 2 with a gzip stage:
const gzip = zlib.createGzip();

transformStream.pipe(gzip);
gzip.on('data', (chunk) => originalWrite(chunk));
gzip.on('end', () => originalEnd());

// ...and advertise the new encoding before headers go out
// (Content-Length was already stripped in Step 2)
onHeaders(res, () => {
  res.setHeader('Content-Encoding', 'gzip');
});
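
In a real app you’d also check the client’s Accept-Encoding header before forcing gzip. Express’s compression middleware handles that negotiation and already streams, so a hand-rolled gzip stage is mainly worthwhile when you’re stacking it with custom transforms like the rewriter above.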

This allows stacked streaming transforms, such as:

  • HTML injection
  • Content rewriting
  • Minification
  • Compression

All in a single pass, fully streamed.
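
Putting it all together, here is a sketch of a complete app (the port and log path are placeholders) that streams a large file through the rewriting pipeline:


const express = require('express');
const fs = require('fs');

const app = express();

app.use(streamingTransformMiddleware((chunk) => chunk.replace(/dog/g, 'cat')));

app.get('/logs', (req, res) => {
  res.setHeader('Content-Type', 'text/plain');
  // each chunk of the file flows through the transform as it's read
  fs.createReadStream('./app.log').pipe(res);
});

app.listen(3000);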

Pros: