Month: April 2014

Node Background Processing Awesomeness

Unless I’m misunderstanding something, a single Node.js callback can replace an entire system architecture for handling asynchronous background processing.

Suppose you have a webapp route that initiates a heavy background task: POST ‘/do_heavy_lifting’

You don’t want your users to wait forever while you process; that’s sucky UX and it ties up your app’s precious resources. So you might drop an event onto a message queue (say, RabbitMQ) and write a worker that listens to that queue and performs the heavy lifting in the background. Now you’ve got to maintain the worker, the queue, and the event, and keep all the different parts in sync… This only becomes an issue once you reach a certain scale, but once you do, it really is an issue.
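To make the moving parts concrete, here is a rough sketch of that setup. It is only an illustration: an in-process EventEmitter stands in for the real queue (RabbitMQ would be a separate service reached over the network), and the names queue, publish, and results are made up for this example.

```javascript
var EventEmitter = require('events').EventEmitter;

// Stand-in for the message queue. This is an assumption for illustration;
// a real broker would live outside the web app's process.
var queue = new EventEmitter();

// Somewhere to record what the worker did, so the effect is visible.
var results = [];

// The worker: listens on the queue and performs the heavy lifting.
queue.on('heavy_lifting', function (job) {
  results.push('processed job ' + job.id);
});

// The web app side: publish an event describing the work, then move on.
function publish(job) {
  // setImmediate defers delivery, so the publisher returns before the
  // worker has touched the job.
  setImmediate(function () {
    queue.emit('heavy_lifting', job);
  });
}
```

Even in toy form there are three things to keep in agreement: the event name, the shape of the job, and the worker that consumes it.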

However, there is a better way.

As you surely know, you can use Node.js’s callback awesomeness to perform heavy background processing (codename “I/O”) without blocking the code execution flow. Better yet, nothing forces you to wait for the I/O to finish before responding: you can respond to the user right away, freeing the connection, while the background processing finishes in its own time!

This lets you kick off arbitrarily heavy I/O-bound background tasks while responding to the client immediately and finishing the HTTP response. Holy smokes!

Sample code, using a file write as the example of heavy lifting (note the response.end outside the callback):

// Require what we need
var http = require('http');
var fs = require('fs');

// Timestamp helper: returns the current time as H:M:S.ms
function now(){
    var d = new Date();
    return d.getHours()+':'+d.getMinutes()+':'+d.getSeconds()+'.'+d.getMilliseconds();
}

var app = http.createServer(function(request, response) {
  response.writeHead(200, { 'Content-Type': 'text/plain'});

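  // Start the "heavy lifting"; the callback runs once the write completes
  fs.writeFile('baz', 'baz contents: '+now(), function(err) {
    if (err) { console.log('write failed: '+err.message); return; }
    console.log('created file. Now running arbitrary background code');
    console.log('before Second Response:'+now());
    response.end('Redundant response: '+now()+'\n');
  });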

  // Respond right away, without waiting for the write to finish
  console.log('before First Response :'+now());
  response.end('First Response : '+now()+'\n');
});

app.listen(1337, 'localhost');
console.log('Server running at http://localhost:1337/');

Copy the above into foo.js, run $ node foo.js, and (in a separate terminal) $ curl localhost:1337. Notice:

  1. You will always get ‘First Response’, which is expected.
  2. The ‘redundant response’ is never used. In fact we can comment it out; same behavior. (It’s redundant in our case because it’s a background operation, and we don’t need it to finish in order to respond to the user.)
  3. The I/O operation does get executed, as well as its callback (which can be arbitrarily extended).
  4. Under ‘arbitrary background code’, we can now execute whatever we want: the client has already received its response.
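
One caveat with point 4: by the time the callback runs, there is nobody left to report errors to, so the background code has to deal with its own failures. Here is a minimal sketch of how that might be wrapped up, assuming a hypothetical helper called respondThenRun (a made-up name, not part of Node.js) and an optional log function that defaults to console.log:

```javascript
// Hypothetical helper (not part of Node.js): answer the client first,
// then run a callback-style background task and log how it went.
function respondThenRun(response, body, task, log) {
  log = log || console.log;

  // Respond without waiting for the task.
  response.end(body);

  // Kick off the background task. It receives a Node-style "done"
  // callback and is expected to call it when the work is over.
  task(function (err) {
    // The response has already been sent, so a failure can only be
    // logged (or retried, or recorded somewhere), not reported back.
    if (err) {
      log('background task failed: ' + err.message);
    } else {
      log('background task finished');
    }
  });
}

// Usage sketch inside a server, reusing the 'baz' file from the sample:
//   http.createServer(function (request, response) {
//     response.writeHead(200, { 'Content-Type': 'text/plain' });
//     respondThenRun(response, 'Accepted\n', function (done) {
//       fs.writeFile('baz', 'baz contents', done);
//     });
//   }).listen(1337, 'localhost');
```

Passing the logger in as a parameter is just a convenience for trying the helper out with a fake response object; in a real app you would probably log wherever you already do.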

You have just implemented a full-blown asynchronous background processing paradigm, with about a line and a half of code and some Node.js-Fu.
