Socket.io & nodejs: at a medium pace


In my last blog entry, I shared some nodejs code to read CouchDB's _changes feed and publish the data to a website. In order to update the page in a continuous fashion, I used socket.io, which provides a nifty abstraction across server-to-client transports, for example websockets and ajax long polling.

Full-throttle

When we tested the code for a few days over the weekend, the largest issue we ran into was that the stream moved too fast. In fact, it moved so fast that we couldn't read anything and were at risk of having a seizure when we watched the page for too long.

Certainly awesome from one point of view — people are using the website — but it also led to the next objective: I had to find a way to throttle broadcasting to the client. Here's how!

Decisions, decisions!

So in the first iteration of the project we read _changes with ?include_docs=true.

This combination makes it incredibly easy to get all data from BigCouch/CouchDB. However, when you're not looking to broadcast all data, but only a fraction of it at a set interval, it becomes pointless to put the extra burden on the cluster that ?include_docs=true brings along.

Moving forward, we decided to not use ?include_docs=true anymore (Hat tip to Adam of Cloudant for consultation!) and instead use the IDs provided by _changes and request individual documents on demand.
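To make the difference concrete, here is a minimal sketch of what a row from _changes looks like without ?include_docs=true, and how its ID can be turned into a request for a single document. The helper name docUrl and the sample row values are made up for illustration; the host and database name match the ones used in the code further down.

```javascript
// A row from _changes without include_docs carries only metadata.
// The values here are invented sample data.
var row = { seq: 12, id: "article/42", changes: [{ rev: "1-abc" }] };

// Hypothetical helper: build the URL for a single document.
function docUrl(base, db, id) {
    // IDs may contain characters such as "/" that must be escaped.
    return base + "/" + encodeURIComponent(db) + "/" + encodeURIComponent(id);
}

console.log(docUrl("http://127.0.0.1:5984", "db", row.id));
// http://127.0.0.1:5984/db/article%2F42
```

The full document is then only fetched for the rows that actually get broadcast.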

The interval to broadcast data to the realtime stream for the user is set to 500 milliseconds, which works out to at most two document requests per second.
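The throttling idea itself, keeping only the most recent change and publishing it once per tick, can be sketched in isolation. This is a simplified illustration and not the project's code: createThrottle, offer and tick are hypothetical names, and the timer is driven by hand so the behaviour is easy to follow.

```javascript
// Hypothetical helper: remembers only the most recent value and hands it
// to `publish` at most once per tick.
function createThrottle(publish) {
    var latest = null;
    return {
        // Called for every incoming change; older pending values are dropped.
        offer: function (value) {
            latest = value;
        },
        // Called by the timer; does nothing when no new value has arrived.
        tick: function () {
            if (latest === null) {
                return false;
            }
            var value = latest;
            latest = null;
            publish(value);
            return true;
        }
    };
}

var sent = [];
var throttle = createThrottle(function (value) { sent.push(value); });

throttle.offer("a");
throttle.offer("b");
throttle.offer("c");
throttle.tick(); // publishes "c" only
throttle.tick(); // nothing pending, publishes nothing

console.log(sent); // [ 'c' ]
// In a real application the timer would drive it: setInterval(throttle.tick, 500);
```

Everything that arrives between two ticks is simply dropped, which is exactly what we want when the stream is too fast to read.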

At a medium pace

Here's the actual code — relevant bits only.

/**
 * @var options object Options for restler, including HTTP headers.
 */
var options = ...;

/**
 * @var restler object HTTP client
 */
var restler = ...;

/**
 * @var socket object socket.io
 */
var socket = ...;

/**
 * @var current null|object
 * @global
 */
var current = null; // must start as null, otherwise the first tick reads current.id of undefined

/**
 * @desc Run this at an interval.
 */
var broadCaster = setInterval(function() {
    if (current === null) {
        return;
    }

    // reset Connection header
    options.headers["Connection"] = "close";

    var req = restler.get("http://127.0.0.1:5984/db/" + current.id, options);
    req.on('complete', function(data, response) {
        current = {
            'title': data.title,
            'article': data.body,
            'date': data.saved
        };
        socket.broadcast(current);
        current = null;
    });
}, 500);

/* standard nodejs client code here to do a _changes request */

var json;
response.on('data', function (chunk) {
    json = JSON.parse(chunk);
    current = { 'id': json.id };
});

/* ... */

So all in all, I hope this is straightforward!

  1. We need restler and some options to do the individual requests from setInterval.
  2. We also need socket, which is a socket.io setup (see my previous blog entry).
  3. current is a global variable to hold the ID from the last chunk processed.
  4. In setInterval we request the full document from the database.
  5. current is reset after it has been broadcast.
  6. current is populated in the response from the request to _changes.
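One caveat about step 6: the code above parses every data chunk as exactly one JSON object. Assuming the feed is requested with feed=continuous, CouchDB sends one JSON object per line plus empty lines as heartbeats, and a chunk is not guaranteed to line up with those line boundaries. A minimal sketch of a more defensive parser follows; createLineParser is a made-up name and the rows are sample data.

```javascript
// Hypothetical helper: collects chunks and calls onRow once per complete line.
function createLineParser(onRow) {
    var buffer = "";
    return function (chunk) {
        buffer += chunk.toString();
        var lines = buffer.split("\n");
        // The last element is an incomplete line (or ""), keep it for later.
        buffer = lines.pop();
        lines.forEach(function (line) {
            if (line.trim() === "") {
                return; // heartbeat, nothing to parse
            }
            onRow(JSON.parse(line));
        });
    };
}

var ids = [];
var parse = createLineParser(function (row) { ids.push(row.id); });

// Sample data: one row split across two chunks, a heartbeat, a second row.
parse('{"seq":1,"id":"a","chan');
parse('ges":[]}\n\n{"seq":2,"id":"b","changes":[]}\n');

console.log(ids); // [ 'a', 'b' ]
```

In the code above, the function returned by createLineParser would be passed to response.on('data', ...) with a callback that sets current to { 'id': row.id }.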

Fin

I hope this little example illustrates how to throttle your application a little.
