Thinking Asynchronously in CoffeeScript/JavaScript: Loops and Callbacks

A while back, I wrote about my first experience learning Node.js: A Node.js Experiment: Thinking Asynchronously, Using Recursion to Calculate the Total File Size in a Directory.

Consider this snippet of code:

var names = ['JP', 'Chris', 'Leslie'];
for (var i = 0; i < names.length; ++i){
  var name = names[i];
  setTimeout(function(){
    alert(name);              
  },10);
}

Equivalent CoffeeScript:

names = ['JP', 'Chris', 'Leslie']
for name in names
  setTimeout(->
    alert(name)
  ,10)

Click here to run it.

If you guessed that the loop would alert “Leslie” three times, then you’d be correct.

The problem is that the loop has already completed before the first callback executes. Every callback shares the same name variable, so each one sees its last value.

How do you solve this problem? You wrap the callback in a closure that executes immediately.

JavaScript:

var names = ['JP', 'Chris', 'Leslie'];
for (var i = 0; i < names.length; ++i){
  var name = names[i];
  (function(name){
    setTimeout(function(){
      alert(name);              
    },10);
  })(name);
}

CoffeeScript:

names = ['JP', 'Chris', 'Leslie']
for name in names
  do (name) ->
    setTimeout(->
      alert(name)
    ,10)

Click here to run it.

These solutions start every callback in parallel. The alerts are a poor way to demonstrate this, but if you were opening files instead, all of them would be opened at approximately (not exactly) the same time.

What if you wanted to perform the action in the callback in a serial manner?

Using the previous simple example, it’d look like this:

JavaScript:

var names = ['JP', 'Chris', 'Leslie'];
var loop = function(i){
    setTimeout(function(){
      alert(names[i]);
      if (i < names.length - 1)
        loop(i + 1);       
    },10);
}
loop(0);

CoffeeScript:

names = ['JP', 'Chris', 'Leslie']
doloop = (i) ->
  setTimeout(->
    alert(names[i])
    if i < names.length - 1
      doloop(i + 1)
  ,10)
doloop(0)

Run it.

If you were doing file processing in the loop, it would be executed sequentially.
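The serial pattern above can be wrapped in a small reusable helper. This is only a minimal JavaScript sketch: eachSeries and its (item, next) worker signature are hypothetical names written for this example, not something the code above defines.

```javascript
// Hypothetical helper: run `worker` on each item, one at a time.
// `worker(item, next)` must call `next()` when it is finished, or `next(err)` on failure.
function eachSeries(items, worker, done) {
  var i = 0;
  function next(err) {
    if (err || i >= items.length) {
      return done(err || null);
    }
    var item = items[i++];
    worker(item, next);
  }
  next();
}

// Usage: each timeout is only scheduled after the previous one has fired.
var order = [];
eachSeries(['JP', 'Chris', 'Leslie'], function (name, next) {
  setTimeout(function () {
    order.push(name);
    next();
  }, 10);
}, function (err) {
  console.log(order.join(', ')); // JP, Chris, Leslie
});
```

Note that this sketch assumes the worker is asynchronous. A worker that calls next() synchronously on a very long array would recurse deeply.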

Hopefully this helps you to better understand asynchronous design of algorithms in JavaScript.

Update:
I forgot about the forEach function that exists in Node.js and most modern browsers. This function pretty much solves the problem.

Here’s the JavaScript code:

var names = ['JP', 'Chris', 'Leslie'];
names.forEach(function(name){
  setTimeout(function(){
    alert(name);              
  },10);
});

Much cleaner. Thanks to smog_alado [Reddit] for the reminder.
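On engines that support ES2015, block-scoped let declarations are another option, because each loop iteration gets its own binding. A minimal sketch, assuming such an engine; report is a hypothetical stand-in for alert() so that the snippet also runs under Node.js:

```javascript
// `report` is a hypothetical replacement for alert(): it records and logs each name.
var seen = [];
function report(name) {
  seen.push(name);
  console.log(name);
}

var names = ['JP', 'Chris', 'Leslie'];
for (let i = 0; i < names.length; ++i) {
  // `let` creates a fresh `name` for every iteration, so no wrapper function is needed.
  let name = names[i];
  setTimeout(function () {
    report(name); // JP, then Chris, then Leslie
  }, 10);
}
```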

Check out Gitpilot, a different kind of Git GUI.

Follow me on Twitter: @jprichardson

-JP

Why Do All the Great Node.js Developers Hate CoffeeScript?

Why do all the great Node.js developers hate CoffeeScript?

Take a look at the following GitHub repositories of the well-known Node.js developers:

Did you look at them? Not one of them has a project (that isn’t forked) that is written in CoffeeScript. So does the absence of CoffeeScript on GitHub imply these developers hate it? Absolutely not. But listen to episode 18 or 19 of Nodeup (I don’t remember which one): there are a couple of instances where they (expert Node.js devs) joke and laugh about writing in CoffeeScript. Is this offensive? Of course not. But the attitude is curious to me.

One of the aforementioned developers said the following about a technology:

What if we could omit braces? How about semi-colons?

Sounds like the developer is talking about CoffeeScript, doesn’t it? No, it was TJ Holowaychuk describing Stylus, his CSS replacement language. Look at Stylus, look how CoffeeScript-esque it is. This is the same TJ that doesn’t like CoffeeScript. This is meant to be partially tongue-in-cheek, but it does lend credence to my point.

Can you guess what the second most depended-upon package is on NPM? If you guessed CoffeeScript, you’d be right!

So if it’s the second most depended-upon package, it must be in use by us mere mortal developers. Having defected from Rails, I love CoffeeScript. But, I ask again, why do the greats have a haughty attitude towards CoffeeScript? This isn’t meant to be a crusade to convert people to the holier-than-thou CoffeeScript; I genuinely don’t understand why the disdain exists, especially given the acceptance of Haml, SASS, SCSS, Jade, etc. I mean, when it comes down to it, write in whatever makes you happy, but I feel like I’m missing something. If you’re part of the Node.js community, you’ll know what I’m talking about.

Looking over the CoffeeScript page, I think that you can safely conclude that, in general, you’ll write fewer lines of code using CoffeeScript. Code is our enemy, so that’s a good thing.

What do you think about CoffeeScript? Why do you think these developers don’t like CoffeeScript?

More fun CoffeeScript hatred:

-JP

Comparing Two JavaScript Objects

Recently, I was faced with a problem where I needed to compare two JavaScript objects. My initial strategy was to convert them to JSON and compare the JSON strings.

Sort of like this:

var a = JSON.stringify(person1);//'{"firstName":"JP","lastName":"Richardson"}'
var b = JSON.stringify(person2);//'{"firstName":"JP","lastName":"Richardson"}'

assert(a === b);

Simple enough, right?

Not so fast. I encountered a case like this:

var a = JSON.stringify(person1);//'{"firstName":"JP","lastName":"Richardson"}'
var b = JSON.stringify(person2);//'{"lastName":"Richardson","firstName":"JP"}'

assert(a === b); // fails: same data, different key order

The data is the same, but the string is different. Fortunately, Stack Overflow had a nice JavaScript object comparison algorithm to dump into my app.

Object.prototype.equals = function(x)
{
  var p;
  for(p in this) {
      if(typeof(x[p])=='undefined') {return false;}
  }

  for(p in this) {
      if (this[p]) {
          switch(typeof(this[p])) {
              case 'object':
                  if (!this[p].equals(x[p])) { return false; } break;
              case 'function':
                  if (typeof(x[p])=='undefined' ||
                      (p != 'equals' && this[p].toString() != x[p].toString()))
                      return false;
                  break;
              default:
                  if (this[p] != x[p]) { return false; }
          }
      } else {
          if (x[p])
              return false;
      }
  }

  for(p in x) {
      if(typeof(this[p])=='undefined') {return false;}
  }

  return true;
}

Test passed. I eventually hit a situation where I had some code with an Object that had a Person prototype and some data that came from JSON. Kinda like this:

var person1 = new Person('JP', 'Richardson');
var person2 = JSON.parse('{"firstName":"JP","lastName":"Richardson"}');

//equals is the code snippet above ^
person1.equals(person2); // <--- THIS FAILS

I only cared about comparing the data. The methods associated with the object (Prototype) didn’t matter. Let’s modify the above algorithm. I use CoffeeScript. Here’s the modification:

Object::jsonEquals = (x) ->
  #we do this because two objects may have the same data fields and data but different prototypes
  x1 = JSON.parse(JSON.stringify(this))
  x2 = JSON.parse(JSON.stringify(x))

  p = null
  for p of x1
    return false if typeof (x2[p]) is 'undefined'
  for p of x1
    if x1[p]
      switch typeof (x1[p])
        when 'object'
          return false unless x1[p].jsonEquals(x2[p])
        when 'function'
          return false if typeof (x2[p]) is 'undefined' or (p isnt 'equals' and x1[p].toString() isnt x2[p].toString())
        else
          return false  unless x1[p] is x2[p]
    else
      return false if x2[p]
  for p of x2
    return false if typeof (x1[p]) is 'undefined'
  true

This causes the situation I described above to pass. Essentially, converting to JSON and back removes the prototype. I suppose you could make this more efficient by just manually setting the prototype to Object before doing the comparison, but oh well, this works for the time being.
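The same data-only comparison can also be written as a standalone function that leaves Object.prototype alone. This is a rough JavaScript sketch, not the Stack Overflow algorithm: dataEquals and dataKeys are hypothetical names, and the Person constructor is an assumed shape, since the original one isn't shown here. It compares own, non-function properties and ignores key order. Unlike the JSON round trip, it does not normalize values such as Date objects or undefined.

```javascript
// Own enumerable keys whose values are not functions.
function dataKeys(obj) {
  return Object.keys(obj).filter(function (key) {
    return typeof obj[key] !== 'function';
  });
}

// Compare two values by their data only: prototypes and methods are ignored.
function dataEquals(a, b) {
  if (a === b) return true;
  if (a === null || b === null || typeof a !== 'object' || typeof b !== 'object') {
    return false;
  }
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  var keysA = dataKeys(a);
  var keysB = dataKeys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(function (key) {
    return Object.prototype.hasOwnProperty.call(b, key) &&
      typeof b[key] !== 'function' &&
      dataEquals(a[key], b[key]);
  });
}

// Assumed shape of Person, for illustration only.
function Person(firstName, lastName) {
  this.firstName = firstName;
  this.lastName = lastName;
}
Person.prototype.fullName = function () {
  return this.firstName + ' ' + this.lastName;
};

var person1 = new Person('JP', 'Richardson');
var person2 = JSON.parse('{"lastName":"Richardson","firstName":"JP"}');
console.log(dataEquals(person1, person2)); // true
```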

Do you use Git? If so, check out Gitpilot to make project management and collaborating on projects seamless.

Follow me on Twitter: @jprichardson and read my blog on entrepreneurship: Techneur.

-JP Richardson

Synchronous File Copy in Node.js

Sometimes, asynchronous operations can be a burden, especially when you’re writing small console utilities such as batch file processors.

There are many asynchronous ways to copy a file. Here is a synchronous version (CoffeeScript):

fs = require('fs')

copyFileSync = (srcFile, destFile) ->
  BUF_LENGTH = 64*1024
  # Buffer.alloc replaces the deprecated `new Buffer(size)` constructor
  buff = Buffer.alloc(BUF_LENGTH)
  fdr = fs.openSync(srcFile, 'r')
  fdw = fs.openSync(destFile, 'w')
  bytesRead = 1
  pos = 0
  while bytesRead > 0
    bytesRead = fs.readSync(fdr, buff, 0, BUF_LENGTH, pos)
    fs.writeSync(fdw,buff,0,bytesRead)
    pos += bytesRead
  fs.closeSync(fdr)
  fs.closeSync(fdw)

You can view the converted version in JavaScript.
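For reference, here is roughly what the same chunked copy looks like in plain JavaScript. It is a hand-written sketch rather than the compiler's output, and it adds a try/finally so the descriptors are closed even if a read or write throws.

```javascript
var fs = require('fs');

// Copy srcFile to destFile synchronously, 64 KiB at a time.
function copyFileSync(srcFile, destFile) {
  var BUF_LENGTH = 64 * 1024;
  var buff = Buffer.alloc(BUF_LENGTH);
  var fdr = fs.openSync(srcFile, 'r');
  var fdw = fs.openSync(destFile, 'w');
  try {
    var pos = 0;
    var bytesRead;
    // readSync returns 0 at end of file, which ends the loop.
    while ((bytesRead = fs.readSync(fdr, buff, 0, BUF_LENGTH, pos)) > 0) {
      fs.writeSync(fdw, buff, 0, bytesRead);
      pos += bytesRead;
    }
  } finally {
    fs.closeSync(fdr);
    fs.closeSync(fdw);
  }
}
```

Newer versions of Node.js also ship a built-in fs.copyFileSync, which is the simpler choice when it is available.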

-JP Richardson

A Node.js Experiment: Thinking Asynchronously, Using Recursion to Calculate the Total File Size in a Directory

I recently picked up Node.js/CoffeeScript; I figured that since JavaScript can run on about every modern computing device, it’s about time that I accept JavaScript instead of side-stepping it by using dying technologies such as GWT and Silverlight.

I’ve always felt that the best way to learn a new language/platform is to start by writing a simple program that solves a simple problem.

My problem involved traversing the filesystem and performing some tasks. For the sake of this blog post and for the sake of your attention span, the problem can be reduced to a simple algorithm that computes the total space that a directory and its contents use.

Let’s start by creating a simple synchronous version:

fs = require('fs')
path = require('path')

du = (dir) ->
  total = 0
  try 
    stat = fs.lstatSync(dir)
    if stat.isFile()
      total += stat.size
    else if stat.isDirectory()
      files = fs.readdirSync(dir)
      for file in files
        total += du(path.join(dir, file))
  catch e
  
  total

DIR = '/'
total_bytes = du(DIR)
total_kb = total_bytes / 1024.0
total_mb = total_kb / 1024.0

console.log("#{DIR}: #{total_mb.toFixed(3)} MB")

This code works fine and as expected. It displays the total size of your entire directory in MiB. Ya, I know, I wrote “MB”.

But… we are using Node.js here. The asynchronous nature should be embraced. Let’s rewrite this algorithm in an asynchronous form.

fs = require('fs')
path = require('path')

duAsync = (dir, cb) ->
  total = 0
  fs.lstat dir, (err, stat) ->
    if err then return
    if stat.isFile()
      total += stat.size
    else if stat.isDirectory()
      fs.readdir dir, (err, files) ->
        if err then return
        for file in files
          duAsync path.join(dir,file), cb
    cb(null,total)

DIR = '/'
duAsync DIR, (err, total_bytes) ->
  total_kb = total_bytes / 1000.0
  total_mb = total_kb / 1000.0

  console.log("#{DIR}: #{total_mb.toFixed(3)} MB")

Hmm, this doesn’t output the correct values. I’m not passing the totals up the callback chain.

Also, from here on out, I’m only going to show the algorithm.

Let’s take advantage of closures and modify this a bit. If we could remove the recursion, that may simplify things a bit.

duAsync2 = (dir,cb) ->
  total = 0
  all_files = []
  all_files.push(dir)

  while all_files.length > 0
    current_dir = all_files.pop()
    fs.lstat current_dir, (err,stat) ->
      if err then return
      if stat.isFile()
        total += stat.size
      else if stat.isDirectory()
        fs.readdir current_dir, (err,files) ->
          if err then return
          for file in files
            all_files.push(path.join(current_dir, file))
      cb(null,total)

On the surface, this looks fairly simple. We have removed the recursive aspect to simplify it a bit. The code in the while block will always see ‘total’ so we don’t run into the same problem as the last implementation.

One major problem, though: this doesn’t work. It exits almost right away. Ah yes… we are doing an asynchronous implementation. The while loop runs synchronously, so the all_files array is already empty by the time the loop checks it for a second iteration; none of the callbacks that would refill it have run yet.
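The ordering problem is easy to see in isolation. In this small JavaScript sketch, the events array and the '.' starting directory are assumptions made for the demonstration. The loop drains its queue and exits before the single readdir callback gets a chance to run:

```javascript
var fs = require('fs');

var events = [];
var pending = ['.']; // assumed starting directory: the current working directory

while (pending.length > 0) {
  var dir = pending.pop();
  events.push('loop: requested ' + dir);
  fs.readdir(dir, function (err, files) {
    // Runs only after all of the synchronous code below has finished,
    // so anything pushed onto `pending` here would never be seen by the loop.
    events.push('callback: readdir returned');
    console.log(events.join('\n'));
  });
}
events.push('loop: exited');
```

The log shows "loop: requested ." and "loop: exited" first, with the callback line last.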

Maybe recursion is unavoidable? Let’s still leverage closures though.

This version is very similar to the last, I’ve just managed to use recursion within a function. The ‘again’ function is called recursively.

duAsync3 = (dir,cb) ->
  total = 0

  again = (current_dir) ->
    fs.lstat current_dir, (err, stat) ->
      if err then return
      if stat.isFile()
        total += stat.size
      else if stat.isDirectory()
        fs.readdir current_dir, (err,files) ->
          if err then return
          for file in files
            again(path.join(current_dir, file))
      cb(null, total)

  again(dir)

It works, in the sense that the callback fires once per entry with the running total. Consider this: what if you only want the results at the very end? That is, you only want the callback to occur once, and at the end… then what do you do?

This was a dilemma that I faced for a bit. For this particular problem, it might not really matter much, especially considering that this is a console utility. However, I considered figuring this out a rite of passage as a Node.js/JavaScript noob, so I didn’t want to use any utilities such as Async.js, Seq, etc.

I started doing research. Fortunately, I stumbled upon two great articles:

  1. “Asynchronous JavaScript: The Tale of Harry”
  2. Currying the Callback the Essence of Futures

The first article seemed to have an almost identical problem, except that the author didn’t impose the additional constraint of executing the callback only upon completion. The solution in that article works as expected, but seems a bit more complex than necessary.

I kept researching and found an article, Deriving the Y-Combinator in 7 Easy Steps (JavaScript). My mind was exploding learning some of these functional programming concepts!

But, I still wasn’t closer to a solution. I finally made my way into #node.js on freenode (IRC). Fortunately, AvianFlu was able to lend me a tip. He suggested the following:

  • Create three variables: started, finished, running
  • At the beginning of the callback, increment started and running.
  • At the end, decrement running and increment finished.
  • When (started === finished) && (running === 0), you should be done.

I experimented with this for a while. Sometimes it felt like I was close, but it never quite worked. Then I thought about it a bit more, kept the concept of a ‘running’ variable, and added a variable to denote the number of files left to process.

duAsync4 = (dir,cb) ->
  total = 0
  file_counter = 1 #starts at one because of the initial directory
  async_running = 0

  # Invoke the callback once nothing is queued and no readdir is in flight.
  done = ->
    cb(null, total) if file_counter is 0 and async_running is 0

  again = (current_dir) ->
    fs.lstat current_dir, (err, stat) ->
      file_counter--
      return done() if err
      if stat.isFile()
        total += stat.size
      else if stat.isDirectory()
        async_running++
        fs.readdir current_dir, (err, files) ->
          async_running--
          return done() if err
          file_counter += files.length
          for file in files
            again path.join(current_dir, file)
          done() # an empty or unreadable directory may be the last thing to finish
      done()

  again dir

This works. What’s important to note is that there are many ways to solve problems using Node.js. On my quad-core MBP with 8 GB of RAM, this is almost twice as fast as the synchronous version!
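As one of those other ways, the two counters can be collapsed into a single count of outstanding asynchronous calls. The following is a JavaScript sketch of that idea, not a translation of the code above; duOnce, visit, and finishOne are hypothetical names. Like the earlier versions, it silently skips entries it cannot read.

```javascript
var fs = require('fs');
var path = require('path');

// Walk `dir` and call `cb(null, totalBytes)` exactly once, when every
// outstanding lstat/readdir call has reported back.
function duOnce(dir, cb) {
  var total = 0;
  var pending = 0; // number of async calls that have not called back yet

  function finishOne() {
    if (--pending === 0) cb(null, total);
  }

  function visit(p) {
    pending++; // one lstat in flight
    fs.lstat(p, function (err, stat) {
      if (err) return finishOne(); // unreadable entry: count it as done
      if (stat.isFile()) {
        total += stat.size;
      } else if (stat.isDirectory()) {
        pending++; // one readdir in flight
        fs.readdir(p, function (err, files) {
          if (!err) {
            files.forEach(function (file) {
              visit(path.join(p, file));
            });
          }
          finishOne(); // the readdir itself is done
        });
      }
      finishOne(); // the lstat itself is done
    });
  }

  visit(dir);
}
```

Calling duOnce(DIR, function (err, total_bytes) { ... }) would slot into the same driver code used for duAsync earlier. Because children are registered before their parent's readdir is marked finished, the counter cannot reach zero early, and an empty directory still triggers the callback.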

Try it out and let me know your results. Also, can you think of any other ways to solve this problem?

-JP Richardson
