A Node.js Experiment: Thinking Asynchronously, Using Recursion to Calculate the Total File Size in a Directory

I recently picked up Node.js/CoffeeScript. I figured that since JavaScript can run on just about every modern computing device, it’s about time that I accept JavaScript instead of side-stepping it by using dying technologies such as GWT and Silverlight.

I’ve always felt that the best way to learn a new language/platform is to start by writing a simple program that solves a simple problem.

My problem involved traversing the filesystem and performing some tasks. For the sake of this blog post and for the sake of your attention span, the problem can be reduced to a simple algorithm that computes the total space that a directory and its contents use.

Let’s start by creating a simple synchronous version:

fs = require('fs')
path = require('path')

du = (dir) ->
  total = 0
  try 
    stat = fs.lstatSync(dir)
    if stat.isFile()
      total += stat.size
    else if stat.isDirectory()
      files = fs.readdirSync(dir)
      for file in files
        total += du(path.join(dir, file))
  catch e
  
  total

DIR = '/'
total_bytes = du(DIR)
total_kb = total_bytes / 1024.0
total_mb = total_kb / 1024.0

console.log("#{DIR}: #{total_mb.toFixed(3)} MB")

This code works fine and as expected. It displays the total size of your entire directory in MiB. Ya, I know, I wrote “MB”.

But… we are using Node.js here. The asynchronous nature should be embraced. Let’s rewrite this algorithm in an asynchronous form.

fs = require('fs')
path = require('path')

duAsync = (dir, cb) ->
  total = 0
  fs.lstat dir, (err, stat) ->
    if err then return
    if stat.isFile()
      total += stat.size
    else if stat.isDirectory()
      fs.readdir dir, (err, files) ->
        if err then return
        for file in files
          duAsync path.join(dir,file), cb
    cb(null,total)

DIR = '/'
duAsync DIR, (err, total_bytes) ->
  total_kb = total_bytes / 1024.0
  total_mb = total_kb / 1024.0

  console.log("#{DIR}: #{total_mb.toFixed(3)} MB")

Hmm, this doesn’t output the correct values. I’m not passing the totals up the callback chain.

Also, from here on out, I’m only going to show the algorithm.

Let’s take advantage of closures and modify this a bit. If we could remove the recursion, that may simplify things a bit.

duAsync2 = (dir,cb) ->
  total = 0
  all_files = []
  all_files.push(dir)

  while all_files.length > 0
    current_dir = all_files.pop()
    fs.lstat current_dir, (err,stat) ->
      if err then return
      if stat.isFile()
        total += stat.size
      else if stat.isDirectory()
        fs.readdir current_dir, (err,files) ->
          if err then return
          for file in files
            all_files.push(path.join(current_dir, file))
      cb(null,total)

On the surface, this looks fairly simple. We have removed the recursive aspect to simplify it a bit. The code in the while block will always see ‘total’ so we don’t run into the same problem as the last implementation.

One major problem, though: this doesn’t work. It exits almost right away. Ah yes… we are doing an asynchronous implementation. The callbacks that would push more entries haven’t run yet, so the all_files array is empty by the time the while loop goes to the second iteration.
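
To make that failure concrete, here is a minimal sketch in plain JavaScript rather than CoffeeScript. It is not the code above: setImmediate stands in for fs.lstat, and the names queue and visited are made up for illustration.

```javascript
// The loop body only schedules work; the callback runs after the loop has exited.
const queue = ['/'];
const visited = [];

while (queue.length > 0) {
  const current = queue.pop();
  visited.push(current);
  // Stand-in for fs.lstat/fs.readdir: the callback fires on a later tick.
  setImmediate(() => {
    queue.push(current + 'child');
  });
}

// Only the starting entry was ever visited.
console.log(visited.length); // 1
```

The push does eventually happen, but by then nothing is looping over the queue any more.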

Maybe recursion is unavoidable? Let’s still leverage closures though.

This version is very similar to the last, I’ve just managed to use recursion within a function. The ‘again’ function is called recursively.

duAsync3 = (dir,cb) ->
  total = 0

  again = (current_dir) ->
    fs.lstat current_dir, (err, stat) ->
      if err then return
      if stat.isFile()
        total += stat.size
      else if stat.isDirectory()
        fs.readdir current_dir, (err,files) ->
          if err then return
          for file in files
            again(path.join(current_dir, file))
      cb(null, total)

  again(dir)

It works! Consider this: what if you only want the results at the very end? That is, you only want the callback to occur once, and at the end… then what do you do?

This was a dilemma that I faced for a bit. For this particular problem, it might not really matter much, especially considering that this is a console utility. However, I considered figuring this out a rite of passage as a Node.js/JavaScript noob, so I didn’t want to use any utilities such as Async.js, Seq, etc.

I started doing research. Fortunately, I stumbled upon two great articles:

  1. “Asynchronous JavaScript: The Tale of Harry”
  2. “Currying the Callback, or the Essence of Futures”

The first article seemed to describe an almost identical problem, except that the author didn’t impose the additional constraint of only executing the callback once everything has finished. The solution in that article works as expected, but seems a bit more complex than necessary.

I kept researching and found an article, “Deriving the Y-Combinator in 7 Easy Steps” (JavaScript). My mind was exploding learning some of these functional programming concepts!

But, I still wasn’t closer to a solution. I finally made my way into #node.js on freenode (IRC). Fortunately, AvianFlu was able to lend me a tip. He suggested the following:

  • Create three variables: started, finished, running
  • At the beginning of the callback, increment started and running.
  • At the end, decrement running and increment finished.
  • When (started === finished) && (running === 0) You should be done.

I experimented with this for a while. Sometimes it felt like I was close, but it never quite worked. Then I thought about it a bit more, kept the concept of a ‘running’ variable, and added a variable to denote the number of files left to process.

duAsync4 = (dir,cb) ->
  total = 0
  file_counter = 1 #starts at one because of the initial directory
  async_running = 0

  #fire the callback once, when nothing is left to process
  check_done = ->
    if file_counter is 0 and async_running is 0
      cb(null, total)

  again = (current_dir) ->
    fs.lstat current_dir, (err, stat) ->
      file_counter-- #this entry has been looked at, whatever the outcome
      if err then return check_done()
      if stat.isFile()
        total += stat.size
      else if stat.isDirectory()
        async_running++
        fs.readdir current_dir, (err,files) ->
          async_running--
          if err then return check_done()
          file_counter += files.length
          for file in files
            again path.join(current_dir, file)
          check_done() #covers empty directories
      check_done()

  again dir

This works. What’s important to note is that there are many ways to solve problems using Node.js. On my quad-core MBP with 8 GB of RAM, this is almost twice as fast as the synchronous version!

Try it out and let me know your results. Also, can you think of any other ways to solve this problem?
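
One other way, offered as a sketch rather than a recommendation: let promises do the counting. This is plain JavaScript rather than CoffeeScript, and it assumes a Node version recent enough to have fs.promises and async/await. The name duPromise is made up for illustration. Like the versions above, it treats anything it can’t read as zero bytes.

```javascript
const fs = require('fs');
const path = require('path');

// Resolves with the total size in bytes of `dir` and everything below it.
async function duPromise(dir) {
  let stat;
  try {
    stat = await fs.promises.lstat(dir);
  } catch (err) {
    return 0; // unreadable entry: count it as empty
  }
  if (stat.isFile()) return stat.size;
  if (!stat.isDirectory()) return 0; // symlinks, sockets, etc.

  let files;
  try {
    files = await fs.promises.readdir(dir);
  } catch (err) {
    return 0;
  }
  // Promise.all settles only after every child has finished,
  // so no manual bookkeeping of running operations is needed.
  const sizes = await Promise.all(
    files.map((file) => duPromise(path.join(dir, file)))
  );
  return sizes.reduce((sum, size) => sum + size, 0);
}

duPromise('.').then((totalBytes) => {
  const totalMb = totalBytes / 1024 / 1024;
  console.log('.: ' + totalMb.toFixed(3) + ' MB');
});
```

The result arrives exactly once, at the end, which is the constraint I was after. The trade-off is that every directory’s children are started at the same time, so I’d measure before assuming it’s faster than the counter version.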

Do you use Git? If so, checkout Gitpilot to make using Git thoughtless.

Follow me on Twitter: @jprichardson and read my blog on entrepreneurship: Techneur.

-JP Richardson

A for-loop conversion from JavaScript to CoffeeScript

I was stumbling along StackOverflow when I ran into this question:

How can I convert a Javascript for-loop to Coffee? With this example:

for (i = 0; i < 10; i++) {
    doStuff();
}

The answer: (I’ve since edited and updated the Stackoverflow answer)

doStuff() for i in [0 .. 9]

http://jashkenas.github.com/coffee-script/#loops

This answer works for this contrived case.

What happens when you have a case like this:

for (var x = 0; x < myArray.length; ++x)
  alert(x);

What’s the answer smarty pants?
If you answered:

alert(i) for i in [0 .. myArray.length-1]

You get a pass if myArray has anything in it. What will happen if myArray is an empty (not null) array?
Go ahead and execute it. Go here to the CoffeeScript page, and paste this code in the “Try CoffeeScript”, I’ll wait.

myArray = []
alert(i) for i in [0 .. myArray.length-1]

What’s that? You say it returned “0” and “-1”???

The secret is in the "..", which is inclusive: it compares with "<=" or ">=" depending on direction. The "..." is exclusive: it compares with "<" or ">".

So this is probably what you expect:

myArray = []
alert(i) for i in [0 ... myArray.length]

It’s too bad the loops section at this time doesn’t mention this.
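
To make the difference concrete, here is a rough sketch in plain JavaScript of how the two range forms behave. The range helper is hypothetical and written only for illustration; it is not what the CoffeeScript compiler emits.

```javascript
// Hypothetical helper: mimics [from .. to] (inclusive) and [from ... to] (exclusive).
function range(from, to, exclusive) {
  const out = [];
  if (from <= to) {
    // Counting up: inclusive uses <=, exclusive uses <
    for (let i = from; exclusive ? i < to : i <= to; i++) out.push(i);
  } else {
    // Counting down: inclusive uses >=, exclusive uses >
    for (let i = from; exclusive ? i > to : i >= to; i--) out.push(i);
  }
  return out;
}

const myArray = [];
console.log(range(0, myArray.length - 1, false)); // [ 0, -1 ]
console.log(range(0, myArray.length, true));      // []
```

With an empty array the inclusive form counts down from 0 to -1, which is where the two surprise alerts come from.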

CoffeeScript is a great tool. Here is another tool that allows you to convert JavaScript to CoffeeScript. Try our examples here and see what the tool generates. Hint: it isn’t always optimized.

-JP Richardson

Modifying $NODE_PATH for Node.js/NPM/NVM

I’ve been hacking around with Node.js lately. If you develop with Node.js then you should be using both NPM (package manager) and NVM (node version manager, like RVM).

For some reason, NVM doesn’t seem to set $NODE_PATH when you switch Node versions. $NODE_PATH needs to be set so that when you call ‘require’ from within your Node programs, it can find the relevant modules.
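
A quick way to see which extra directories Node has been told about is to split the variable yourself. This is a small sketch in plain JavaScript; nodePathEntries is a name made up for this post, not part of Node.

```javascript
const path = require('path');

// Splits a NODE_PATH-style string into its directories.
// path.delimiter is ':' on Mac OS X and Linux, ';' on Windows.
function nodePathEntries(value) {
  return (value || '').split(path.delimiter).filter(Boolean);
}

// Prints an empty array when NODE_PATH is not set.
console.log(nodePathEntries(process.env.NODE_PATH));
```

If the list comes back empty after switching versions with NVM, that is the situation described above.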

If you don’t set $NODE_PATH, that’s OK too. But you’ll find that in your project directory you’ll need to do a lot of linking. You can do this like so:

npm install -g express #the '-g' flag installs the module globally, into npm's node_modules directory
npm link express

Doing this is OK, but is a bit annoying at times because some modules still won’t work if you link with them, so you’re left installing it locally to your project’s directory.

I hacked up some bash commands to set my Node environment how I like it. This also allows me to make my own Node modules.

My ~/.bash_profile on Mac OS X Lion

[[ -s "$HOME/.rvm/scripts/rvm" ]] && . "$HOME/.rvm/scripts/rvm" # Load RVM function
PROJ=~/Dropbox/Projects
. ~/.nvm/nvm.sh

JP_NODE_PATH=/Users/jprichardson/Dropbox/Projects/Personal/js/node_modules
JP_NODE_BIN_PATH="${JP_NODE_PATH}/.bin"

NP=$(which node)
BP=${NP%bin/node} #this strips the trailing 'bin/node', leaving the install prefix (with its trailing slash)
LP="${BP}lib/node_modules"

export PATH="$PATH:$JP_NODE_BIN_PATH"
export NODE_PATH="$JP_NODE_PATH:$LP"

When I run `which node`, it captures the path of my default Node version. You set the default like this:

nvm alias default 0.4

This will use the latest 0.4 release, which is 0.4.12.

Again, I’m not a bash hacker by any means. I didn’t even know why you need to use ‘export’. I didn’t know how to concatenate strings in a bash variable. Or, even how to run a command and capture the output in a variable. I also didn’t know how to modify strings in bash. What kind of hacker am I? ::sobs:: Hehehe.

-JP Richardson

Automating the Generation of iOS Push Notification Certificates

I have a large number (200+) of iOS applications. I needed to generate push notification certificates for each of them. If you’ve ever gone through the process, it can be a huge pain to do just a few. I needed to write a script to automate this process.

First, I created a ruby gem to access the Mac OS X Keychain App.

Next, I leveraged watir to actually log in to the iOS provisioning portal and upload/download the necessary files.

First, make sure that you’re using Mac OS X 10.7 (Lion) and that you have Chrome and Ruby 1.9.2 installed.

  1. Download chromedriver http://code.google.com/p/chromium/downloads/list
  2. Move the binary to /usr/local

Next, run these commands:

git clone git://github.com/jprichardson/GeneratePushCerts.git
gem install 'keychain_manager'
gem install 'watir-webdriver'
cd ./GeneratePushCerts

Modify the ‘config.example.yml’ file with your iOS developer/provisioning portal credentials. Rename the file to ‘config.yml’.

You’ll want to edit ‘app.rb’ and modify the END_WITH variable and set it to '' (an empty string). I should have created a regular expression around line 60.

Run the app ‘ruby app.rb’.

Let’s take a look at some watir code from the file:

  browser.checkbox(id: 'enablePush').click() #enable configure buttons
  browser.button(id: 'aps-assistant-btn-prod-en').click() #configure button

  Watir::Wait.until { browser.body.text.include?('Generate a Certificate Signing Request') }

  browser.button(id: 'ext-gen59').click() #on lightbox overlay, click continue

  Watir::Wait.until { browser.body.text.include?('Submit Certificate Signing Request') }

  browser.file_field(name: 'upload').set(CERT_REQUEST_FILE)
  browser.execute_script("callFileValidate();")
  #browser.file_field(name: 'upload').click() #calls some local javascript to validate the file and enable continue button, unfortunately File Browse dialog shows up

  browser.button(id: 'ext-gen75').click()

  Watir::Wait.until(WAIT_TO) { browser.body.text.include?('Your APNs SSL Certificate has been generated.') }

  browser.button(id: 'ext-gen59').click() #continue

  Watir::Wait.until { browser.body.text.include?('Step 1: Download') }

  File.delete(DOWNLOADED_CERT_FILE) if File.exists?(DOWNLOADED_CERT_FILE)

  browser.button(alt: 'Download').click() #download cert

  puts('Checking for existence of downloaded certificate file...')
  while !File.exists?(DOWNLOADED_CERT_FILE)
    sleep 1
  end

  Watir::Wait.until { browser.body.text.include?("Download & Install Your Apple Push Notification service SSL Certificate") }

  browser.button(id: 'ext-gen91').click()
  browser.goto(APP_IDS_URL)

You can browse the entire source code on Github.

I think watir is pretty intuitive and is a Swiss Army knife for automating the interaction with web pages. It’s pretty slow since it’s meant for testing, so I wouldn’t use it for screen scraping, but it’s served me well for this task.

-JP Richardson

Automating the Mac OS X Keychain App with Ruby

Recently, I needed a way to automate the generation of 100+ Apple Push notification certificates for my iOS development. So, I created a Ruby Gem [keychain_manager] to automate the Mac OS X Keychain application.

Here is how you can use it:

gem install keychain_manager

require 'keychain_manager'

USER = 'jprichardson@gmail.com'
KEYCHAIN = 'apple_push_keychain' #this can be anything, we just don't want to pollute the 'login' keychain
YOUR_DOWNLOADS_DIR = '' # you must set this, this is where the file aps_production_identity.cer exists

RSA_FILE = '/tmp/myrsa.key'
KeychainManager.generate_rsa_key(RSA_FILE)

CERT_FILE = '/tmp/CertificateSigningRequest.certSigningRequest'
KeychainManager.generate_cert_request(USER, 'US', CERT_FILE) #'US' is the country abbreviation.

kcm = KeychainManager.new(KEYCHAIN)
kcm.delete if kcm.exists?
kcm.create

kcm.import_rsa_key(RSA_FILE)

#now from your browser, you'll have downloaded a file from Apple typically named: aps_production_identity.cer
kcm.import_apple_cert(File.join(YOUR_DOWNLOADS_DIR, '/aps_production_identity.cer'))

P12_FILE = '/tmp/push_prod.p12'
kcm.export_identites(P12_FILE)

PEM_FILE = '/tmp/push_prod.pem'
KeychainManager.convert_p12_to_pem(P12_FILE, PEM_FILE)

kcm.delete

#Now upload the PEM_FILE to your server.

This gem could easily be modified to support other Keychain functions. Browse the source code here: Keychain Manager source code.

I’ll post soon on how I automated the web portion of communicating with the iOS Provisioning Portal.

-JP Richardson

Using Mongoid with Rspec

I’m a fan of testing. I’m still trying to discipline myself to test before I write code, and that’s been a tough habit to develop. However, testing after I develop a feature is a good habit that has paid dividends. Lately I’ve started learning how to use Rspec for Rails.

Out of the box, Rspec is configured to use ActiveRecord. I’ve pretty much stopped using relational databases in favor of NoSQL solutions. My favorite NoSQL DB is MongoDB. The Ruby MongoDB ORM that I’ve been using is Mongoid.

To prep for Rspec, make your Gemfile look like this (assuming Rails 3.x):

gem 'rspec-rails', :group => [:test, :development]
group :test do
  gem 'database_cleaner'
end

Then run:

bundle install
rails g rspec:install

When I started learning how to use Rspec, I started having problems. The first was the following error message:

undefined method `fixture_path=' for # (NoMethodError)

Oops, I just needed to comment out this line in spec_helper.rb:

config.fixture_path = "#{::Rails.root}/spec/fixtures"

Was I safe yet? Nope. Another error:

undefined method `use_transactional_fixtures=' for # (NoMethodError)

Oops, I just needed to comment out this line in spec_helper.rb:

config.use_transactional_fixtures = true

I should have known, as there are comments in the file.

Now, when I run my specs, the changes to the database persist from one test to the next. Ideally, you want a clean database when you start each test. This is where the gem database_cleaner comes in handy.

Then add this to your spec_helper.rb file:

  config.before(:suite) do
    DatabaseCleaner[:mongoid].strategy = :truncation
  end

  config.before(:each) do
    DatabaseCleaner[:mongoid].start
  end

  config.after(:each) do
    DatabaseCleaner[:mongoid].clean
  end

That’s it!

FridayThe13th the Best JSON Parser for Silverlight and C#/.NET

Up until a couple of months ago I was writing most of my code using WPF. Recently, a project came up where Silverlight made more sense to use. I’d thought that wouldn’t be a problem since I’d just use JavaScriptSerializer [wrote about it here] like I did for my WPF project.

Uh oh. It turns out that Silverlight doesn’t have JavaScriptSerializer. Never fear! DataContractJsonSerializer is here! Or so I thought.

It turns out that if you want to use DataContractJsonSerializer you must actually create POCOs to back up this “data contract.” I didn’t want to do that.

I wanted to turn this…

{
	"some_number": 108.541,
	"date_time": "2011-04-13T15:34:09Z",
	"serial_number": "SN1234",
	"more_data": {
		"field1": 1.0,
		"field2": "hello"
	}
}

into..

using System.Web.Script.Serialization;

var jss = new JavaScriptSerializer();
var dict = jss.Deserialize<dynamic>(jsonText);

Console.WriteLine(dict["some_number"]); //outputs 108.541
Console.WriteLine(dict["more_data"]["field2"]); //outputs hello

So I set out to write my own JSON parser. I call it FridayThe13th… how fitting huh? Now, using either Silverlight or .NET 4.0, you can parse the previous JSON into the following:

using FridayThe13th;

var jsonText = File.ReadAllText("mydata.json");

var jsp = new JsonParser(){CamelizeProperties = true};
dynamic json = jsp.Parse(jsonText);

Console.WriteLine(json.SomeNumber); //outputs 108.541
Console.WriteLine(json.MoreData.Field2); //outputs hello

Since I work with a lot of Ruby on Rails backends, I added a “CamelizeProperties” property that turns “some_number” into “SomeNumber”… it’s more .NET-like.
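
The name transformation itself is simple. Here is a rough sketch of the idea in plain JavaScript; camelize is a hypothetical helper written for illustration and is not the FridayThe13th implementation.

```javascript
// Hypothetical helper: turns a snake_case key into a .NET-style property name.
function camelize(name) {
  return name
    .split('_')
    .filter(Boolean) // ignore empty pieces from leading or doubled underscores
    .map((part) => part[0].toUpperCase() + part.slice(1))
    .join('');
}

console.log(camelize('some_number')); // SomeNumber
console.log(camelize('more_data'));   // MoreData
```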

Try it! You can find it on Github. Oh yeah… it’s also faster than that other .NET JSON library that everyone uses.

-JP

Printing to a Postscript File from Delphi/C++ Builder

Unfortunately for me, I had to patch some legacy C++ Builder code so that reports normally sent to a printer were written to a PDF file instead. Fortunately, many printers support standard Postscript, and converting a Postscript file to a PDF is trivial.

Here are the instructions, assuming Windows XP:
1) Go to Printers/Faxes and Add New Printer
2) When the wizard pops up click “Next”
3) You should be on the screen “Add Printer Wizard”… select “Local Printer Attached”, uncheck “Automatically detect…”
4) Use port “LPT1”
5) Select “Apple” for the manufacturer and “Apple LaserWriter 12/640 PS” for the printer
6) In printer name I put “Apple PS”
7) For default, you can select “No”
8) Select “No” for sharing and for printing a test page.

Now we have a Postscript printer installed.

Here is some sample code:

void __fastcall TForm2::Button1Click(TObject *Sender)
{
    TDocInfo di;
 
    int i = Printer()->Printers->IndexOf("Apple PS");
    Printer()->PrinterIndex = i;
 
    Printer()->BeginDoc();
    EndPage(Printer()->Canvas->Handle);
    AbortDoc(Printer()->Canvas->Handle);
 
    memset(&di, 0, sizeof(di)); //zero the struct: memset takes (pointer, value, size)
    di.cbSize = sizeof(di);
    di.lpszDocName = "TEST";
    di.lpszOutput = "C:\\somefile.ps";
    StartDoc(Printer()->Canvas->Handle, &di);
    StartPage(Printer()->Canvas->Handle);
 
    Printer()->Canvas->TextOutA(0,0, "Testing Printer");
    Printer()->EndDoc();
}

-JP

Disable Sprockets for Rails in Development Mode

Sprockets (is/are)? the technology in Rails 3.1 and up that combines your CSS and JavaScript files into one CSS file and one JavaScript file. It remains to be seen if this was a positive change or not. The argument is that the client only has to make one HTTP request instead of one for each file; this is a good thing. However, if you change one line in one of your JavaScript files, the client then has to redownload the entire combined file instead of just the one JavaScript file that you modified.

Regardless, you may want to disable Sprockets while you’re developing your Rails application. This will aid in debugging.

Change…

<%= stylesheet_link_tag "application" %>
<%= javascript_include_tag "application" %>

to…

<%= stylesheet_link_tag "application", debug: Rails.env.development? %>
<%= javascript_include_tag "application", debug: Rails.env.development? %>

-JP

Missing DockPanel? Add DockPanel for Silverlight 4 or Silverlight 5

When I first created a demo project for Silverlight 5, I opened the XAML code to start editing. I started typing “DockPanel” but then I noticed that intellisense didn’t show anything. It turns out that DockPanel and some other controls aren’t included in the default installation of Silverlight 4 or Silverlight 5 beta.

Here’s what you need to do:

  1. Download the Silverlight 4 Toolkit. Install it. (Yes, this will work with Silverlight 5 beta)
  2. Add a reference to “System.Windows.Controls.Toolkit”. In Silverlight 5, you will need to navigate to the file:
    %ProgramFiles%\Microsoft SDKs\Silverlight\v4.0\Toolkit\Apr10\Bin\System.Windows.Controls.Toolkit.dll
  3. Add the following attribute to your UserControl: xmlns:tk="clr-namespace:System.Windows.Controls;assembly=System.Windows.Controls.Toolkit"
  4. So your code might look like:

    <UserControl x:Class="Project1.MainPage"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    	xmlns:tk="clr-namespace:System.Windows.Controls;assembly=System.Windows.Controls.Toolkit"
        mc:Ignorable="d"
        d:DesignHeight="300" d:DesignWidth="400">
    
        <Grid x:Name="LayoutRoot" Background="White">
    		<tk:DockPanel>
    			
    		</tk:DockPanel>
        </Grid>
    	
    </UserControl>
    

    Enjoy.

    -JP
