thirty-four things tagged “todo”

The Art of Node by maxogden

A fantastic introduction to Node (maybe for someone coming in from Python or Ruby land).

The Art of Node

An introduction to Node.js

This document is intended for readers who know at least a little bit of a couple of things:

  • a scripting language like JavaScript, Ruby, Python, Perl, etc. If you aren’t a programmer yet then it is probably easier to start by reading JavaScript for Cats.
  • git and GitHub. These are the open source collaboration tools that people in the node community use to share modules. You just need to know the basics. Here are three great intro tutorials: 1, 2, 3

Learn node interactively

In addition to reading this guide it’s super important to also bust out your favorite text editor and actually write some node code. I always find that when I just read some code in a book it never really clicks, but learning by writing code is a good way to grasp new programming concepts.

NodeSchool.io

NodeSchool.io is a series of free + open source interactive workshops that teach you the principles of Node.js and beyond.

Learn You The Node.js is the introductory NodeSchool.io workshop. It’s a set of programming problems that introduce you to common node patterns. It comes packaged as a command line program.

[image: learnyounode]

You can install it with npm:

# install
npm install learnyounode -g

# start the menu
learnyounode

Understanding node

Node.js is an open source project designed to help you write JavaScript programs that talk to networks, file systems or other I/O (input/output, reading/writing) sources. That’s it! It is just a simple and stable I/O platform that you are encouraged to build modules on top of.

What are some examples of I/O? Here is a diagram of an application that I made with node that shows many I/O sources:

[image: server diagram]

If you don’t understand all of the different things in the diagram it is completely okay. The point is to show that a single node process (the hexagon in the middle) can act as the broker between all of the different I/O endpoints (orange and purple represent I/O).

Usually building these kinds of systems is either:

  • difficult to code but yields super fast results (like writing your web servers from scratch in C)
  • easy to code but not very speedy/robust (like when someone tries to upload a 5GB file and your server crashes)

Node’s goal is to strike a balance between these two: relatively easy to understand and use and fast enough for most use cases.

Node isn’t either of the following:

  • A web framework (like Rails or Django, though it can be used to make such things)
  • A programming language (it uses JavaScript but node isn’t its own language)

Instead, node is somewhere in the middle. It is:

  • Designed to be simple and therefore relatively easy to understand and use
  • Useful for I/O based programs that need to be fast and/or handle lots of connections

At a lower level, node can be described as a tool for writing two major types of programs:

  • Network programs using the protocols of the web: HTTP, TCP, UDP, DNS and SSL
  • Programs that read and write data to the filesystem or local processes/memory

What is an “I/O based program”? Here are some common I/O sources:

  • Databases (e.g. MySQL, PostgreSQL, MongoDB, Redis, CouchDB)
  • APIs (e.g. Twitter, Facebook, Apple Push Notifications)
  • HTTP/WebSocket connections (from users of a web app)
  • Files (image resizer, video editor, internet radio)

Node does I/O in a way that is asynchronous which lets it handle lots of different things simultaneously. For example, if you go down to a fast food joint and order a cheeseburger they will immediately take your order and then make you wait around until the cheeseburger is ready. In the meantime they can take other orders and start cooking cheeseburgers for other people. Imagine if you had to wait at the register for your cheeseburger, blocking all other people in line from ordering while they cooked your burger! This is called blocking I/O because all I/O (cooking cheeseburgers) happens one at a time. Node, on the other hand, is non-blocking, which means it can cook many cheeseburgers at once.
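
To make the cheeseburger story concrete, here is a small sketch comparing the two styles using node's fs module (order.txt is a made-up file name; fs.readFileSync is the blocking variant node ships alongside fs.readFile):

var fs = require('fs')

// blocking: the whole process waits at the register until the read finishes
var contents = fs.readFileSync('order.txt')
console.log(contents.toString())

// non-blocking: node takes the order and keeps working, the callback
// runs whenever the read finishes
fs.readFile('order.txt', function (err, data) {
  if (err) return console.error(err)
  console.log(data.toString())
})

console.log('this logs before the non-blocking read is done')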

Node’s non-blocking nature is what makes lots of fun, highly concurrent programs easy to write.

Core modules

Firstly I would recommend that you get node installed on your computer. The easiest way is to visit nodejs.org and click Install.

Node has a small core group of modules (commonly referred to as ‘node core’) that are presented as the public API that you are intended to write programs with. For working with file systems there is the fs module and for networks there are modules like net (TCP), http, dgram (UDP).

In addition to fs and network modules there are a number of other base modules in node core. There is a module for asynchronously resolving DNS queries called dns, a module for getting OS specific information like the tmpdir location called os, a module for allocating binary chunks of memory called buffer, some modules for parsing urls and paths (url, querystring, path), etc. Most if not all of the modules in node core are there to support node’s main use case: writing fast programs that talk to file systems or networks.
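
As a quick taste of a few of these core modules (a sketch; the exact output depends on your operating system):

var os = require('os')
var path = require('path')
var url = require('url')

console.log(os.tmpdir()) // OS specific temp folder, e.g. /tmp
console.log(path.join('folder', 'subfolder', 'file.txt')) // folder/subfolder/file.txt
console.log(url.parse('http://nodejs.org/api/index.html').hostname) // nodejs.org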

Node handles I/O with: callbacks, events, streams and modules. If you learn how these four things work then you will be able to go into any module in node core and have a basic understanding about how to interface with it.

Callbacks

This is the most important topic to understand if you want to understand how to use node. Nearly everything in node uses callbacks. They weren’t invented by node, they are just part of the JavaScript language.

Callbacks are functions that are executed asynchronously, or at a later time. Instead of the code reading top to bottom procedurally, async programs may execute different functions at different times based on the order and speed that earlier functions like http requests or file system reads happen.

The difference can be confusing since determining if a function is asynchronous or not depends a lot on context. Here is a simple synchronous example, meaning you can read the code top to bottom just like a book:

var myNumber = 1
function addOne() { myNumber++ } // define the function
addOne() // run the function
console.log(myNumber) // logs out 2

The code here defines a function and then on the next line calls that function, without waiting for anything. When the function is called it immediately adds 1 to the number, so we can expect that after we call the function the number should be 2. This is the expectation of synchronous code - it sequentially runs top to bottom.

Node, however, uses mostly asynchronous code. Let’s use node to read our number from a file called number.txt:

var fs = require('fs') // require is a special function provided by node
var myNumber = undefined // we don't know what the number is yet since it is stored in a file

function addOne() {
  fs.readFile('number.txt', function doneReading(err, fileContents) {
    myNumber = parseInt(fileContents)
    myNumber++
  })
}

addOne()

console.log(myNumber) // logs out undefined -- this line gets run before readFile is done

Why do we get undefined when we log out the number this time? In this code we use the fs.readFile method, which happens to be an asynchronous method. Usually things that have to talk to hard drives or networks will be asynchronous. If they just have to access things in memory or do some work on the CPU they will be synchronous. The reason for this is that I/O is reallyyy reallyyy sloowwww. A ballpark figure would be that talking to a hard drive is about 100,000 times slower than talking to memory (e.g. RAM).

When we run this program all of the functions are immediately defined, but they don’t all execute immediately. This is a fundamental thing to understand about async programming. When addOne is called it kicks off a readFile and then moves on to the next thing that is ready to execute. If there is nothing to execute node will either wait for pending fs/network operations to finish or it will stop running and exit to the command line.

When readFile is done reading the file (this may take anywhere from milliseconds to seconds to minutes depending on how fast the hard drive is) it will run the doneReading function and give it an error (if there was an error) and the file contents.

The reason we got undefined above is that nothing in our code tells the console.log statement to wait until readFile finishes before printing out the number.

If you have some code that you want to be able to execute over and over again, or at a later time, the first step is to put that code inside a function. Then you can call the function whenever you want to run your code. It helps to give your functions descriptive names.

Callbacks are just functions that get executed at some later time. The key to understanding callbacks is to realize that they are used when you don’t know when some async operation will complete, but you do know where the operation will complete — the last line of the async function! The top-to-bottom order that you declare callbacks does not necessarily matter, only the logical/hierarchical nesting of them. First you split your code up into functions, and then use callbacks to declare if one function depends on another function finishing.

The fs.readFile method is provided by node, is asynchronous, and happens to take a long time to finish. Consider what it does: it has to go to the operating system, which in turn has to go to the file system, which lives on a hard drive that may or may not be spinning at thousands of revolutions per minute. Then it has to use a magnetic head to read data and send it back up through the layers back into your javascript program. You give readFile a function (known as a callback) that it will call after it has retrieved the data from the file system. It puts the data it retrieved into a javascript variable and calls your function (callback) with that variable. In this case the variable is called fileContents because it contains the contents of the file that was read.

Think of the restaurant example at the beginning of this tutorial. At many restaurants you get a number to put on your table while you wait for your food. These are a lot like callbacks. They tell the server what to do after your cheeseburger is done.

Let’s put our console.log statement into a function and pass it in as a callback:

var fs = require('fs')
var myNumber = undefined

function addOne(callback) {
  fs.readFile('number.txt', function doneReading(err, fileContents) {
    myNumber = parseInt(fileContents)
    myNumber++
    callback()
  })
}

function logMyNumber() {
  console.log(myNumber)
}

addOne(logMyNumber)

Now the logMyNumber function can get passed in as an argument that will become the callback variable inside the addOne function. After readFile is done the callback variable will be invoked (callback()). Only functions can be invoked, so if you pass in anything other than a function it will cause an error.

When a function gets invoked in javascript the code inside that function will immediately get executed. In this case our log statement will execute since callback is actually logMyNumber. Remember, just because you define a function it doesn’t mean it will execute. You have to invoke a function for that to happen.

To break down this example even more, here is a timeline of events that happen when we run this program:

  • 1: The code is parsed, which means if there are any syntax errors they would make the program break. During this initial phase, fs and myNumber are declared as variables while addOne and logMyNumber are declared as functions. Note that these are just declarations; neither function has been invoked yet.
  • 2: When the last line of our program gets executed addOne is invoked with the logMyNumber function passed as its callback argument. Invoking addOne will first run the asynchronous fs.readFile function. This part of the program takes a while to finish.
  • 3: With nothing to do, node idles for a bit as it waits for readFile to finish. If there was anything else to do during this time, node would be available for work.
  • 4: As soon as readFile finishes it executes its callback, doneReading, which parses fileContents for an integer called myNumber, increments myNumber and then immediately invokes the function that addOne passed in (its callback), logMyNumber.

Perhaps the most confusing part of programming with callbacks is how functions are just objects that can be stored in variables and passed around with different names. Giving simple and descriptive names to your variables is important in making your code readable by others. Generally speaking in node programs when you see a variable like callback or cb you can assume it is a function.

You may have heard the terms ‘evented programming’ or ‘event loop’. They refer to the way that readFile is implemented. Node first dispatches the readFile operation and then waits for readFile to send it an event that it has completed. While it is waiting node can go check on other things. Inside node there is a list of things that are dispatched but haven’t reported back yet, so node loops over the list again and again checking to see if they are finished. After they finish they get ‘processed’, e.g. any callbacks that depended on them finishing will get invoked.

Here is a pseudocode version of the above example:

function addOne(thenRunThisFunction) {
  waitAMinuteAsync(function waitedAMinute() {
    thenRunThisFunction()
  })
}

addOne(function thisGetsRunAfterAddOneFinishes() {})

Imagine you had 3 async functions a, b and c. Each one takes 1 minute to run and after it finishes it calls a callback (that gets passed in the first argument). If you wanted to tell node ‘start running a, then run b after a finishes, and then run c after b finishes’ it would look like this:

a(function() {
  b(function() {
    c()
  })
})

When this code gets executed, a will immediately start running, then a minute later it will finish and call b, then a minute later it will finish and call c and finally 3 minutes later node will stop running since there would be nothing more to do. There are definitely more elegant ways to write the above example, but the point is that if you have code that has to wait for some other async code to finish then you express that dependency by putting your code in functions that get passed around as callbacks.
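
One of those more elegant ways is simply to name the callbacks and declare the chain flat instead of nesting it; this sketch reuses the hypothetical a, b and c and expresses the same dependencies:

// same 'a, then b, then c' dependency chain, without the nesting
function afterA() { b(afterB) }
function afterB() { c() }

a(afterA)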

The design of node requires you to think non-linearly. Consider this list of operations:

read a file
process that file

If you were to turn this into pseudocode you would end up with this:

var file = readFile()
processFile(file)

This kind of linear (step-by-step, in order) code isn’t the way that node works. If this code were to get executed then readFile and processFile would both get executed at the same exact time. This doesn’t make sense since readFile will take a while to complete. Instead you need to express that processFile depends on readFile finishing. This is exactly what callbacks are for! And because of the way that JavaScript works you can write this dependency many different ways:

var fs = require('fs')
fs.readFile('movie.mp4', finishedReading)

function finishedReading(error, movieData) {
  if (error) return console.error(error)
  // do something with the movieData
}

But you could also structure your code like this and it would still work:

var fs = require('fs')

function finishedReading(error, movieData) {
  if (error) return console.error(error)
  // do something with the movieData
}

fs.readFile('movie.mp4', finishedReading)

Or even like this:

var fs = require('fs')

fs.readFile('movie.mp4', function finishedReading(error, movieData) {
  if (error) return console.error(error)
  // do something with the movieData
})

Events

In node if you require the events module you can use the so-called ‘event emitter’ that node itself uses for all of its APIs that emit things.

Events are a common pattern in programming, known more widely as the ‘observer pattern’ or ‘pub/sub’ (publish/subscribe). Whereas callbacks are a one-to-one relationship between the thing waiting for the callback and the thing calling the callback, events are the same exact pattern except with a many-to-many API.

The easiest way to think about events is that they let you subscribe to things. You can say ‘when X do Y’, whereas with plain callbacks it is ‘do X then Y’.
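
Here is a minimal sketch of ‘when X do Y’ using node's built-in events module (the oven and the 'pizza' event are made up for illustration):

var EventEmitter = require('events').EventEmitter

var oven = new EventEmitter()

// 'when X do Y': every time a 'pizza' event is emitted, run this function
oven.on('pizza', function (topping) {
  console.log('pizza is ready: ' + topping)
})

oven.emit('pizza', 'cheese') // logs 'pizza is ready: cheese'
oven.emit('pizza', 'mushroom') // logs 'pizza is ready: mushroom'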

Here are a few common use cases for using events instead of plain callbacks:

  • Chat room where you want to broadcast messages to many listeners
  • Game server that needs to know when new players connect, disconnect, move, shoot and jump
  • Game engine where you want to let game developers subscribe to events like .on('jump', function() {})
  • A low level web server that wants to expose an API to easily hook into events that happen like .on('incomingRequest') or .on('serverError')

If we were trying to write a module that connects to a chat server using only callbacks it would look like this:

var chatClient = require('my-chat-client')

function onConnect() {
  // have the UI show we are connected
}

function onConnectionError(error) {
  // show error to the user
}

function onDisconnect() {
  // tell user that they have been disconnected
}

function onMessage(message) {
  // show the chat room message in the UI
}

chatClient.connect(
  'http://mychatserver.com',
  onConnect,
  onConnectionError,
  onDisconnect,
  onMessage
)

As you can see this is really cumbersome because of all of the functions that you have to pass in a specific order to the .connect function. Writing this with events would look like this:

var chatClient = require('my-chat-client').connect()

chatClient.on('connect', function() {
  // have the UI show we are connected
})

chatClient.on('connectionError', function() {
  // show error to the user
})

chatClient.on('disconnect', function() {
  // tell user that they have been disconnected
})

chatClient.on('message', function() {
  // show the chat room message in the UI
})

This approach is similar to the pure-callback approach but introduces the .on method, which subscribes a callback to an event. This means you can choose which events you want to subscribe to from the chatClient. You can also subscribe to the same event multiple times with different callbacks:

var chatClient = require('my-chat-client').connect()
chatClient.on('message', logMessage)
chatClient.on('message', storeMessage)

function logMessage(message) {
  console.log(message)
}

function storeMessage(message) {
  myDatabase.save(message)
}

Streams

Early on in the node project the file system and network APIs had their own separate patterns for dealing with streaming I/O. For example, files in a file system have things called ‘file descriptors’ so the fs module had to have extra logic to keep track of these things whereas the network modules didn’t have such a concept. Despite minor differences in semantics like these, at a fundamental level both groups of code were duplicating a lot of functionality when it came to reading data in and out. The team working on node realized that it would be confusing to have to learn two sets of semantics to essentially do the same thing so they made a new API called the Stream and made all the network and file system code use it.

The whole point of node is to make it easy to deal with file systems and networks so it made sense to have one pattern that was used everywhere. The good news is that most of the patterns like these (there are only a few anyway) have been figured out at this point and it is very unlikely that node will change that much in the future.
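
To give a flavor of that single pattern: copying a file is just piping a readable stream into a writable one, and the same .pipe call works on network sockets and HTTP responses too (the file names here are placeholders):

var fs = require('fs')

// streams move data in small chunks, so this works even on huge files
fs.createReadStream('original.mp4').pipe(fs.createWriteStream('copy.mp4'))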

There are already two great resources that you can use to learn about node streams. One is the stream-adventure (see the Learn Node Interactively section) and the other is a reference called the Stream Handbook.

Stream Handbook

stream-handbook is a guide, similar to this one, that contains a reference for everything you could possibly need to know about streams.

Modules

Node core is made up of about two dozen modules, some lower level ones like events and stream, and some higher level ones like http and crypto.

This design is intentional. Node core is supposed to be small, and the modules in core should be focused on providing tools for working with common I/O protocols and formats in a way that is cross-platform.

For everything else there is npm. Anyone can create a new node module that adds some functionality and publish it to npm. As of the time of this writing there are 34,000 modules on npm.

How to find a module

Imagine you are trying to convert PDF files into TXT files. The best place to start is by doing npm search pdf:

[image: npm search pdf results]

There are a ton of results! npm is quite popular and you will usually be able to find multiple potential solutions. If you go through each module and whittle down the results into a more narrow set (filtering out things like PDF generation modules) you’ll end up with a handful of candidates.

A lot of the modules have overlapping functionality but present alternate APIs and most of them require external dependencies (like apt-get install poppler).

Here are some different ways to interpret the modules:

  • pdf2json is the only one that is written in pure JavaScript, which means it is the easiest to install, especially on low power devices like the Raspberry Pi or on Windows, where native code might not be cross platform.
  • modules like mimeograph, hummus and pdf-extract each combine multiple lower level modules to expose a high level API
  • a lot of modules seem to sit on top of the pdftotext/poppler unix command line tools

Let’s compare pdftotextjs and pdf-text-extract, both of which are wrappers around the pdftotext utility.

[image: pdf-modules]

Both of these:

  • were updated relatively recently
  • have github repositories linked (this is very important!)
  • have READMEs
  • have at least some number of people installing them every week
  • are liberally licensed (anyone can use them)

Just looking at the package.json + module statistics it’s hard to get a feeling about which one might be the right choice. Let’s compare the READMEs:

[image: pdf-readmes]

Both have simple descriptions, CI badges, installation instructions, clear examples and instructions for running the tests. Great! But which one do we use? Let’s compare the code:

[image: pdf-code]

pdftotextjs is around 110 lines of code, and pdf-text-extract is around 40, but both essentially boil down to this line:

var child = shell.exec('pdftotext ' + self.options.additional.join(' '));

Does this make one any better than the other? Hard to say! It’s important to actually read the code and make your own conclusions. If you find a module you like, use npm star modulename to give npm feedback about modules that you had a positive experience with.

Modular development workflow

npm is different from most package managers in that it installs modules into a folder inside of other existing modules. The previous sentence might not make sense right now but it is the key to npm’s success.

Many package managers install things globally. For instance, if you apt-get install couchdb on Debian Linux it will try to install the latest stable version of CouchDB. If you are trying to install CouchDB as a dependency of some other piece of software and that software needs an older version of CouchDB, you have to uninstall the newer version of CouchDB and then install the older version. You can’t have two versions of CouchDB installed because Debian only knows how to install things into one place.

It’s not just Debian that does this. Most programming language package managers work this way too. To address the global dependency problem described above, virtual environments have been developed, like virtualenv for Python or bundler for Ruby. These split your environment up into many virtual environments, one for each project, but inside each environment dependencies are still globally installed. Virtual environments don’t always solve the problem; sometimes they just multiply it by adding additional layers of complexity.

With npm installing global modules is an anti-pattern. Just like how you shouldn’t use global variables in your JavaScript programs you also shouldn’t install global modules (unless you need a module with an executable binary to show up in your global PATH, but you don’t always need to do this – more on this later).

How require works

When you call require('some_module') in node here is what happens:

  1. if a file called some_module.js exists in the current folder node will load that, otherwise:
  2. node looks in the current folder for a node_modules folder with a some_module folder in it
  3. if it doesn’t find it, it will go up one folder and repeat step 2

This cycle repeats until node reaches the root folder of the filesystem, at which point it checks any global module folders (e.g. /usr/local/node_modules on Mac OS) and, if it still doesn’t find some_module, throws an exception.
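
Here is a toy sketch of that climb in plain JavaScript. The real resolver also handles core modules, file extensions and the package.json main field, so treat this as illustration only:

var fs = require('fs')
var path = require('path')

// walk up the directory tree looking for node_modules/some_module
function findModule(startDir, name) {
  var dir = startDir
  while (true) {
    var candidate = path.join(dir, 'node_modules', name)
    if (fs.existsSync(candidate)) return candidate
    var parent = path.dirname(dir)
    if (parent === dir) return null // hit the filesystem root, give up
    dir = parent
  }
}

console.log(findModule(process.cwd(), 'foo'))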

Here’s a visual example:

[image: mod-diagram-01]

When the current working directory is subsubfolder and require('foo') is called, node will look for the folder called subsubfolder/node_modules. In this case it won’t find it – the folder there is mistakenly called my_modules. Then node will go up one folder and try again, meaning it then looks for subfolder_B/node_modules, which also doesn’t exist. Third try is a charm, though, as folder/node_modules does exist and has a folder called foo inside of it. If foo wasn’t in there node would continue its search up the directory tree.

Note that if called from subfolder_B node will never find subfolder_A/node_modules, it can only see folder/node_modules on its way up the tree.

One of the benefits of npm’s approach is that modules can install their dependent modules at specific known working versions. In this case the module foo is quite popular - there are three copies of it, each one inside its parent module folder. The reasoning for this could be that each parent module needed a different version of foo, e.g. ‘folder’ needs foo@0.0.1, subfolder_A needs foo@0.2.1 etc.

Here’s what happens when we fix the folder naming error by changing my_modules to the correct name node_modules:

[image: mod-diagram-02]

To test out which module actually gets loaded by node, you can use the require.resolve('some_module') function, which will show you the path to the module that node finds as a result of the tree climbing process. require.resolve can be useful when double-checking that the module that you think is getting loaded is actually getting loaded – sometimes there is another version of the same module closer to your current working directory than the one you intend to load.
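
For example (assuming some module foo is installed somewhere up the tree):

// prints something like /Users/you/project/node_modules/foo/index.js
console.log(require.resolve('foo'))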

How to write a module

Now that you know how to find modules and require them you can start writing your own modules.

The simplest possible module

Node modules are radically lightweight. Here is one of the simplest possible node modules:

package.json:

{
  "name": "number-one",
  "version": "1.0.0"
}

index.js:

module.exports = 1

By default node tries to load module/index.js when you require('module'); any other file name won’t work unless you set the main field of package.json to point to it.

Put both of those files in a folder called number-one (the name in package.json must match the folder name) and you’ll have a working node module.

Calling the function require('number-one') returns the value of whatever module.exports is set to inside the module:

[image: simple-module]
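
In code, that round trip looks like this (assuming number-one sits in a reachable node_modules folder):

var one = require('number-one')
console.log(one) // logs out 1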

An even quicker way to create a module is to run these commands:

mkdir my_module
cd my_module
git init
git remote add origin git@github.com:yourusername/my_module.git
npm init

Running npm init will create a valid package.json for you and, if you run it in an existing git repo, it will set the repository field inside package.json automatically as well!

Adding dependencies

A module can list any other modules from npm or GitHub in the dependencies field of package.json. To install the request module as a new dependency and automatically add it to package.json run this from your module root directory:

npm install --save request

This installs a copy of request into the closest node_modules folder and makes our package.json look something like this:

{
  "id": "number-one",
  "version": "1.0.0",
  "dependencies": {
    "request": "~2.22.0"
  }
}

By default npm install will grab the latest published version of a module. The ~2.22.0 above is a semver range meaning any 2.22.x patch release will satisfy the dependency, but 2.23.0 and later won’t.

Client side development with npm

A common misconception about npm is that since it has ‘Node’ in the name it must only be used for server side JS modules. This is completely untrue! npm actually stands for Node Packaged Modules, i.e. modules that Node packages together for you. The modules themselves can be whatever you want – they are just a folder of files wrapped up in a .tar.gz, and a file called package.json that declares the module version and a list of all modules that are dependencies of the module (as well as their version numbers so the working versions get installed automatically). It’s turtles all the way down - module dependencies are just modules, and those modules can have dependencies etc. etc. etc.

browserify is a utility written in Node that tries to convert any node module into code that can be run in browsers. Not all modules work (browsers can’t do things like host an HTTP server), but a lot of modules on npm will work.

To try out npm in the browser you can use RequireBin, an app I made that takes advantage of Browserify-CDN, which internally uses browserify but returns the output through HTTP (instead of the command line – which is how browserify is usually used).

Try putting this code into RequireBin and then hit the preview button:

var reverse = require('ascii-art-reverse')

// makes a visible HTML console
require('console-log').show(true)

var coolbear =
  "    ('-^-/')  \n" +
  "    `o__o' ]  \n" +
  "    (_Y_) _/  \n" +
  "  _..`--'-.`, \n" +
  " (__)_,--(__) \n" +
  "     7:   ; 1 \n" +
  "   _/,`-.-' : \n" +
  "  (_,)-~~(_,) \n"

setInterval(function() { console.log(coolbear) }, 1000)

setTimeout(function() {
  setInterval(function() { console.log(reverse(coolbear)) }, 1000)
}, 500)

Or check out a more complicated example (feel free to change the code and see what happens):

[image: requirebin]

Going with the grain

Like any good tool, node is best suited for a certain set of use cases. For example: Rails, the popular web framework, is great for modeling complex business logic, e.g. using code to represent real life business objects like accounts, loans, itineraries, and inventories. While it is technically possible to do the same type of thing using node, there would be definite drawbacks since node is designed for solving I/O problems and it doesn’t know much about ‘business logic’. Each tool focuses on different problems. Hopefully this guide will help you gain an intuitive understanding of the strengths of node so that you know when it can be useful to you.

What is outside of node’s scope?

Fundamentally node is just a tool used for managing I/O across file systems and networks, and it leaves other more fancy functionality up to third party modules. Here are some things that are outside the scope of node:

Web frameworks

There are a number of web frameworks built on top of node (framework meaning a bundle of solutions that attempts to address some high level problem like modeling business logic), but node is not a web framework. Web frameworks that are written using node don’t always make the same kind of decisions about adding complexity, abstractions and tradeoffs that node does and may have other priorities.

Language syntax

Node uses JavaScript and doesn’t change anything about it. Felix Geisendörfer has a pretty good write-up of the ‘node style’ here.

Language abstraction

When possible node will use the simplest possible way of accomplishing something. The ‘fancier’ you make your JavaScript the more complexity and tradeoffs you introduce. Programming is hard, especially in JS where there are 1000 solutions to every problem! It is for this reason that node tries to always pick the simplest, most universal option. If you are solving a problem that calls for a complex solution and you are unsatisfied with the ‘vanilla JS solutions’ that node implements, you are free to solve it inside your app or module using whichever abstractions you prefer.

A great example of this is node’s use of callbacks. Early on node experimented with a feature called ‘promises’ that added a number of features to make async code appear more linear. It was taken out of node core for a few reasons:

  • they are more complex than callbacks
  • they can be implemented in userland (distributed on npm as third party modules)

Consider one of the most universal and basic things that node does: reading a file. When you read a file you want to know when errors happen, like when your hard drive dies in the middle of your read. If node had promises everyone would have to branch their code like this:

fs.readFile('movie.mp4')
  .then(function(data) {
    // do stuff with data
  })
  .error(function(error) {
    // handle error
  })

This adds complexity, and not everyone wants that. Instead of two separate functions node just uses a single callback function. Here are the rules:

  • When there is no error pass null as the first argument
  • When there is an error, pass it as the first argument
  • The rest of the arguments can be used for anything (usually data or responses since most stuff in node is reading or writing things)

Hence, the node callback style:

fs.readFile('movie.mp4', function(err, data) {
  // handle error, do stuff with data
})

Threads/fibers/non-event-based concurrency solutions

Note: If you don’t know what these things mean then you will likely have an easier time learning node, since unlearning things is just as much work as learning things.

Node uses threads internally to make things fast but doesn’t expose them to the user. If you are a technical user wondering why node is designed this way then you should 100% read about the design of libuv, the C I/O layer that node is built on top of.

License

[image: CC BY license badge]

Creative Commons Attribution License (do whatever, just attribute me)
http://creativecommons.org/licenses/by/2.0/

Real Programmers Don’t Use PASCAL by Ed Post, Copyright (c) 1982

Back in the good old days – the “Golden Era” of computers, it was easy to separate the men from the boys (sometimes called “Real Men” and “Quiche Eaters” in the literature). During this period, the Real Men were the ones that understood computer programming, and the Quiche Eaters were the ones that didn’t. A real computer programmer said things like “DO 10 I=1,10” and “ABEND” (they actually talked in capital letters, you understand), and the rest of the world said things like “computers are too complicated for me” and “I can’t relate to computers – they’re so impersonal”. (A previous work [1] points out that Real Men don’t “relate” to anything, and aren’t afraid of being impersonal.)

But, as usual, times change. We are faced today with a world in which little old ladies can get computerized microwave ovens, 12 year old kids can blow Real Men out of the water playing Asteroids and Pac-Man, and anyone can buy and even understand their very own Personal Computer. The Real Programmer is in danger of becoming extinct, of being replaced by high-school students with TRASH-80s!

There is a clear need to point out the differences between the typical high-school junior Pac-Man player and a Real Programmer. Understanding these differences will give these kids something to aspire to – a role model, a Father Figure. It will also help employers of Real Programmers to realize why it would be a mistake to replace the Real Programmers on their staff with 12 year old Pac-Man players (at a considerable salary savings).

LANGUAGES

The easiest way to tell a Real Programmer from the crowd is by the programming language he (or she) uses. Real Programmers use FORTRAN. Quiche Eaters use PASCAL. Niklaus Wirth, the designer of PASCAL, was once asked, “How do you pronounce your name?”. He replied “You can either call me by name, pronouncing it ‘Veert’, or call me by value, ‘Worth’.” One can tell immediately from this comment that Niklaus Wirth is a Quiche Eater. The only parameter passing mechanism endorsed by Real Programmers is call-by-value-return, as implemented in the IBM/370 FORTRAN G and H compilers. Real Programmers don’t need abstract concepts to get their jobs done: they are perfectly happy with a keypunch, a FORTRAN IV compiler, and a beer.

  • Real Programmers do List Processing in FORTRAN.
  • Real Programmers do String Manipulation in FORTRAN.
  • Real Programmers do Accounting (if they do it at all) in FORTRAN.
  • Real Programmers do Artificial Intelligence programs in FORTRAN.

If you can’t do it in FORTRAN, do it in assembly language. If you can’t do it in assembly language, it isn’t worth doing.

STRUCTURED PROGRAMMING

Computer science academicians have gotten into the “structured programming” rut over the past several years. They claim that programs are more easily understood if the programmer uses some special language constructs and techniques. They don’t all agree on exactly which constructs, of course, and the examples they use to show their particular point of view invariably fit on a single page of some obscure journal or another – clearly not enough of an example to convince anyone. When I got out of school, I thought I was the best programmer in the world. I could write an unbeatable tic-tac-toe program, use five different computer languages, and create 1000 line programs that WORKED. (Really!) Then I got out into the Real World. My first task in the Real World was to read and understand a 200,000 line FORTRAN program, then speed it up by a factor of two. Any Real Programmer will tell you that all the Structured Coding in the world won’t help you solve a problem like that – it takes actual talent. Some quick observations on Real Programmers and Structured Programming:

  • Real Programmers aren’t afraid to use GOTOs.
  • Real Programmers can write five page long DO loops without getting confused.
  • Real Programmers enjoy Arithmetic IF statements because they make the code more interesting.
  • Real Programmers write self-modifying code, especially if it saves them 20 nanoseconds in the middle of a tight loop.
  • Real Programmers don’t need comments: the code is obvious.
  • Since FORTRAN doesn’t have a structured IF, REPEAT … UNTIL, or CASE statement, Real Programmers don’t have to worry about not using them. Besides, they can be simulated when necessary using assigned GOTOs.

Data structures have also gotten a lot of press lately. Abstract Data Types, Structures, Pointers, Lists, and Strings have become popular in certain circles. Wirth (the above-mentioned Quiche Eater) actually wrote an entire book [2] contending that you could write a program based on data structures, instead of the other way around. As all Real Programmers know, the only useful data structure is the array. Strings, lists, structures, sets – these are all special cases of arrays and can be treated that way just as easily without messing up your programming language with all sorts of complications. The worst thing about fancy data types is that you have to declare them, and Real Programming Languages, as we all know, have implicit typing based on the first letter of the (six character) variable name.

OPERATING SYSTEMS

What kind of operating system is used by a Real Programmer? CP/M? God forbid – CP/M, after all, is basically a toy operating system. Even little old ladies and grade school students can understand and use CP/M.

Unix is a lot more complicated of course – the typical Unix hacker never can remember what the PRINT command is called this week – but when it gets right down to it, Unix is a glorified video game. People don’t do Serious Work on Unix systems: they send jokes around the world on USENET and write adventure games and research papers.

No, your Real Programmer uses OS/370. A good programmer can find and understand the description of the IJK305I error he just got in his JCL manual. A great programmer can write JCL without referring to the manual at all. A truly outstanding programmer can find bugs buried in a 6 megabyte core dump without using a hex calculator. (I have actually seen this done.)

OS/370 is a truly remarkable operating system. It’s possible to destroy days of work with a single misplaced space, so alertness in the programming staff is encouraged. The best way to approach the system is through a keypunch. Some people claim there is a Time Sharing system that runs on OS/370, but after careful study I have come to the conclusion that they are mistaken.

PROGRAMMING TOOLS

What kind of tools does a Real Programmer use? In theory, a Real Programmer could run his programs by keying them into the front panel of the computer. Back in the days when computers had front panels, this was actually done occasionally. Your typical Real Programmer knew the entire bootstrap loader by memory in hex, and toggled it in whenever it got destroyed by his program. (Back then, memory was memory – it didn’t go away when the power went off. Today, memory either forgets things when you don’t want it to, or remembers things long after they’re better forgotten.) Legend has it that Seymour Cray, inventor of the Cray I supercomputer and most of Control Data’s computers, actually toggled the first operating system for the CDC7600 in on the front panel from memory when it was first powered on. Seymour, needless to say, is a Real Programmer.

One of my favorite Real Programmers was a systems programmer for Texas Instruments. One day, he got a long distance call from a user whose system had crashed in the middle of some important work. Jim was able to repair the damage over the phone, getting the user to toggle in disk I/O instructions at the front panel, repairing system tables in hex, reading register contents back over the phone. The moral of this story: while a Real Programmer usually includes a keypunch and lineprinter in his toolkit, he can get along with just a front panel and a telephone in emergencies.

In some companies, text editing no longer consists of ten engineers standing in line to use an 029 keypunch. In fact, the building I work in doesn’t contain a single keypunch. The Real Programmer in this situation has to do his work with a text editor program. Most systems supply several text editors to select from, and the Real Programmer must be careful to pick one that reflects his personal style. Many people believe that the best text editors in the world were written at Xerox Palo Alto Research Center for use on their Alto and Dorado computers [3]. Unfortunately, no Real Programmer would ever use a computer whose operating system is called SmallTalk, and would certainly not talk to the computer with a mouse.

Some of the concepts in these Xerox editors have been incorporated into editors running on more reasonably named operating systems. EMACS and VI are probably the most well known of this class of editors. The problem with these editors is that Real Programmers consider “what you see is what you get” to be just as bad a concept in text editors as it is in women. No, the Real Programmer wants a “you asked for it, you got it” text editor – complicated, cryptic, powerful, unforgiving, dangerous. TECO, to be precise.

It has been observed that a TECO command sequence more closely resembles transmission line noise than readable text [4]. One of the more entertaining games to play with TECO is to type your name in as a command line and try to guess what it does. Just about any possible typing error while talking with TECO will probably destroy your program, or even worse – introduce subtle and mysterious bugs in a once working subroutine.

For this reason, Real Programmers are reluctant to actually edit a program that is close to working. They find it much easier to just patch the binary object code directly, using a wonderful program called SUPERZAP (or its equivalent on non-IBM machines). This works so well that many working programs on IBM systems bear no relation to the original FORTRAN code. In many cases, the original source code is no longer available. When it comes time to fix a program like this, no manager would even think of sending anything less than a Real Programmer to do the job – no Quiche Eating structured programmer would even know where to start. This is called “job security”.

Some programming tools NOT used by Real Programmers:

  • FORTRAN preprocessors like MORTRAN and RATFOR. The Cuisinarts of programming – great for making Quiche. See comments above on structured programming.
  • Source language debuggers. Real Programmers can read core dumps.
  • Compilers with array bounds checking. They stifle creativity, destroy most of the interesting uses for EQUIVALENCE, and make it impossible to modify the operating system code with negative subscripts. Worst of all, bounds checking is inefficient.
  • Source code maintenance systems. A Real Programmer keeps his code locked up in a card file, because it implies that its owner cannot leave his important programs unguarded [5].

THE REAL PROGRAMMER AT WORK

Where does the typical Real Programmer work? What kind of programs are worthy of the efforts of so talented an individual? You can be sure that no Real Programmer would be caught dead writing accounts-receivable programs in COBOL, or sorting mailing lists for People magazine. A Real Programmer wants tasks of earth-shaking importance (literally!):

  • Real Programmers work for Los Alamos National Laboratory, writing atomic bomb simulations to run on Cray I supercomputers.
  • Real Programmers work for the National Security Agency, decoding Russian transmissions.
  • It was largely due to the efforts of thousands of Real Programmers working for NASA that our boys got to the moon and back before the cosmonauts.
  • The computers in the Space Shuttle were programmed by Real Programmers.
  • Real Programmers are at work for Boeing designing the operating systems for cruise missiles.

Some of the most awesome Real Programmers of all work at the Jet Propulsion Laboratory in California. Many of them know the entire operating system of the Pioneer and Voyager spacecraft by heart. With a combination of large ground-based FORTRAN programs and small spacecraft-based assembly language programs, they can do incredible feats of navigation and improvisation, such as hitting ten-kilometer wide windows at Saturn after six years in space, and repairing or bypassing damaged sensor platforms, radios, and batteries. Allegedly, one Real Programmer managed to tuck a pattern-matching program into a few hundred bytes of unused memory in a Voyager spacecraft that searched for, located, and photographed a new moon of Jupiter.

One plan for the upcoming Galileo spacecraft mission is to use a gravity assist trajectory past Mars on the way to Jupiter. This trajectory passes within 80 +/- 3 kilometers of the surface of Mars. Nobody is going to trust a PASCAL program (or PASCAL programmer) for navigation to these tolerances.

As you can tell, many of the world’s Real Programmers work for the U.S. Government, mainly the Defense Department. This is as it should be. Recently, however, a black cloud has formed on the Real Programmer horizon.

It seems that some highly placed Quiche Eaters at the Defense Department decided that all Defense programs should be written in some grand unified language called “ADA” (registered trademark, DoD). For a while, it seemed that ADA was destined to become a language that went against all the precepts of Real Programming – a language with structure, a language with data types, strong typing, and semicolons. In short, a language designed to cripple the creativity of the typical Real Programmer. Fortunately, the language adopted by DoD has enough interesting features to make it approachable: it’s incredibly complex, includes methods for messing with the operating system and rearranging memory, and Edsger Dijkstra doesn’t like it [6]. (Dijkstra, as I’m sure you know, was the author of “GoTos Considered Harmful” – a landmark work in programming methodology, applauded by Pascal Programmers and Quiche Eaters alike.) Besides, the determined Real Programmer can write FORTRAN programs in any language.

The Real Programmer might compromise his principles and work on something slightly more trivial than the destruction of life as we know it, providing there’s enough money in it. There are several Real Programmers building video games at Atari, for example. (But not playing them. A Real Programmer knows how to beat the machine every time: no challenge in that.) Everyone working at LucasFilm is a Real Programmer. (It would be crazy to turn down the money of 50 million Star Wars fans.) The proportion of Real Programmers in Computer Graphics is somewhat lower than the norm, mostly because nobody has found a use for Computer Graphics yet. On the other hand, all Computer Graphics is done in FORTRAN, so there are a fair number of people doing Graphics in order to avoid having to write COBOL programs.

THE REAL PROGRAMMER AT PLAY

Generally, the Real Programmer plays the same way he works – with computers. He is constantly amazed that his employer actually pays him to do what he would be doing for fun anyway, although he is careful not to express this opinion out loud. Occasionally, the Real Programmer does step out of the office for a breath of fresh air and a beer or two. Some tips on recognizing Real Programmers away from the computer room:

  • At a party, the Real Programmers are the ones in the corner talking about operating system security and how to get around it.
  • At a football game, the Real Programmer is the one comparing the plays against his simulations printed on 11 by 14 fanfold paper.
  • At the beach, the Real Programmer is the one drawing flowcharts in the sand.
  • A Real Programmer goes to a disco to watch the light show.
  • At a funeral, the Real Programmer is the one saying “Poor George. And he almost had the sort routine working before the coronary.”
  • In a grocery store, the Real Programmer is the one who insists on running the cans past the laser checkout scanner himself, because he never could trust keypunch operators to get it right the first time.

THE REAL PROGRAMMER’S NATURAL HABITAT

What sort of environment does the Real Programmer function best in? This is an important question for the managers of Real Programmers. Considering the amount of money it costs to keep one on the staff, it’s best to put him (or her) in an environment where he can get his work done.

The typical Real Programmer lives in front of a computer terminal. Surrounding this terminal are:

  • Listings of all programs the Real Programmer has ever worked on, piled in roughly chronological order on every flat surface in the office.
  • Some half-dozen or so partly filled cups of cold coffee. Occasionally, there will be cigarette butts floating in the coffee. In some cases, the cups will contain Orange Crush.
  • Unless he is very good, there will be copies of the OS JCL manual and the Principles of Operation open to some particularly interesting pages.
  • Taped to the wall is a line-printer Snoopy calendar for the year 1969.
  • Strewn about the floor are several wrappers for peanut butter filled cheese bars (the type that are made stale at the bakery so they can’t get any worse while waiting in the vending machine).
  • Hiding in the top left-hand drawer of the desk is a stash of double stuff Oreos for special occasions.
  • Underneath the Oreos is a flow-charting template, left there by the previous occupant of the office. (Real Programmers write programs, not documentation. Leave that to the maintenance people.)

The Real Programmer is capable of working 30, 40, even 50 hours at a stretch, under intense pressure. In fact, he prefers it that way. Bad response time doesn’t bother the Real Programmer – it gives him a chance to catch a little sleep between compiles. If there is not enough schedule pressure on the Real Programmer, he tends to make things more challenging by working on some small but interesting part of the problem for the first nine weeks, then finishing the rest in the last week, in two or three 50-hour marathons. This not only impresses his manager, who was despairing of ever getting the project done on time, but creates a convenient excuse for not doing the documentation. In general:

  • No Real Programmer works 9 to 5. (Unless it’s 9 in the evening to 5 in the morning.)
  • Real Programmers don’t wear neckties.
  • Real Programmers don’t wear high heeled shoes.
  • Real Programmers arrive at work in time for lunch. [9]
  • A Real Programmer might or might not know his wife’s name. He does, however, know the entire ASCII (or EBCDIC) code table.
  • Real Programmers don’t know how to cook. Grocery stores aren’t often open at 3 a.m., so they survive on Twinkies and coffee.

THE FUTURE

What of the future? It is a matter of some concern to Real Programmers that the latest generation of computer programmers are not being brought up with the same outlook on life as their elders. Many of them have never seen a computer with a front panel. Hardly anyone graduating from school these days can do hex arithmetic without a calculator. College graduates these days are soft – protected from the realities of programming by source level debuggers, text editors that count parentheses, and user friendly operating systems. Worst of all, some of these alleged computer scientists manage to get degrees without ever learning FORTRAN! Are we destined to become an industry of Unix hackers and Pascal programmers?

On the contrary. From my experience, I can only report that the future is bright for Real Programmers everywhere. Neither OS/370 nor FORTRAN show any signs of dying out, despite all the efforts of Pascal programmers the world over. Even more subtle tricks, like adding structured coding constructs to FORTRAN have failed. Oh sure, some computer vendors have come out with FORTRAN 77 compilers, but every one of them has a way of converting itself back into a FORTRAN 66 compiler at the drop of an option card – to compile DO loops like God meant them to be.

Even Unix might not be as bad on Real Programmers as it once was. The latest release of Unix has the potential of an operating system worthy of any Real Programmer. It has two different and subtly incompatible user interfaces, an arcane and complicated terminal driver, virtual memory. If you ignore the fact that it’s structured, even C programming can be appreciated by the Real Programmer: after all, there’s no type checking, variable names are seven (ten? eight?) characters long, and the added bonus of the Pointer data type is thrown in. It’s like having the best parts of FORTRAN and assembly language in one place. (Not to mention some of the more creative uses for #define.)

No, the future isn’t all that bad. Why, in the past few years, the popular press has even commented on the bright new crop of computer nerds and hackers ([7] and [8]) leaving places like Stanford and M.I.T. for the Real World. From all evidence, the spirit of Real Programming lives on in these young men and women. As long as there are ill-defined goals, bizarre bugs, and unrealistic schedules, there will be Real Programmers willing to jump in and Solve The Problem, saving the documentation for later. Long live FORTRAN!

ACKNOWLEDGEMENT

I would like to thank Jan E., Dave S., Rich G., Rich E. for their help in characterizing the Real Programmer, Heather B. for the illustration, Kathy E. for putting up with it, and atd!avsdS:mark for the initial inspiration.

REFERENCES

  1. Feirstein, B., Real Men Don’t Eat Quiche, New York, Pocket Books, 1982.
  2. Wirth, N., Algorithms + Data Structures = Programs, Prentice Hall, 1976.
  3. Xerox PARC editors . . .
  4. Finseth, C., Theory and Practice of Text Editors - or - a Cookbook for an EMACS, B.S. Thesis, MIT/LCS/TM-165, Massachusetts Institute of Technology, May 1980.
  5. Weinberg, G., The Psychology of Computer Programming, New York, Van Nostrand Reinhold, 1971, page 110.
  6. Dijkstra, E., On the GREEN Language Submitted to the DoD, SIGPLAN Notices, Volume 3, Number 10, October 1978.
  7. Rose, Frank, Joy of Hacking, Science 82, Volume 3, Number 9, November 1982, pages 58 - 66.
  8. The Hacker Papers, Psychology Today, August 1980.
  9. Datamation, July, 1983, pp. 263-265.

Dorothy Counts

04 September, 1957

Dorothy Counts, the first and at the time only black student to enroll in the newly desegregated Harry Harding High School in Charlotte (NC), is mocked by protestors on her first day of school. Bystanders threw rocks and screamed at Dorothy to go back to where she came from.

The man walking beside her is probably Dr. Edwin Tompkins, a friend of the family and a professor at the black college Johnson C. Smith University. After a string of abuses, Dorothy’s family withdrew her from the school after only four days. Children had been enrolling for the new school year and tension was particularly high in the South for districts trying to comply with the US Supreme Court’s ruling that states should desegregate their schools with “all deliberate speed.”

World Press Photo

I cannot imagine what she must have felt.

“There was unutterable pride, tension and anguish in that girl’s face as she approached the halls of learning, with history jeering at her back,” he later said. “It made me furious. It filled me with both hatred and pity. And it made me ashamed. Some one of us should have been there with her.”

Michael Graff, quoting James Baldwin, “This picture signaled an end to segregation. Why has so little changed?”, The Guardian

Photos by Douglas Martin.

[photos: Dorothy Counts by Douglas Martin]