Generators and Tasks

Last updated September 15th, 2015

One of the best things coming to ES2015 (ES6) is generators. They are normally presented as a way of dealing with lazy or infinite sequences, but there is another use case which allows them to elegantly solve the "async" problem and eliminate the so-called pyramid of doom.

Generators in a Nutshell

There is a new type of function which can produce many values:

function* genValues() {
  yield 1;
  yield 2;
  yield 4;
  yield 8;
}

And can be consumed by a for..of statement:

for (let val of genValues()) {
  console.log(val);
}

This style of generator can be thought of as a function which produces many values over time. This is pretty handy, but not game-changing in any way. We could just as easily write functions which perform this task today in vanilla ES3.
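As a rough illustration of that last point, here is a minimal sketch of the same sequence written without generator syntax. The name makeValues and the { done, value } result shape are choices made for this example (they mirror what a generator instance returns), not something the language requires in ES3:

```javascript
// ES3-style stand-in for genValues(): a closure that hands out one value per call.
// makeValues is a hypothetical name used only for this sketch.
function makeValues() {
  var values = [1, 2, 4, 8];
  var index = 0;
  return {
    next: function () {
      if (index < values.length) {
        return { done: false, value: values[index++] };
      }
      return { done: true, value: undefined };
    }
  };
}

// Consume it by hand, since for..of is not available in ES3.
var seen = [];
var it = makeValues();
for (var r = it.next(); !r.done; r = it.next()) {
  seen.push(r.value);
}
console.log(seen.join(', ')); // 1, 2, 4, 8
```

The generator version is shorter, but nothing here is beyond what a closure can already do.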

The Pyramid of Doom

By design, JavaScript always has a single thread of execution: asynchronous handlers are explicitly passed to functions. This has the benefit of making the language very small, but has the consequence of making non-trivial chaining and accumulating of asynchronous actions unwieldy.

The pyramid usually appears when complex sequences of business logic need to be conditionally chained together:

function doComplexThing(alice, bob, callback) {
  getScore(alice, bob, function (err, score) {
    if (err) return callback(err);
    getNextAction(alice, function (err, nextAction) {
      if (err) return callback(err);
      getInfo(alice, bob, function (err, info) {
        if (err) return callback(err);
        function lastStep(err) {
          if (err) return callback(err);
          doSomeFinalThings(score, nextAction, info, callback);
        }
        if (score.isGood() && info.isNotGood()) {
          recordImportantCase(alice, bob, lastStep);
        } else {
          lastStep();
        }
      });
    });
  });
}

Note: the above example is attributed to OkCupid's release notes for TameJS. Its implementation appears to have been derived from concepts laid out in the 2007 paper "Events Can Make Sense" by Maxwell Krohn, Eddie Kohler, and M. Frans Kaashoek.

Promises, which are also coming in ES2015 (and which we can polyfill in ES3), help a bit with the deep nesting and error propagation. Unfortunately, to flatten this pyramid of doom with Promises, we must rely on mutation to carry contextual state between the actions.

To demonstrate, we can rewrite the above example with Promises to flatten the pyramid a bit:

function doComplexThing(alice, bob) {
  var context = {}; // this mutates a lot!
  return getScore(alice, bob)
    .then(function (score) {
      context.score = score;
      return getNextAction(alice);
    })
    .then(function (nextAction) {
      context.nextAction = nextAction;
      return getInfo(alice, bob);
    })
    .then(function (info) {
      context.info = info;
      // score is out of scope here, so it has to be read back from the context
      if (context.score.isGood() && info.isNotGood()) {
        return recordImportantCase(alice, bob);
      } else {
        return Promise.resolve();
      }
    })
    .then(function () {
      // return the final Promise so callers see both the result and any error
      return doSomeFinalThings(context.score, context.nextAction, context.info);
    });
}

This is better (it reads more linearly and error propagation Just Works™), but it's still cumbersome. Since each action has its own scope, the results of earlier actions need to be saved in a shared context for later actions to use. Mutation is one of the most difficult things to wrap our heads around while reading code, and this approach trades more mutation for a flatter structure with better error propagation.

But we can do better. And we can do better today without having to rely on the async/await of the unspecified future:

function* doComplexThing(alice, bob) {
  var score = yield getScore(alice, bob);
  var nextAction = yield getNextAction(alice);
  var info = yield getInfo(alice, bob);
  if (score.isGood() && info.isNotGood()) {
    yield recordImportantCase(alice, bob);
  }
  return doSomeFinalThings(score, nextAction, info);
}

Yes, the above function can be fully asynchronous and have equivalent behavior to the original example.

Tasks

ES2015 generators are pretty much congruent to Python generators.

First some history: back in 2001, Python generators were introduced, enabling simple generation of multiple values using the yield keyword. In 2005, they were enhanced to allow for bidirectional communication, giving us true coroutines.

Coroutines are a natural way of expressing many algorithms, such as simulations, games, asynchronous I/O, and other forms of event-driven programming or co-operative multitasking.

PEP 342 – Guido van Rossum, Phillip J. Eby

(Emphasis on asynchronous I/O added)

We can use generators to create Tasks: generators which are written in a sequential way, but are executed asynchronously.

In JavaScript (and Python), yield is an expression that gives control flow back to the caller. The caller can then "send" a value back, which becomes the evaluated result of the yield expression. This allows an external function to "pause" the generator and "resume it with a result":

function* simpleGenerator() {
  console.log('Starting generator...');
  var result = yield 'Marco!';
  console.log('Got', result);
  return 'All done!';
}
function runGenerator() {
  var iterator = simpleGenerator();
  var stepOne = iterator.next(); // logs "Starting generator..."
  console.log('stepOne', stepOne); // logs "stepOne { done: false, value: 'Marco!' }"
  var stepTwo = iterator.next('Polo!'); // logs "Got Polo!"
  console.log('stepTwo', stepTwo); // logs "stepTwo { done: true, value: 'All done!' }"
}
runGenerator();

While executing this, control flow passes back and forth between the runGenerator and simpleGenerator functions. The execution of the inner simpleGenerator function is paused, which allows the runGenerator function to respond to the yield 'Marco!' expression by receiving the yielded value ('Marco!') and send back the result ('Polo!') of the action.

Instead of passing strings around, we can yield a Promise (which represents either a successful value or an error value some time in the future). Then, the driver function can wait for the Promise to resolve and send the resolved value back to the generator.
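Before automating anything, it may help to see a single round trip done by hand. This is only a sketch: addOneLater is a hypothetical task invented for the example, and the "driver" is just a few top-level statements.

```javascript
// A hypothetical task that yields a Promise and expects its value back.
function* addOneLater() {
  var n = yield Promise.resolve(41);
  return n + 1;
}

var iter = addOneLater();
var first = iter.next(); // runs up to the yield; first.value is the Promise

// Wait for the Promise, then send the resolved value back into the generator.
var finished = first.value.then(function (resolved) {
  return iter.next(resolved); // the generator resumes and runs to its return
});

finished.then(function (last) {
  console.log(last.value, last.done); // 42 true
});
```

A task runner is essentially this exchange repeated until the generator reports that it is done.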

Here's a live example, which uses this approach to display info about a user's github account. (Use a browser which natively supports generators and Promises):

JS Bin on jsbin.com

The two tasks that we create are:

function* getMyFollowersAndRepos(username) {
  var userInfo = yield getJsonUrl('https://api.github.com/users/' + username);
  var followersAndRepos = yield Promise.all([ // in parallel!
    getJsonUrl(userInfo.followers_url),
    getJsonUrl(userInfo.repos_url)
  ]);
  return {
    followers: followersAndRepos[0],
    repos: followersAndRepos[1]
  };
}

function* renderInfo() {
  /* -snip- Modify DOM -snip- */
  
  var name = $('.js-name').val();
  var info = yield* getMyFollowersAndRepos(name);
  
  /* -snip- Modify DOM -snip- */
}

We can see that these are written in a linear style and that tasks delegate to other sub-tasks by using yield*.
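The delegation itself can be observed without any asynchrony. In this small sketch (inner and outer are hypothetical names), values sent with .next() pass straight through to the delegated generator, and its return value becomes the result of the yield* expression:

```javascript
function* inner() {
  var a = yield 'inner-1';
  var b = yield 'inner-2';
  return a + b; // becomes the value of `yield* inner()` in the caller
}

function* outer() {
  var sum = yield* inner(); // inner's yields surface as if outer had made them
  return sum * 10;
}

var it = outer();
var yielded = [];
yielded.push(it.next().value);  // 'inner-1'
yielded.push(it.next(1).value); // 'inner-2' (1 was sent to inner as `a`)
var last = it.next(2);          // 2 is sent as `b`; inner returns 3, outer returns 30

console.log(yielded.join(', '), last.value, last.done); // inner-1, inner-2 30 true
```

This is why a task runner only needs to drive the outermost task: sub-tasks reached through yield* are driven along with it.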

The thing which powers the execution of these tasks is the "task runner" function:

function runAsyncAction(iter) {
  function step(value) {
    var result = iter.next(value); // ask for a promise
    if (!result.done) { // if we get one
      result.value.then(step); // when the promise resolves, pass its value back
    }
  }
  step(undefined);
}

These nine lines receive a promise from the generator instance, obtain the promise's resolved value with .then(), pass the result to the step runner, and send it back to the generator instance. This process recursively loops until the generator instance completes by executing its return statement or reaching the end of its body.

Error Handling

But what happens if the promise is rejected? How are errors handled inside of these tasks?

One last important feature of generators is the ability to "throw" an exception from the outside into the generator itself. Generator instances have a .throw(err) method that does exactly this: it resumes the generator by raising err at the yield expression where it is paused. This allows the task to use try..catch for error handling in asynchronous code, just as if it were written synchronously.
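A minimal, synchronous sketch of that behavior (guarded is a hypothetical generator written for this example):

```javascript
function* guarded() {
  try {
    yield 'waiting';   // the generator pauses here
    return 'no error'; // skipped when an error is thrown in
  } catch (e) {
    return 'caught: ' + e.message;
  }
}

var it = guarded();
it.next(); // run up to the yield

// The exception is raised inside the generator, at the paused yield,
// so the generator's own try..catch gets to handle it.
var result = it.throw(new Error('boom'));
console.log(result.value, result.done); // caught: boom true
```

If the generator had no try..catch around the yield, the same error would instead propagate out of the it.throw() call to the caller.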

This requires a small change to our runner:

function runAsyncAction(iter) {
  function step(err, value) {
    var result;
    if (err) {
      result = iter.throw(err);
    } else {
      result = iter.next(value);
    }
    if (!result.done) {
      result.value.then(
        function (nextVal) {
          step(undefined, nextVal); // resume with the resolved value
        },
        function (nextErr) {
          step(nextErr, undefined); // throw the rejection into the generator
        }
      );
    }
  }
  step(undefined, undefined);
}

Which you can see in action in this modified jsbin (try entering an invalid or empty username):

JS Bin on jsbin.com

Non-iterator Compatibility

While this approach seems great, it's not feasible to introduce into an existing codebase unless it is compatible with code which doesn't use generators.

Thankfully, we can make one more modification to our task runner to turn a task into a full-fledged Promise, which eventually resolves with the task's return value or rejects with an error. This means that we can fully abstract the use of generators within a library or portion of our code. The existing callers can just interface with it as if it were a plain old Promise.

The final task runner is surprisingly short, given how powerful it is:

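function taskToPromise(iter) {
    return new Promise(function (resolve, reject) {
        function step(err, value) {
            var result;
            try {
                if (err) {
                    result = iter.throw(err); // raise the rejection inside the task
                } else {
                    result = iter.next(value); // resume the task with the resolved value
                }
            } catch (uncaught) {
                // the task did not handle the error, so the whole task fails
                return reject(uncaught);
            }
            if (result.done) {
                return resolve(result.value);
            }
            if (!(result.value instanceof Promise)) {
                return reject(new TypeError('Non-promise value yielded by generator'));
            }
            // two-argument then: an error thrown while stepping is never fed back in
            result.value.then(
                function (newValue) {
                    step(undefined, newValue);
                },
                function (newError) {
                    step(newError, undefined);
                }
            );
        }
        step(undefined, undefined);
    });
}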

With a very small change to the API, we can pass generators themselves instead of a generator instance. If we do this, we essentially have the same API as co, but with much less code and type juggling:

function co(generator/*, ...args*/) {
  var args = [].slice.call(arguments, 1);
  return taskToPromise(generator.apply(null, args));
}
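To see how existing callers might interact with such a wrapper, here is a minimal self-contained sketch. It bundles a condensed variant of the runner so that it runs on its own, and fetchScore and scoreTask are hypothetical names invented for illustration (they are not part of co or any other library):

```javascript
// Condensed variant of the task runner, included so the sketch is self-contained.
function taskToPromise(iter) {
  return new Promise(function (resolve, reject) {
    function step(err, value) {
      var result;
      try {
        result = err ? iter.throw(err) : iter.next(value);
      } catch (uncaught) {
        return reject(uncaught); // error not handled inside the task
      }
      if (result.done) return resolve(result.value);
      result.value.then(
        function (v) { step(undefined, v); },
        function (e) { step(e, undefined); }
      );
    }
    step(undefined, undefined);
  });
}

function co(generator /*, ...args */) {
  var args = [].slice.call(arguments, 1);
  return taskToPromise(generator.apply(null, args));
}

// Hypothetical async helper standing in for a real network call.
function fetchScore(name) {
  return name
    ? Promise.resolve(name.length)
    : Promise.reject(new Error('no name'));
}

// A task: sequential-looking code with ordinary try..catch.
function* scoreTask(name) {
  try {
    var score = yield fetchScore(name);
    return name + ' scored ' + score;
  } catch (e) {
    return 'failed: ' + e.message;
  }
}

// Callers only ever see Promises; the generator is an implementation detail.
var ok = co(scoreTask, 'alice');
var failed = co(scoreTask, '');
ok.then(function (msg) { console.log(msg); });     // logs: alice scored 5
failed.then(function (msg) { console.log(msg); }); // logs: failed: no name
```

Callers written against Promises or callbacks do not need to know that a generator is involved at all.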

Conclusion

Using generators to write tasks is a significant improvement for writing straightforward, concise, readable, and error-aware asynchronous code. This technique can be used today in the browser or on the backend by compiling generators to vanilla ES5 with babel and regenerator.

Afterword

If you use babel in bleeding edge mode, or use TypeScript targeting ES6, this may not be news to you. The next version of ECMAScript has an async/await proposal, which introduces a new syntax to support the exact same operations as we have implemented with generators. The only difference between the two (as far as I can tell) is that the separate syntax allows you to mix generators and async functions.
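For comparison, here is a rough sketch of what a task might look like in that syntax, which was later standardized in ES2017. The names fetchScore and scoreTask are hypothetical, made up for this example:

```javascript
// Hypothetical async helper standing in for a real network call.
function fetchScore(name) {
  return name
    ? Promise.resolve(name.length)
    : Promise.reject(new Error('no name'));
}

// Same shape as a generator task: `await` takes the place of `yield`,
// and no task runner is needed because calling the function returns a Promise.
async function scoreTask(name) {
  try {
    var score = await fetchScore(name);
    return name + ' scored ' + score;
  } catch (e) {
    return 'failed: ' + e.message;
  }
}

scoreTask('alice').then(function (msg) { console.log(msg); }); // logs: alice scored 5
scoreTask('').then(function (msg) { console.log(msg); });      // logs: failed: no name
```

The body reads almost identically to a generator-based task; what changes is that the runtime, not a hand-written runner, drives it.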

Whichever approach you end up using (async/await, which has yet to be finalized; or "task" generators, which are possible with the currently specified language), both will give you the benefit of vastly simplifying your asynchronous code and error handling.