Parsinator

Parsinator lets you build small, well-defined parsers in JavaScript or TypeScript that can be combined to accomplish just about any parsing task.

Find it on GitHub: sufianrhazi/parsinator
Install it via: npm install parsinator

What it does

Parsinator uses parser combinators to build structured data from string input. Unlike other ways of parsing data, parser combinators are:

  • Maintainable: designed to be read and written by humans, unlike regular expressions, which are designed to be executed by machines.
  • Reusable: complex parsers are built from smaller pieces, each responsible for parsing one part of the input.
  • Debuggable: parse failures produce a detailed error message which shows what the parser was expecting.
  • Powerful: can match and extract data that regular expressions cannot parse, such as balanced nesting.
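
To make the "Debuggable" and "Powerful" points concrete, here is a minimal hand-written sketch of a recursive parser for balanced parentheses. It does not use Parsinator: the names parseGroups and nestingDepth are hypothetical and exist only for this illustration. It shows the kind of nesting a single regular expression cannot track, and an error that says what was expected and where.

```javascript
// Illustrative sketch only: a hand-written recursive parser, not Parsinator's API.
// It accepts strings of balanced parentheses and returns the deepest nesting level.

// Parse zero or more "( ... )" groups starting at `offset`.
function parseGroups(input, offset, depth) {
  let maxDepth = depth;
  while (input[offset] === "(") {
    // Recurse into the group body, one level deeper.
    const inner = parseGroups(input, offset + 1, depth + 1);
    if (input[inner.offset] !== ")") {
      throw new Error(`Expected ")" at offset ${inner.offset}`);
    }
    maxDepth = Math.max(maxDepth, inner.maxDepth);
    offset = inner.offset + 1; // step past the closing ")"
  }
  return { offset, maxDepth };
}

// Require the whole input to be consumed, then report the nesting depth.
function nestingDepth(input) {
  const result = parseGroups(input, 0, 0);
  if (result.offset !== input.length) {
    throw new Error(`Expected "(" or end of input at offset ${result.offset}`);
  }
  return result.maxDepth;
}

console.log(nestingDepth("(()(()))")); // 3

try {
  nestingDepth("(()");
} catch (error) {
  console.log(error.message); // Expected ")" at offset 3
}
```

A combinator library packages this pattern (consume input, recurse, report what was expected) into reusable pieces so it does not have to be rewritten for every grammar.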

Parsinator is inspired by the excellent parsec Haskell library.

In short, parsers are building blocks which take a parser state (string input and offset) and produce a value and a new parser state. Parsinator allows you to easily define these building blocks with generator functions.
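
The idea can be sketched in a few lines of plain JavaScript. This is a simplified model under stated assumptions, not Parsinator's actual implementation: the helpers str, regex, and fromGenerator below are hypothetical stand-ins written for this illustration, the state is modeled as { input, offset }, and the sketch uses plain yield where the examples in this document use yield*.

```javascript
// Illustrative sketch only: hypothetical helpers, NOT Parsinator's implementation.
// A parser is modeled as a function from a state { input, offset }
// to a result { value, state }.

// Match a literal string at the current offset.
function str(expected) {
  return (state) => {
    if (!state.input.startsWith(expected, state.offset)) {
      throw new Error(`Expected "${expected}" at offset ${state.offset}`);
    }
    return {
      value: expected,
      state: { input: state.input, offset: state.offset + expected.length },
    };
  };
}

// Match a regular expression anchored at the current offset (sticky flag).
function regex(pattern) {
  const flags = pattern.flags.replace(/[gy]/g, "") + "y";
  const sticky = new RegExp(pattern.source, flags);
  return (state) => {
    sticky.lastIndex = state.offset;
    const match = sticky.exec(state.input);
    if (match === null) {
      throw new Error(`Expected ${pattern} at offset ${state.offset}`);
    }
    return {
      value: match[0],
      state: { input: state.input, offset: state.offset + match[0].length },
    };
  };
}

// Build a parser from a generator: each yielded parser runs against the
// current state, and its value is sent back into the generator.
function fromGenerator(makeGenerator) {
  return (state) => {
    const generator = makeGenerator();
    let step = generator.next();
    while (!step.done) {
      const result = step.value(state);
      state = result.state;
      step = generator.next(result.value);
    }
    return { value: step.value, state };
  };
}

const keyValue = fromGenerator(function* () {
  const key = yield regex(/[a-z]+/);
  yield str("=");
  const value = yield regex(/[0-9]+/);
  return { key, value: Number(value) };
});

const parsed = keyValue({ input: "width=42;", offset: 0 });
console.log(parsed.value.key, parsed.value.value); // width 42
console.log(parsed.state.offset); // 8
```

Because every parser has the same shape, the generator body reads like straight-line code while the driver threads the offset from one step to the next.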

Documentation is in the README.

Target platforms

  • TypeScript (at least 3.6.0)
  • ES2015 modules / commonjs modules / standalone file

A simple example

Let's parse a hello world-style greeting and gauge its excitement level:

const greeting = Parsinator.fromGenerator(function *() {
    const intro = yield* Parsinator.regex(/[hH]ello, /);
    const who = yield* Parsinator.until(Parsinator.str("!"));
    const exclamations = yield* Parsinator.many(Parsinator.str("!"));
    return {
        who: who,
        excitement: exclamations.length
    };
});

Parsinator.runToEnd(greeting, "Hello, Parsinator!");
// { who: "Parsinator", excitement: 1 }
Parsinator.runToEnd(greeting, "hello, there!!!!");
// { who: "there", excitement: 4 }

A powerful example

Let's build a parser which evaluates a mathematical expression:

const spaces = Parsinator.regex(/\s*/);
const token = (parser) => Parsinator.surround(spaces, parser, spaces);
const float = token(Parsinator.regex(/[0-9]+(\.[0-9]*)?/));
const number = Parsinator.map(float, (str) => parseFloat(str));

function makeOpParser(opstr, action) {
  const opParser = token(Parsinator.str(opstr));
  return Parsinator.map(opParser, (_str) => action);
}

const neg = makeOpParser('-', (val) => -val);
const fac = makeOpParser('!', (val) => {
  if (val < 0) {
    throw new Error('Factorial on a negative number: ' + val);
  }
  if (!Number.isInteger(val)) {
    throw new Error('Factorial on a non-integer: ' + val);
  }
  let acc = 1;
  for (; val > 0; val--) acc *= val;
  return acc;
});
const sum = makeOpParser('+', (x, y) => x + y);
const sub = makeOpParser('-', (x, y) => x - y);
const mul = makeOpParser('*', (x, y) => x * y);
const div = makeOpParser('/', (x, y) => {
  if (y === 0) throw new Error('Division by zero');
  return x / y;
});
const exp = makeOpParser('^', (x, y) => Math.pow(x, y));

const evalMath = Parsinator.buildExpressionParser([
    { fixity: "prefix", parser: neg },
    { fixity: "postfix", parser: fac },
    { fixity: "infix", associativity: "right", parser: exp },
    { fixity: "infix", associativity: "left",  parser: mul },
    { fixity: "infix", associativity: "left",  parser: div },
    { fixity: "infix", associativity: "left",  parser: sum },
    { fixity: "infix", associativity: "left",  parser: sub }
], () => Parsinator.choice([
    Parsinator.surround(
      token(Parsinator.str("(")),
      evalMath,
      token(Parsinator.str(")"))
    ),
    number
]));

This parser can parse and evaluate mathematical expressions involving negation, factorial, exponentiation, multiplication, division, addition, subtraction, and parenthesized grouping. Go ahead and try it out!
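
The associativity field in the operator table decides how a chain of the same operator is grouped. The following sketch shows the difference with plain folds over a list of operands. It is a standalone illustration and says nothing about how Parsinator implements this internally; foldLeft, foldRight, subtract, and power are hypothetical names.

```javascript
// Illustrative sketch only: plain folds, not Parsinator's implementation.

// Left-associative grouping: ((a op b) op c)
function foldLeft(operands, op) {
  return operands.slice(1).reduce((acc, next) => op(acc, next), operands[0]);
}

// Right-associative grouping: (a op (b op c))
function foldRight(operands, op) {
  const last = operands[operands.length - 1];
  return operands.slice(0, -1).reduceRight((acc, prev) => op(prev, acc), last);
}

const subtract = (x, y) => x - y;
const power = (x, y) => Math.pow(x, y);

console.log(foldLeft([10, 4, 3], subtract));  // (10 - 4) - 3 = 3
console.log(foldRight([10, 4, 3], subtract)); // 10 - (4 - 3) = 9
console.log(foldLeft([2, 3, 2], power));      // (2 ^ 3) ^ 2 = 64
console.log(foldRight([2, 3, 2], power));     // 2 ^ (3 ^ 2) = 512
```

This is why the table above marks exponentiation as right-associative and the arithmetic operators as left-associative: those groupings match the usual mathematical conventions.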

A practical example

Let's parse structured data out of a name/email/url string:

"Abba Cadabra <abba@cadabra.com> (http://magic.website/)"

const emailParser = Parsinator.between(Parsinator.str("<"), Parsinator.str(">"));

const urlParser = Parsinator.between(Parsinator.str("("), Parsinator.str(")"));

const infoParser = Parsinator.fromGenerator(function *() {
    const name = yield* Parsinator.until(Parsinator.choice([
        Parsinator.str("<"),
        Parsinator.str("("),
        Parsinator.end
    ]));
    const email = yield* Parsinator.maybe(emailParser);
    yield* Parsinator.regex(/\s*/);
    const url = yield* Parsinator.maybe(urlParser);
    yield* Parsinator.end;
    return {
        name: name.trim(),
        email: email,
        url: url
    };
});

Parsinator.run(infoParser, "Parsinator");
// { name: "Parsinator", email: null, url: null }

Parsinator.run(infoParser,
  "Abba Cadabra <abba@cadabra.com> (http://magic.website)");
// { name: "Abba Cadabra", email: "abba@cadabra.com", url: "http://magic.website" }

Parsinator.run(infoParser,
  "Béla Bartók (https://www.britannica.com/biography/Bela-Bartok)"
);
// { name: "Béla Bartók", email: null, url: "https://www.britannica.com/biography/Bela-Bartok" }

Parsinator.run(infoParser,
  "马云 <jack@1688.com>"
);
// { name: "马云", email: "jack@1688.com", url: null }

Go ahead and try it out!