Icon A Deep Learning Framework For Character Motion Synthesis and Editing

 

Icon The Halting Problem and The Moral Arbitrator

 

Icon The Witness

 

Icon Four Seasons Crisp Omelette

 

Icon At the Bottom of the Elevator

 

Icon Tracing Functions in Python

 

Icon Still Things and Moving Things

 

Icon water.cpp

 

Icon Making Poetry in Piet

 

Icon Learning Motion Manifolds with Convolutional Autoencoders

 

Icon Learning an Inverse Rig Mapping for Character Animation

 

Icon Infinity Doesn't Exist

 

Icon Polyconf

 

Icon Raleigh

 

Icon The Skagerrak

 

Icon Printing a Stack Trace with MinGW

 

Icon The Border Pines

 

Icon You could have invented Parser Combinators

 

Icon Ready for the Fight

 

Icon Earthbound

 

Icon Turing Drawings

 

Icon Lost Child Announcement

 

Icon Shelter

 

Icon Data Science, how hard can it be?

 

Icon Denki Furo

 

Icon In Defence of the Unitype

 

Icon Maya Velocity Node

 

Icon Sandy Denny

 

Icon What type of Machine is the C Preprocessor?

 

Icon Which AI is more human?

 

Icon Gone Home

 

Icon Thoughts on Japan

 

Icon Can Computers Think?

 

Icon Counting Sheep & Infinity

 

Icon How Nature Builds Computers

 

Icon Painkillers

 

Icon Correct Box Sphere Intersection

 

Icon Avoiding Shader Conditionals

 

Icon Writing Portable OpenGL

 

Icon The Only Cable Car in Ireland

 

Icon Is the C Preprocessor Turing Complete?

 

Icon The aesthetics of code

 

Icon Issues with SDL on iOS and Android

 

Icon How I learned to stop worrying and love statistics

 

Icon PyMark

 

Icon AutoC Tools

 

Icon Scripting xNormal with Python

 

Icon Six Myths About Ray Tracing

 

Icon The Web Giants Will Fall

 

Icon PyAutoC

 

Icon The Pirate Song

 

Icon Dear Esther

 

Icon Unsharp Anti Aliasing

 

Icon The First Boy

 

Icon Parallel programming isn't hard, optimisation is.

 

Icon Skyrim

 

Icon Recognizing a language is solving a problem

 

Icon Could an animal learn to program?

 

Icon RAGE

 

Icon Pure Depth SSAO

 

Icon Synchronized in Python

 

Icon 3d Printing

 

Icon Real Time Graphics is Virtual Reality

 

Icon Painting Style Renderer

 

Icon A very hard problem

 

Icon Indie Development vs Modding

 

Icon Corange

 

Icon 3ds Max PLY Exporter

 

Icon A Case for the Technical Artist

 

Icon Enums

 

Icon Scorpions have won evolution

 

Icon Dirt and Ashes

 

Icon Lazy Python

 

Icon Subdivision Modelling

 

Icon The Owl

 

Icon Mouse Traps

 

Icon Updated Art Reel

 

Icon Tech Reel

 

Icon Graphics Aren't the Enemy

 

Icon On Being A Games Artist

 

Icon The Bluebird

 

Icon Everything2

 

Icon Duck Engine

 

Icon Boarding Preview

 

Icon Sailing Preview

 

Icon Exodus Village Flyover

 

Icon Art Reel

 

Icon LOL I DREW THIS DRAGON

 

Icon One Cat Just Leads To Another

You could have invented Parser Combinators

Created on Dec. 1, 2014, 5:19 p.m.

In my book Build Your Own Lisp I use a Parser Combinator library I wrote for C called mpc to teach readers about languages and parsing. Lots of people have asked me how this works, curious as to if they might be able to do it themselves. A few even dived into the mpc source code!

But unfortunately the implementation of mpc in C is quite complex due to various language limitations. Additionally most resources online about Parser Combinators can be quite cryptic for a beginner. This is because much of the text is geared toward people who already know Haskell, the language they were invented using. For those unfamiliar with Haskell, following along can be pretty difficult.

But the ideas behind Parser Combinator libraries are really quite simple. It is more than possible for a beginner to implement a Parser Combinator library in a high level language such as Javascript in just a few hundred lines of code. In fact, you could have invented Parser Combinators...

You start with a function which reads the first character from some input. If this first character is a, then it returns a and advances the input by one. If it isn't, it doesn't advance the input, and returns a failure.

But your boss wants this function to work for any character, not just a. Now normally you'd add a parameter to this function which specifies which character to look for, but today you try something different. You create a new function which takes this character as input, and returns the function you had previously - adapted to look for this specific character. This function you call lit.

function lit(c) {
	
	return (function(input) {
		var r = inputRead(input);
		if (r == c) {
			inputAdvance(input, 1);
			return c;
		} else {
			return failure;
		}
	});
	
}

Your boss thinks you're mad, but you think you're pretty clever, because instead of writing the function you needed, you generated it using another function.

var parser = lit('a');
var result = parser(input);

Now your boss wants you to check if the first character is one of two given characters. He tells you to stop messing around and update the original function to take two character parameters, but instead you decide to again write a function that generates the one your boss was talking about. This one takes as input two functions generated from lit, and returns a function that tries each one in turn, returning the first one that succeeds. This new function you call or.

function or(parser0, parser1) {

	return (function (input) {
		var result0 = parser0(input);
		if (result0 != failure) { return result0; }
		var result1 = parser1(input);
		if (result1 != failure) { return result1; }
		return failure;
	});
	
}

You use this in combination with lit to do what your boss asked of you.

var parser = or(lit('a'), lit('b'));
var result = parser(input);

You also make a function that takes as input two calls to lit, and returns a new function that tries each in series. If any of them fail, it fails and rewinds the input, while if both of them succeed it returns the string of both the characters joined together. This you call and.

function and(parser0, parser1) {

	return (function (input) {
		var pos = inputGet(input);
		var result0 = parser0(input);
		if (result0 == failure) { inputSet(pos); return failure; }
		var result1 = parser1(input);
		if (result1 == failure) { inputSet(pos); return failure; }
		return result0 + result1;
	});
	
}

You show your boss that this can be used with lit when you want to parse something and then something else.

var parser = and(lit('a'), lit('b'));
var result = parser(input);

Your boss says he has a new request that will break your approach. He asks you to check for either ab or cd at the start of the input, and if either one is present convert the characters to integers and return a list of them. He says this can, and should, be done with just one function, the conventional way.

Annoyingly all your functions are set up to deal with characters and strings, not integers. You wonder if you can make them more general to work across different types. You think that or is probably already quite general - it just returns the result of one parser or the other. So the problem is just with the and parser, but this can be made more general if it returns a list of the results. Now neither and or or rely on character types as output of the parsers.

function and(parser0, parser1) {

	return (function (input) {
		var pos = inputGet(input);
		var result0 = parser0(input);
		if (result0 == failure) { inputSet(pos); return failure; }
		var result1 = parser1(input);
		if (result1 == failure) { inputSet(pos); return failure; }
		return [result0, result1];
	});
	
}

This means and and or can take as input any functions that act in a given way - these are any functions that take some input and either fail, or produce some output. These type of functions you call parsing functions and they're what you've been generating. But wait a second - the functions returned by and and or are parsing functions themselves! All they do is take some input and either fail or produce output. So we can construct a parser from and or or, and because the output itself would be a parsing function we could give it as input back to and or or without worrying things will break.

var parser = or(
	and(lit('a'), lit('b')),
	and(lit('c'), lit('d')));
	
var result = parser(input);

Even so, you are still a bit unsure about how to approach conversion to integers. It would be nice if you could have a function which calls some other function with the result of a parse. Then you could just pass in a integer conversion function to it and use lit as before. No worries - we'll generate it again. You write a function which takes as input some parsing function, and some other function, and returns a function that runs the parsing function and, providing it didn't fail, returns the other function applied to the result. This function you call apply.

function apply(f, parser) {
	
	return (function (input) {
		var result = parser(input);
		if (result == failure) { return failure; }
		else { return f(result); }
	});

}

Now you can finally write your boss' request as the following.

var parser = or(
	and(apply(toInt, lit('a')), apply(toInt, lit('b'))),
	and(apply(toInt, lit('c')), apply(toInt, lit('d'))));
	
var result = parser(input);

You can even tidy up all those repeated function calls by making a lit_to_int function.

function lit_to_int(c) {
	return apply(toInt, lit(c));
}

var parser = or(
	and(lit_to_int('a'), lit_to_int('b')),
	and(lit_to_int('c'), lit_to_int('d')));
	
var result = parser(input);

To deal with daily requests you add more and more functions to your library. You add a function that takes one parser and returns a function that runs it zero or more times, collecting all the results in a list (many). You add a function takes three parsers and returns a function that runs them in order and returns the result of the middle one (between). You add utility functions for matching identifiers and for ignoring whitespace around other parsers. The more you use your library the more you realize the code you write is more or less identical to the textual specification from your boss. For complex parsing tasks you can just look up a grammar and map it more or less directly to your library. It is as you give it the problem and the solution is found as if by magic.

A sweet name is only one thing is left before you release your invention upon the world - Parser Combinators.

github twitter rss