20 Jan 2013

Functional programming with Javascript

This post will discuss what functional programming is, how it compares to object-oriented programming and how you can program functionally in Javascript.

Themes of functional programming

Throughout the examples and concepts discussed in this post there are three key themes:

Computation as the application of functions
Statelessness
Avoiding side effects

Computation as the application of functions

Object-oriented programming also uses functions but functional programming is unique in that functions are values (the technical term for this is that functions are “first class”). You can store functions in variables, return them as the value of other functions and pass them in as arguments to other functions. As a result functional programs produce answers by applying these functions in different ways.

Statelessness

Programming languages have the notion of “state”. This is a snapshot of a program’s current environment: all the variables that have been declared, functions created and what is currently being executed.

An expression in a programming language can be “stateful” or “stateless”. A stateful expression is one that changes a program’s current environment. A very simple example in Javascript would be incrementing a variable by 1:

var number = 1;
var increment = function() {
    return number += 1;
};
increment();

A stateless expression, on the other hand, is one that does not modify a program’s environment:

var number = 1;
var increment = function(n) {
    return n + 1;
};
increment(number);

Unlike the previous example when we call increment here we do not change the value of number we merely return a new value.

Object-oriented programming works by changing program state to produce answers. Conversely, functional programming produces answers through stateless operations. The clearest contrast can be seen in how both methodologies handle looping. An object-oriented approach would be to use a for loop:

var loop = function() {
    for (var x = 0; x < 10; x += 1) {
        console.log(x);
    }
};
loop();

The loop produces its results by constantly changing the value of x. A functional approach would involve recursion (covered in more detail later on):

var loop = function(n) {
    if (n > 9) {
        console.log(n);
        return;
    } else {
        console.log(n);
        loop(n + 1);
    }
};
loop(0);

In this case we get our results by calling the loop function each time with a new value rather than modifying the program’s state.

So, functional programming requires us to write stateless expressions and keep our data immutable.

Avoiding side effects

When a function executes it can change a program’s state aside from returning a value:

var number = 2;
var sideEffect = function(n) {
    number = n * 3;
    return number * n;
};
sideEffect(3);

The side effect of this function is that it changes the value of number each time it executes. Functional programming promotes the use of “pure functions”: functions that have no side effects. In the example above changing the value of number would be disallowed under this paradigm. If we can always write functions that have no side effects our program becomes “referentially transparent”. This means if we input the same value into a program we will always get the same result.

The example above is not referentially transparent because if we always called sideEffect with 3 we would get a different result each time.

Pure functional programming in Javascript

If we wanted to program in a pure functional manner in Javascript we can set the following rules:

No assignment i.e. changing the value of something already created with = (var statements with = are fine)
No for or while loops (use recursion instead)
Freeze all objects and arrays (since modifying an existing object or array would be a stateful expression)
Disallow Date (since it always produces a new value no matter what the inputs are)
Disallow Math.random (same reason as Date)

Why program functionally?

After looking at the themes of functional programming and the rules we need to set to ensure we can write pure programs, it may seem like we need to jump through a lot of hoops just to get anything done. Indeed functional programming does look inferior when faced with environments that are constantly in flux. This is particularly true of the DOM (one of Javascript’s main interactions) which can constantly change through insertions, deletions, animations etc. In fact the creators of the Haskell programming language developed Monads as a way of managing changing state but retaining functional purity (Douglas Crockford also gave a talk on implementing Monad’s in Javascript).

The big advantage functional programming gives us is that we can reason about and test our programs much more easily. This can help us to be more productive, reduce bugs and write better software. It is not always possible to write everything in a purely functional manner but it is something to strive for. John Carmack of Id Software wrote a good article discussing the practical considerations of functional programming.

Concepts in functional programming

First class and higher-order functions

We’ve already discussed first class functions and how they can be stored and passed around as values. Of course these are very simple expressions in Javascript:

var funcAsValue = function(y) {
    return y * 4;
};

var returnFunc = function(z) {
    return function(a) {
        return a * z;
    };
};

var funcAsArgument = function(a, func) {
    return func(a);
};

The latter two expressions are examples of “higher-order” functions. These are functions that either return functions as their result or take other functions as arguments.

Closures

In Javascript functions can access variables in their outer scope:

var func = function() {
    var a = 3;
    return function(b) {
        return b * a;
    };
};

var closure = func();
closure(4) // 12

In this example the function returned by closure can still access a since it is in its outer scope.

When functions return they “take a picture” of their current environment which they can read the latest values from. So, in essence, closures are functions which, as well as having an expression body and taking arguments, also record an outer scope or environment. Importantly, closures always have access to the latest value of a variable. If we rewrite the example above to always increment a we can see how this is the case (although this breaks our functional principles):

var func = function() {
    var a = 3;
    return function(b) {
        a = ++a;
        return b * a;
    };
};

var closure = func();
closure(4) // 16
closure(5) // 25

Recursion

A recursive function is a function that quite simply calls itself:

var recursive = function(n) {
    if (n < 1) {
        return n;
    } else {
        return recursive(n - 1);
    }
};

recursive(1); // 0

Functional programming favours recursion for looping rather than traditional while or for loops since we can maintain statelessness and avoid side effects. For example, to compute the factorial of value n statefully we might do:

var factorial = function(n) {
    var result = 1;
    for (var x = 1; x <= n; x += 1) {
        result = x * result;
    }
    return result;
};

factorial(5) // 120

A stateless approach would be:

var recursiveFactorial = function(n) {
    if (n < 2) {
        return n;
    } else {
        return n * recursiveFactorial(n - 1);
    }
};

recursiveFactorial(5) // 120

Tail call optimisation

Most functional programming languages implement “tail call optimisation” for recursive functions. In order to understand this it would be best to begin by looking at what happens when recursion is not optimised in this manner.

Each time a function is called additional space is taken up on the call stack. If you’re using a Javascript environment that supports console.trace logging you can see the call stack for the recursiveFactorial function above:

var recursiveFactorial = function(n) {
    if (n < 2) {
        console.trace();
        return n;
    } else {
        console.trace();
        return n * recursiveFactorial(n - 1);
    }
};

recursiveFactorial(5);

This stack is built up so when a value is returned the program knows where to go back to.

A “tail call” is a call to a function that occurs as the last action of another function. Our recursive calls above are not tail calls since the final action of the function is to multiply what is returned by recursiveFactorial by n. Let’s change this around so they are:

var recursiveFactorial = function(n) {
    var recur = function(x, y) {
        if (y === n) {
            console.trace()
            return x * y;
        } else {
            console.trace()
            return recur(x * y, y + 1);
        }
    }
    return recur(1, 1);
};

recursiveFactorial(5);

Since Javascript does not implement tail call optimisation we can see in the console that we build up the stack each time recur is called.

In programming languages that do support tail call optimisation a large call stack like this would not be built up. The language can determine that if a call is in tail position extra space does not need to be created for it on the stack each time it is called. A Wikipedia example illustrates the procedure quite nicely.

Unfortunately since Javascript does not implement tail call optimisation recursion is much slower than traditional for or while loops. However, in ECMAScript 6 tail calls will be optimised.

Continuation passing style

For the time being we can still write loops in a recursive manner without taking up extra space on the call stack. This can be done with continuation passing style (CPS).

In CPS we use “continuations” (functions) to represent the next stage of a computation. A continuation is passed as an argument to a higher-order function which is then called by that function when a previous operation is complete. This pattern is very common in Javascript. AJAX requests and Node.js use CPS so that computation can be performed asynchronously. For example:

$.get('url', function(data) {
    console.log(data);
});

In this example we pass a continuation to our get method when the data has been retrieved.

A more complex example is factorial in CPS:

var CPSFactorial = function(n, cont) {
    if (n < 2) {
        return cont(n);
    } else {
        var new_cont = function(v) {
            var result = v * n;
            return cont(result);
        };
        return CPSFactorial(n - 1, new_cont);
    }
};

CPSFactorial(5, function(v) {
    return v;
});

With our original recursive factorial example, once we reached the end of the “loop” each result had to be returned to its original caller meaning the program had to go all the way back through the stack to get the final result.

On the other hand, with CPS once the “loop” is at an end the function returns the result of applying the continuation with 1. This in turn applies the result of that function to another continuation and so on until we get the final result. So, rather than going all the way back through the call stack the CPSFactorial function creates new functions that each represent the next stage of the computation.

Thunks and trampolines

With CPS there is still some recursion: CPSFactorial calls itself several times during the computation. We can avoid recursion completely with “thunks” and “trampolines”.

A thunk is much like a continuation except that it is stored as a data structure for use at a later time. A trampoline manages the execution and return values of thunks (so called because it bounces thunks around). In Javascript we can represent a thunk as follows:

var thunk = function (f, lst) {
    return { tag: "thunk", func: f, args: lst };
};

var thunkValue = function (x) {
    return { tag: "value", val: x };
};

In this case our thunk stores a function and a list of arguments to be executed at a later time. Thanks to closures thunks have a record of their environment when they are stored. The thunkValue function simply returns the final value of our computation.

We can handle all of the thunks we create with a trampoline:

var trampoline = function (thk) {
    while (true) {
        if (thk.tag === "value") {
            return thk.val;
        }
        if (thk.tag === "thunk") {
            thk = thk.func.apply(null, thk.args));
        }
    }
};

Here trampoline takes a thunk and calls its function with its arguments if it has the correct tag otherwise it returns a value. This example breaks our rule of statelessness (since we reassign thk to a new value each time) so we could write this as follows:

var trampoline = function (thk) {
    if (thk.tag === "value") {
        return thk.val;
    }
    if (thk.tag === "thunk") {
        return trampoline(thk.func.apply(null, thk.args));
    }
};

However, we are then back to using recursion again.

We can use thunks and a trampoline to calculate factorial:

var thunk = function (f, lst) {
    return { tag: "thunk", func: f, args: lst };
};

var thunkValue = function (x) {
    return { tag: "value", val: x };
};

var thunkFactorial = function(n, cont) {
    if (n < 2) {
        return thunk(cont, [n]);
    } else {
        var new_cont = function(v) {
            var result = v * n;
            return thunk(cont, [result]);
        };
        return thunk(thunkFactorial, [n - 1, new_cont]);
    }
};

var trampoline = function (thk) {
    while (true) {
        if (thk.tag === "value") {
            return thk.val;
        }
        if (thk.tag === "thunk") {
            thk = thk.func.apply(null, thk.args);
        }
    }
};

trampoline(thunkFactorial(5, thunkValue));

In this example we avoid recursion completely since we’re always creating new thunks to be executed rather than functions invoking themselves.

A fun (but pointless) implementation of this is a jQuery plugin that can act as a replacement for jQuery’s each method.

Optimisations in functional programming

Partial application (currying)

Since Javascript has higher-order functions we can call functions with some of its arguments pre-filled. This is known as partial application or currying. The simplest example of this is Javascript’s bind method. bind simply returns a new function with a custom context and arguments pre-filled:

var bindExample = function(a, b) {
    return a / b;
};

var test = bindExample.bind(null, 3);

test(4); // 0.75

Here test creates a new function which when invoked already has the a argument filled in as 3.

bind is an ECMAScript 5 method so it is not available in all Javascript environments. We can write a custom bind as follows:

var bind = function(a, func) {
    return function() {
        return func.apply(null, [a].concat(Array.prototype.slice.call(arguments)));
    };
};

var bindExample = function(a, b) {
    return a / b;
};

var test = bind(3, bindExample);

test(4); // 0.75

Exactly like the ES5 method, the bind function in this case returns a new function which when called applies the function we passed into it with an argument pre-filled.

Memoisation

Functions that are expensive to run can be optimised with memoisation. This involves using a closure to cache the results of previous calls to the function. A very simple example would be:

var memoise = function(func) {
    var cache = {};
    return function(arg) {
        if (arg in cache) {
            console.log(arg + ' in cache');
            return cache[arg];
        } else {
            console.log(arg + 'not in cache');
            return cache[arg] = func(arg);
        }
    };
};

var memo = memoise(function(n) {
    return n * 2;
});

memo(1) // 2 (1 not in cache)
memo(1) // 2 (1 in cache)

Here we pass a function to memoise whose results we wish to cache. When we call memo it will either retrieve the value from the cache or it will call the memoised function and store its value in the cache. It’s worth noting that in order for us to be able to memoise a function it must be referentially transparent otherwise no results will ever be cached.

Of course this example breaks our rule of avoiding side effects. Unfortunately it is quite problematic to use memoisation without changing state since we have to keep updating our cache. It is easier in a language like Haskell that can use lazy evaluation to memoise.

Memoisation is most useful where a function is likely to be calculating many of the same results such as the fibonacci sequence.

Conclusion

This article has looked at functional programming’s key themes, its main concepts and optimisations the paradigm can make.

Whilst functional programming has many benefits including more rigorous code and enhancements to productivity, it can be difficult at times to write programs in a purely functional manner in Javascript. Consequently, some pragmatism is required.

In a future post I hope to discuss how all of this can be tied together to create a functional application.