Functional Programming & DSL’s Part 1

Introduction

I have explored DSL’s (Domain Specific Language) many moons ago and at that time I was using Boo (Python based .net language) after reading Ayende’s DSL book, it was a great choice as the language offers a lot of powerful capabilities way beyond C#, in the end I put together a DSL for log4net configuration.

While working my way through FP the topic of creating DSL’s has stood out for me when you start go down the route of FP you essentially build up a DSL around the problems your trying to solve it’s surprising the expressiveness that can be achieved using FP techniques, what I wanted to do was to see how closely I can get to a perfect DSL using FP.

Querying

If we take querying for data as our example[1], first we need some data:

var users = [
  { id: 1, username: 'jbloggs', dob: new Date(1984,1,1), displayName: 'Joe Bloggs', gender: 'm' },  
  { id: 2, username: 'zball', dob: new Date(1964,2,16), displayName: 'Zoe Ball', gender: 'f' },
  { id: 3, username: 'jsmith', dob: new Date(1957,7,23), displayName: 'John Smith', gender: 'm' },
  { id: 4, username: 'jfranco', dob: new Date(1979,3,2), displayName: 'Juan Franco', gender: 'm' },
  { id: 5, username: 'scastle', dob: new Date(1998,10,11), displayName: 'Susan Castle', gender: 'f' }
];

Now if we wanted to query from this data users that were male, over eighteen and only wanted to return the id, dob and gender, the following would be pretty close to a perfect way of expressing this requirement:

var results = query users                
              where all male, overEighteen
              select id, dob, gender             

If we were to express this in an OOP way we would probably be going down the route of a query object:

var results = new UserQuery()
                .gender('m')
                .ageGreaterThan(18)
                .query(users)
                .map(function (user) { return { id: user.id, dob: user.dob, gender: user.gender }; });

And of course you tweak the Query object and add some objects that will help make some of the querying more adaptable and less hardcoded:

var results = new Query()
                .where(new Equal('gender', 'm'))
                .where(new GreaterThan('dob', date.minusYears(18)))
                .restrict(['id', 'dob', 'gender'])
                .query(users);

But there’s a lot of ceremony going on there with objects getting created in a number of places and this is a pretty simple query, let’s see how taking an FP approach pans out, essentially what we have here is a pipeline of actions against a dataset, so lets breakdown each of the actions to see if we can achieve as close to the ideal DSL, first let’s start with select:

//+ select :: [string] -> [a] -> [b]
var select = function (/* props */) {
  var props = _.toArray(arguments);  
  return function (items) {
    return _.map(items, function (item) { 
      return _.pick(item, props); 
    });
  };
};

var selectResult = select('id', 'dob')(users);
console.log(selectResult);

To help with some of functional lifting I’m using underscore _.toArray and _.map don’t need explanation, the _.pick function takes an object and a series of strings for every property to retrieve on the passed object.

Ok this isn’t so bad we know that select is going to be used to specify which properties on the dataset we want returned and being part of the pipeline it is going to be eventually called with the items so I have used currying to set this up. I think we can do better regarding the properties and the moment we are using string literals ideally it would be nice to not have to use strings, we could either specify some variables to act like “constants”:

var id = 'id';
var dob = 'dob';

This works but is incredibly fragile as JS does not support true constants that cannot be mutated, what if we treated them instead of strings but as functions that when executed always returned the same value, we can use the _.partial and _.identity functions to achieve this:

var id = _.partial(_.identity, 'id');
var dob = _.partial(_.identity, 'dob');

id();
//=> id

_.identity will return the exact same value that is passed to it when called, and so I partially apply this function using _.partial specifying the value I always want to return.[2] We now need to change the select function because at the moment it is expecting strings to be passed in not functions:

//+ invoke :: (fun (a) -> b) -> b
var invoke = function (fun) { return fun(arguments); };
var concatResult = _.map([id, dob], invoke);
concatResult()
//=> ["id","dob"]

Here I have defined an simple function that takes a function in and invokes it passing any arguments and returning the result, we can then test this out by using _.map and as we can see correctly returns an array of the results of each function call. Now we can use this new function inside select:

//+ select :: [(fun () -> string)] -> [a] -> [b]
var select = function (/* props */) {
  var props = _.map(_.toArray(arguments), invoke);
  return function (items) {
    return _.map(items, function (item) { 
      return _.pick(item, props); 
    });
  };
};

You can see that props has now been changed so that it is a map for each argument passed in, the next thing we can do is get rid of the need for the anonymous function used in the map the only reason we currently have it there now is to capture the item argument that is then passed to the _.pick function what we can do is get rid of this middle step and have the map pass the item straight through, at present this is tricky because we can’t partially apply this function because it takes the item as the first argument[3], so lets first create a reusable function to allow this:

//+ pick :: [string] -> a -> b
var pick = function (props, item) {
  return _.pick(item, props);  
};

First we create a simple function that swaps the arguments and delegates to the underscore _.pick function.

//+ select :: [(fun () -> string)] -> [a] -> [b]
var select = function (/* props */) {
  var props = _.map(_.toArray(arguments), invoke)
    , picker = _.partial(pick, props);
  return function (items) { 
    return _.map(items, picker); 
  };
};

Next we can now partially apply over the the new pick function supplying the properties we want to pick, this means we can remove the anonymous function, one thing that might stand out is that we have a similar situation with the returned function we are simply using the items argument to then pass onto the _.map function if we could do the same thing we can return a partially applied map, this arguments issue is becoming quite a nuisance instead of having to create a new function from scratch let’s create a reverseArgs[4] function that will return the same function but will reverse it’s arguments when called:

//+ reverseArgs :: fun -> fun -> a
var reverseArgs = function (fun) {
  return function (/* args */) {
    var args = _.toArray(arguments);
    args.reverse();
    return fun.apply(null, args);  
  };
};

//+ map :: (fun a -> b), [a] -> [b]
var map = reverseArgs(_.map);

map follows the same strategy as pick in that it swaps the arguments over and delegates, this now opens the door to allow us to partially apply:

//+ select :: [(fun () -> string)] -> [a] -> [b]
var select = function (/* props */) {
  var props = _.map(_.toArray(arguments), invoke)
    , picker = _.partial(pick, props);
  return _.partial(map, picker);
};

We have managed to remove all the anonymous functions completely and instead are allow functions to call other functions without having to step in between, it now becomes clear what select performs:

  • Takes a number of arguments that are expected to be functions that return string values that will be properties of an object
  • Returns a function that when called with an array items will return an array of objects with only the properties specified populated

We can now start to use this select function:

var selectUsers = select(id, dob);
_.first(selectUsers(users));
//=> {id: 1, dob: Wed Feb 01 1984 00:00:00 GMT+0000 (GMT Standard Time)}

However one thing that is really good about this is that it is in no way tied to a user object, so we can use this for any object type:

var cars = [
    { reg: 'GNS 453A', make: 'Mitsubishi', model: 'Colt Ralliart' },
    { reg: 'HJD 112E', make: 'Ford', model: 'Focus RS' },
    { reg: 'JFS 672F', make: 'Vauxhall', model: 'Corsa VXR' }
];
    
var reg = _.partial(_.identity, 'reg');
var make = _.partial(_.identity, 'make');
    
var selectCars = select(reg, make);
_.first(selectCars(cars));
//=> {reg: "GNS 453A", make: "Mitsubishi"}

As long as I abide by the contract of what select needs it works perfectly for any object. In the next part I will look at where and how we can use it to filter results.


  1. Note that we are using a simple array and an object literal (treated like a map) rather than a user object like we would use in OOP.
  2. I know this is also susceptible to someone changing the variable to another function returning a different value, I have not shown it in the code examples but for all of these they should be inside there own scope which should lessen the likelihood of accidental changes.
  3. The argument ordering is one major annoyance with the underscore.js library as it seems to promote using _.chain based OOP style API rather than partial application & composition FP style, in fact there is a great video on youtube by Brian Lonsdorf which I highly recommend that goes into this in a lot more detail.
  4. We will also be using this for later parts.
Advertisements