Functional Programming & DSL’s Part 3

Tidying up our DSL

In the last part we added filtering of results via the where function and showed how we can compose functions to enable expressive and re-usable criteria but we were left with an somewhat awkward way of calling our functions together, what we need is a function to stitch them together.

Creating query

The 2 functions we have created so far (select & where) when executed return another function that then accepts an array of objects (in our example this is user objects) essentially this allows us to configure how we want the functions to behave before sending them the actual users, this is good because it means that we can treat each function as abiding by the same contract that it will accept an array of objects and return an array of objects so our query function simply needs to call each function in turn and pass the results from one to the next[1].

//+ query :: ([a], [(fun [a] -> [b])]) -> [b]
var query = function (items /*, funs */) {
  var funs = _.rest(arguments);
  return _.reduce(funs, function (results, fun) {
    return fun(results);  
  }, items);
};

The first argument is our array of objects, after that is the functions that will be called with the array of objects, we are using the _.reduce function like we did in the all function in the previous part to call each function in turn and capture the results for every call to feed into the next, this can then be called:

query(users,
      where (all (female, overEighteen)),
      select (id, dob, gender));
//=> [Object]

This is a lot easier to consume and is very close to our ideal DSL, there is still some noise which would be great to get rid of namely parentheses and the semi-colon at the end, lets see how we can improve this.

Enter CoffeeScript

CoffeeScript is a language that transcompiles to JS it has a ton of various features and also cuts down massively on the amount of noise needed compared to standard JS (at present[2]), we can write our expression above as:

query users, 
      where(all female, overEighteen) 
      select id, dob, gender
//=> [Object]

I could almost get rid of all the parentheses but found that if I removed it from the where it short circuited and did not execute the remaining functions, I think this is because it struggles to parse on the all function, so you could rewrite it to this if were really keen:

criteria = all female, overEighteen
query users, 
      where criteria
      select id, dob, gender
//=> [Object]

This is probably as close as we can get to our ideal DSL for querying and I think it’s pretty damn close!

var results = query users                
              where all female, overEighteen
              select id, dob, gender

Extending our DSL

Now we have our building blocks we can start to extend the language of our DSL in this example I’m going to demonstrate how we can add ordering into our query, so we should then be able to write the following:

query users, 
      where(all male, overEighteen)
      orderBy dob
      select id, dob, gender
//=> [Fri Aug 23 1957 00:00:00 GMT+0100 (GMT Daylight Time), Mon Apr 02 1979 00:00:00 GMT+0100 (GMT Daylight Time), Wed Feb 01 1984 00:00:00 GMT+0000 (GMT Standard Time)]

To implement orderBy we can use the _.sortBy function which takes in an array of objects and can either sort on a provided function or on a string name for a property we can use the second approach for this and reuse our functions we use for the select function that return us hardcoded strings:

var sortBy = reverseArgs(_.sortBy);

//+ orderBy :: (fun -> string) -> [a] -> [a]
var orderBy = function (prop) {
  return _.partial(sortBy, prop());      
};

We partially apply over the sortBy function and pass it the returned string from the call to the prop function supplied.

Closing Thoughts

I hope this has been a helpful demonstration of how we can use FP with JS (and some CS to get a cleaner syntax) to enable creation of DSL’s I know some may be thinking that this seems like quite a bit of work to get working however I think the following need to be taken into account:

  • A lot of the functions we ended it up writing were general purpose and will either be re-used in other areas or already be provided for in other FP libraries functions like reverseArgs would not even be needed if underscore had it’s arguments geared up for partial application and prop are going to be useful in lots of other places.
  • Like I demonstrated in part 1 this query DSL is not specific to any particular object type and can be applied to any objects this is in contrast to OOP which tends to be built with specific types to work against
  • The actual code needed when lumped together is 84 lines which is not a great amount[3] considering the nice maintainable DSL that your left with

The final code can be found on jsfiddle


  1. You could also call this function pipeline but query fits closer to the querying domain
  2. If you look at what’s in the pipeline for JS a lot of the features CS offers will be added directly to JS, also after spending time looking at FP with JS some of the features offered by CS become less in demand (array comprehensions for example if you have map)
  3. I would imagine using some other FP libraries and having someone more experience with FP than I’am you could probably halve this 🙂
Advertisements

Functional Programming & DSL’s Part 1

Introduction

I have explored DSL’s (Domain Specific Language) many moons ago and at that time I was using Boo (Python based .net language) after reading Ayende’s DSL book, it was a great choice as the language offers a lot of powerful capabilities way beyond C#, in the end I put together a DSL for log4net configuration.

While working my way through FP the topic of creating DSL’s has stood out for me when you start go down the route of FP you essentially build up a DSL around the problems your trying to solve it’s surprising the expressiveness that can be achieved using FP techniques, what I wanted to do was to see how closely I can get to a perfect DSL using FP.

Querying

If we take querying for data as our example[1], first we need some data:

var users = [
  { id: 1, username: 'jbloggs', dob: new Date(1984,1,1), displayName: 'Joe Bloggs', gender: 'm' },  
  { id: 2, username: 'zball', dob: new Date(1964,2,16), displayName: 'Zoe Ball', gender: 'f' },
  { id: 3, username: 'jsmith', dob: new Date(1957,7,23), displayName: 'John Smith', gender: 'm' },
  { id: 4, username: 'jfranco', dob: new Date(1979,3,2), displayName: 'Juan Franco', gender: 'm' },
  { id: 5, username: 'scastle', dob: new Date(1998,10,11), displayName: 'Susan Castle', gender: 'f' }
];

Now if we wanted to query from this data users that were male, over eighteen and only wanted to return the id, dob and gender, the following would be pretty close to a perfect way of expressing this requirement:

var results = query users                
              where all male, overEighteen
              select id, dob, gender             

If we were to express this in an OOP way we would probably be going down the route of a query object:

var results = new UserQuery()
                .gender('m')
                .ageGreaterThan(18)
                .query(users)
                .map(function (user) { return { id: user.id, dob: user.dob, gender: user.gender }; });

And of course you tweak the Query object and add some objects that will help make some of the querying more adaptable and less hardcoded:

var results = new Query()
                .where(new Equal('gender', 'm'))
                .where(new GreaterThan('dob', date.minusYears(18)))
                .restrict(['id', 'dob', 'gender'])
                .query(users);

But there’s a lot of ceremony going on there with objects getting created in a number of places and this is a pretty simple query, let’s see how taking an FP approach pans out, essentially what we have here is a pipeline of actions against a dataset, so lets breakdown each of the actions to see if we can achieve as close to the ideal DSL, first let’s start with select:

//+ select :: [string] -> [a] -> [b]
var select = function (/* props */) {
  var props = _.toArray(arguments);  
  return function (items) {
    return _.map(items, function (item) { 
      return _.pick(item, props); 
    });
  };
};

var selectResult = select('id', 'dob')(users);
console.log(selectResult);

To help with some of functional lifting I’m using underscore _.toArray and _.map don’t need explanation, the _.pick function takes an object and a series of strings for every property to retrieve on the passed object.

Ok this isn’t so bad we know that select is going to be used to specify which properties on the dataset we want returned and being part of the pipeline it is going to be eventually called with the items so I have used currying to set this up. I think we can do better regarding the properties and the moment we are using string literals ideally it would be nice to not have to use strings, we could either specify some variables to act like “constants”:

var id = 'id';
var dob = 'dob';

This works but is incredibly fragile as JS does not support true constants that cannot be mutated, what if we treated them instead of strings but as functions that when executed always returned the same value, we can use the _.partial and _.identity functions to achieve this:

var id = _.partial(_.identity, 'id');
var dob = _.partial(_.identity, 'dob');

id();
//=> id

_.identity will return the exact same value that is passed to it when called, and so I partially apply this function using _.partial specifying the value I always want to return.[2] We now need to change the select function because at the moment it is expecting strings to be passed in not functions:

//+ invoke :: (fun (a) -> b) -> b
var invoke = function (fun) { return fun(arguments); };
var concatResult = _.map([id, dob], invoke);
concatResult()
//=> ["id","dob"]

Here I have defined an simple function that takes a function in and invokes it passing any arguments and returning the result, we can then test this out by using _.map and as we can see correctly returns an array of the results of each function call. Now we can use this new function inside select:

//+ select :: [(fun () -> string)] -> [a] -> [b]
var select = function (/* props */) {
  var props = _.map(_.toArray(arguments), invoke);
  return function (items) {
    return _.map(items, function (item) { 
      return _.pick(item, props); 
    });
  };
};

You can see that props has now been changed so that it is a map for each argument passed in, the next thing we can do is get rid of the need for the anonymous function used in the map the only reason we currently have it there now is to capture the item argument that is then passed to the _.pick function what we can do is get rid of this middle step and have the map pass the item straight through, at present this is tricky because we can’t partially apply this function because it takes the item as the first argument[3], so lets first create a reusable function to allow this:

//+ pick :: [string] -> a -> b
var pick = function (props, item) {
  return _.pick(item, props);  
};

First we create a simple function that swaps the arguments and delegates to the underscore _.pick function.

//+ select :: [(fun () -> string)] -> [a] -> [b]
var select = function (/* props */) {
  var props = _.map(_.toArray(arguments), invoke)
    , picker = _.partial(pick, props);
  return function (items) { 
    return _.map(items, picker); 
  };
};

Next we can now partially apply over the the new pick function supplying the properties we want to pick, this means we can remove the anonymous function, one thing that might stand out is that we have a similar situation with the returned function we are simply using the items argument to then pass onto the _.map function if we could do the same thing we can return a partially applied map, this arguments issue is becoming quite a nuisance instead of having to create a new function from scratch let’s create a reverseArgs[4] function that will return the same function but will reverse it’s arguments when called:

//+ reverseArgs :: fun -> fun -> a
var reverseArgs = function (fun) {
  return function (/* args */) {
    var args = _.toArray(arguments);
    args.reverse();
    return fun.apply(null, args);  
  };
};

//+ map :: (fun a -> b), [a] -> [b]
var map = reverseArgs(_.map);

map follows the same strategy as pick in that it swaps the arguments over and delegates, this now opens the door to allow us to partially apply:

//+ select :: [(fun () -> string)] -> [a] -> [b]
var select = function (/* props */) {
  var props = _.map(_.toArray(arguments), invoke)
    , picker = _.partial(pick, props);
  return _.partial(map, picker);
};

We have managed to remove all the anonymous functions completely and instead are allow functions to call other functions without having to step in between, it now becomes clear what select performs:

  • Takes a number of arguments that are expected to be functions that return string values that will be properties of an object
  • Returns a function that when called with an array items will return an array of objects with only the properties specified populated

We can now start to use this select function:

var selectUsers = select(id, dob);
_.first(selectUsers(users));
//=> {id: 1, dob: Wed Feb 01 1984 00:00:00 GMT+0000 (GMT Standard Time)}

However one thing that is really good about this is that it is in no way tied to a user object, so we can use this for any object type:

var cars = [
    { reg: 'GNS 453A', make: 'Mitsubishi', model: 'Colt Ralliart' },
    { reg: 'HJD 112E', make: 'Ford', model: 'Focus RS' },
    { reg: 'JFS 672F', make: 'Vauxhall', model: 'Corsa VXR' }
];
    
var reg = _.partial(_.identity, 'reg');
var make = _.partial(_.identity, 'make');
    
var selectCars = select(reg, make);
_.first(selectCars(cars));
//=> {reg: "GNS 453A", make: "Mitsubishi"}

As long as I abide by the contract of what select needs it works perfectly for any object. In the next part I will look at where and how we can use it to filter results.


  1. Note that we are using a simple array and an object literal (treated like a map) rather than a user object like we would use in OOP.
  2. I know this is also susceptible to someone changing the variable to another function returning a different value, I have not shown it in the code examples but for all of these they should be inside there own scope which should lessen the likelihood of accidental changes.
  3. The argument ordering is one major annoyance with the underscore.js library as it seems to promote using _.chain based OOP style API rather than partial application & composition FP style, in fact there is a great video on youtube by Brian Lonsdorf which I highly recommend that goes into this in a lot more detail.
  4. We will also be using this for later parts.

Easier log4net configuration is here

I have just released a version of a new project I have been working on after being inspired by Ayende’s ‘Building Domain Specific Languages in Boo‘ e-book, the project is aimed at making log4net configuration easier by using a DSL instead of XML it’s called log4net-altconf and is hosted on google code you can grab the source using a SVN client, I would recommend TortoiseSVN.

It’s early days so some of the syntax will need to be ironed out and there will no doubt be other things that need to be looked at, hopefully someone with better knowledge with this type of thing can give some pointers 🙂

I could probably have had this done a few days back however I was wrestling with NUnit’s addin’s in order to have a set of test DSL’s to test the main DSL that set me back quite a bit!

If you have any comments please feel free to post them here.

Macro Woes with DSL in Boo

Lately I have been working my way through Ayende’s Building Domain Specific Languages in Boo, it’s a really good read and would recommend it to anyone wanting to put together their own DSL and also to learn about the meta programming that Boo has to offer.

Anyway this afternoon I spent a long time trying to get a macro to work for my DSL using Rhino DSL, in the end I think it was a number of issues, so here is a list of things to check if your macro is not working:

  1. Make sure that the namespace of where the macro is declared is passed thorugh to the ImplicitBaseClassCompilerStep, this can be done in the constructor, which takes a string array of namespaces.
  2. When you name the macro it must be in pascal case, I was falling foul because in my DSL I was referring to it as add_foo so named it add_fooMacro, this doesn’t work it must be Add_fooMacro.
  3. If the macro is declared in a seperate assembly to the rest of your DSL classes you will need to add a reference to the assembly for the Boo compiler.
  4. This one is a general rule, don’t try to use macros inside macros (using, lock etc…) they ain’t gonna work!

As for the DSL project I’m working on… watch this space :o)