Leveraging value objects & domain services

Introduction

Over the years I have seen, and also been guilty of, failing to model concepts that deserve first-class treatment in the domain you're working in. This leads to two primary code smells:

The first manifests itself as values that get passed around together but are always taken by functions as standalone parameters; there is usually a breakage of DRY here, with knowledge of what the parameters represent scattered around the codebase. The second relates to a refusal to apply object thinking: procedures are created with no real thought as to where the behaviour should reside, and because we're in an OOP language they have to live somewhere, right? So we end up with badly named classes that just become containers of procedures (and if you're really unlucky they deal with a whole host of unrelated responsibilities!)

Resolving

So how can we resolve these issues? As it turns out we can usually stamp out both by introducing a Value Object that properly represents the concept in our domain. Having made this first step we can usually start moving behaviour related to this new concept from the *Util class into the actual object, where it becomes a much richer object with its own behaviour instead of just being a dumb data holder.

There are however times when certain behaviour cannot just be moved inside the new concept object, and for these cases you will probably want to introduce a Domain Service object. The difference between the Util class and the Domain Service is that the latter is specific to a certain operation that you want to perform, and as such can be properly named around the domain you're working in.

Example

The example I'm going to show is a really common scenario I have run into while working on various financial systems over a number of years.

In a financial domain you will have a concept of Money, which consists of:

  • Amount
  • Currency

Seems fairly straightforward, and it is. However, let's look at how I typically see this concept dealt with in code.

public static string FormatMoney(IDictionary<string, int> currencyDecimalPlacesLookup, decimal amount, string currency, bool includeCurrency)
{
    // lookup for decimal places
    // string.Format using amount with found decimal places and currency flag
}

public static string FormatCurrencyDescription(Dictionary<string, string> currencyDescLookup, string currency)
{
    // lookup by currency code
}

public static decimal ConvertGBP(Dictionary<string, decimal> xrates, string currency, decimal amount)
{
   // lookup rate by currency code
   // apply conversion
}

Here is an example of their usage:

string balance = MoneyUtil.FormatMoney(_decimalPlacesLookup, account.PostedBalance, account.CurrencyCode, true);

string currDesc = MoneyUtil.FormatCurrencyDescription(_currDescLookup, account.CurrencyCode);

decimal inGBP = MoneyUtil.ConvertGBP(_xrateLookup, account.CurrencyCode, account.PostedBalance);

To start with, these probably were just inlined inside one of the calling functions, but then someone saw they needed to do the same thing somewhere else and created the MoneyUtil class: hooray for reuse! As you can imagine, these classes continue to grow and become dumping grounds for any behaviour related to Money. You can already see that unrelated responsibilities are being introduced (formatting & currency conversion have no reason to live together), and that the caller is having to manage lookup dictionaries, which is another sign of primitive obsession and of modelling concepts being missed.

As described in the introduction, certain values are being passed around together (in this case the amount & currency), yet we have not captured the intent of these values by introducing a Money object. Instead, let's introduce the concepts of the domain and see how that changes things:

public class Currency : IEquatable<Currency>
{
	public Currency(string code, int decimalPlaces, string description)
	{
		Code = code;
		DecimalPlaces = decimalPlaces;
		Description = description;
	}

	public string Code { get; private set; }
	public int DecimalPlaces { get; private set; }
	public string Description { get; private set; }
	
	public override string ToString()
	{
		return Code;
	}

// ... omitted rest of implementation
}

First we have the concept of a Currency. This may seem like a pointless object, however even just doing this step alone would change the signatures of the functions above so that they no longer accept a magic string for the currency code, but instead a properly defined Currency object that already has its information encapsulated in one place.
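
As a rough sketch, even FormatMoney tightens up (keeping the includeCurrency flag from the original; the body is illustrative):

public static string FormatMoney(decimal amount, Currency currency, bool includeCurrency)
{
    // no lookup dictionary needed: the decimal places travel with the Currency
    string formatted = amount.ToString("N" + currency.DecimalPlaces);
    return includeCurrency ? currency.Code + " " + formatted : formatted;
}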

There are a few items to note when creating Value Objects:

  • They are immutable
  • Identity is based off their properties (a sketch of this follows below; the full source has the real implementation)
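
A minimal sketch of what that property-based equality might look like for Currency (the 397 multiplier is just a common hashing convention, not taken from the post's source):

public bool Equals(Currency other)
{
	if (ReferenceEquals(other, null)) return false;
	if (ReferenceEquals(other, this)) return true;
	// identity comes purely from the properties, never the reference
	return Code == other.Code
		&& DecimalPlaces == other.DecimalPlaces
		&& Description == other.Description;
}

public override bool Equals(object obj)
{
	return Equals(obj as Currency);
}

public override int GetHashCode()
{
	unchecked
	{
		int hash = Code != null ? Code.GetHashCode() : 0;
		hash = (hash * 397) ^ DecimalPlaces;
		hash = (hash * 397) ^ (Description != null ? Description.GetHashCode() : 0);
		return hash;
	}
}

With Currency in place we can introduce Money itself:
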
public class Money : IEquatable<Money>
{	
	public Money(decimal amount, Currency currency)
	{
		Amount = amount;
		Currency = currency;
	}

	public decimal Amount { get; private set; }
	public Currency Currency { get; private set; }

// ... omitted rest of implementation

	public override string ToString()
	{
		return string.Format("{0} {1}", Currency, FormatAmount());
	}

	private string FormatAmount()
	{
		// truncate (not round) to the currency's decimal places before formatting
		decimal val = Amount * (decimal)Math.Pow(10, Currency.DecimalPlaces);
		val = Math.Truncate(val);
		val = val / (decimal)Math.Pow(10, Currency.DecimalPlaces);
		return string.Format("{0:N" + Math.Abs(Currency.DecimalPlaces) + "}", val);
	}
}

We can see that the Money object has to be supplied a Currency at creation, and that it now has the smarts to know how to format itself correctly using the details of its Currency. If you wanted more control over the formatting you could provide an overload that takes in a formatter object.
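
For example (IMoneyFormatter is hypothetical and not part of the published source):

public interface IMoneyFormatter
{
	string Format(Money money);
}

// overload on Money that hands the formatting decision to the caller
public string ToString(IMoneyFormatter formatter)
{
	return formatter.Format(this);
}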

With the changes made above we have nearly made the MoneyUtil class redundant. There is however still the operation of converting existing Money to Money in another Currency (in this case GBP). This operation is a prime example of where a Domain Service is a good fit, capturing the concept of converting Money from one Currency to another.

First we can define an interface to represent the operation.

public interface ICurrencyConverter
{
    Money ConvertTo(Money from);
}

This captures the operation using the language of the domain, and we're already getting the benefit of our Money and Currency objects by eliminating primitive obsession: we're no longer passing strings and decimals but fully realised domain concepts.

Next we can implement a CurrencyConverter to convert to a specific Currency.

public class ToCurrencyConverter : ICurrencyConverter
{
	private readonly Currency _toCurrency;
	private readonly IDictionary<Currency, decimal> _rates;

	public ToCurrencyConverter(Currency toCurrency, IDictionary<Currency, decimal> rates)
	{
		_toCurrency = toCurrency;
		_rates = rates;
	}

	public Money ConvertTo(Money from)
	{
		if (!_rates.ContainsKey(from.Currency))
		{
			throw new InvalidOperationException(string.Format("Could not find rate from currency: {0} to: {1}", from.Currency, _toCurrency));
		}

		decimal rate = _rates[from.Currency];
		return new Money(from.Amount * rate, _toCurrency);
	}
}

On creation we provide the converter with the Currency we want to convert to and the rates that should be used. These would typically be populated daily and then cached for faster access, although such concerns sit way outside the domain model, which does not care how the rates are provided.
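
For comparison with the earlier MoneyUtil calls, consumption might look like this (the Currency instances and rates are illustrative):

var gbp = new Currency("GBP", 2, "Pound Sterling");
var usd = new Currency("USD", 2, "US Dollar");
var rates = new Dictionary<Currency, decimal> { { usd, 0.79m } };

ICurrencyConverter converter = new ToCurrencyConverter(gbp, rates);
Money inGBP = converter.ConvertTo(new Money(150.00m, usd));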

With this final piece of the puzzle complete we can take joy in removing our MoneyUtil class, leaving a much richer domain model in its place.

I have published code to go along with this post, which can be found at https://github.com/stavinski/joasd-domainconcepts

Can’t we do better than null?

So you want to retrieve an aggregate (assuming DDD) from a repository backed by some data store, so you make a call not unlike the following:

Product product = productRepository.FindById(id);

You then go to execute some method on the Product aggregate and boom, the dreaded NRE (Null Reference Exception). Great. Now we could just reach for the tried and tested != null check, however I want to explore different ways we can deal with values that may or may not be present, make this more explicit, and give the consumer a better experience.

Is it an exceptional circumstance?

I guess the first question we could ask is whether it would be considered exceptional that we cannot find the aggregate we are looking for. In the example above we are using the id to find the aggregate, which is about as specific as we can get, and in most cases the id would have been assigned against a selectable item, so not being able to retrieve the aggregate would likely be down to one of:

  1. Infrastructure issue in which case you would get an exception thrown at the infrastructure level
  2. The id has been tampered with to make it invalid or non accessible
  3. The aggregate has been removed

Now a case could be made that these are pretty exceptional conditions. However, if we change it so that we are trying to retrieve an aggregate based off a natural id, for instance a SKU, then it becomes a non-exceptional circumstance: it could just be that we have not yet assigned the SKU to the product, or that it has been assigned incorrectly.

I also find that from the consumer's point of view you're in the same boat as with the null return, except in this case you will get a custom exception thrown at you, which is slightly better (checked exceptions in Java would at least make it explicit that it could throw).
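
For illustration, the exception-based contract might be consumed like this (ProductNotFoundException is a hypothetical name):

try
{
    Product product = productRepository.FindById(id); // throws if the aggregate is missing
    // use product
}
catch (ProductNotFoundException)
{
    // handle the missing aggregate
}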

Return result with output

The next way we could make it more explicit to the caller is by changing the signature to this:

bool FindById(ProductId id, out Product product);

This goes back to the C days: if you check out some of the WinAPI you would typically call a function that returns the result of the call and assigns a value to a pointer passed in. Thankfully the C# version is a bit nicer, and from a consumer point of view it is explicit, but it is generally not a nice API to expose:

Product product = null;
bool found = productRepository.FindById(id, out product);

Personally I have never liked using out parameters, mainly due to having to declare the output parameter(s) before making the call.

Null Object

Another option is to have the repository return a Null Object if it cannot find the aggregate. I don't think this option is suitable for a repository, as aggregates tend to be held in much higher regard than optional objects, however this strategy can work well in other scenarios where you can provide an object with sensible default behaviour.
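
For example, behind an abstraction like logging, "do nothing" is a perfectly valid default (an illustrative sketch, not from the post):

public interface ILogger
{
    void Log(string message);
}

public class NullLogger : ILogger
{
    // doing nothing is legitimate behaviour here, so consumers never need a null check
    public void Log(string message) { }
}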

Why returning collections causes no problem

If we look at another method on our Product repository:

IList<Product> FindByCategory(ProductCategoryId productCategoryId);

When we use this method and it doesn't find any Products for the category id supplied, we don't have the same problem as with FindById; instead we just get an empty list (there's a special place in hell reserved for people who return null out of these types of methods), so our consumer does not need to do anything differently:

var foundProducts = productRepository.FindByCategory(productCategoryId);
foreach (var product in foundProducts)
{
    // do stuff here
}

The key difference between this signature and FindById is that the actual products are now housed inside a container (in this case an implementation of IList<Product>); in order to get to a Product we have to go through the container.

This is one area where functional languages have the edge: they don't have the concept of a null, instead you have to be explicit by using a Maybe/Option, and the consumer is forced to handle both cases (made easier by pattern matching).

What if we took a similar approach for returning our single Product?

Using a Tuple

The simplest way we could achieve this is by using a Tuple:

Tuple<bool, Product> FindById(ProductId id);

Now we are returned a result that acts as a container and gives us more context; as a consumer we can use it like this:

var result = productRepository.FindById(id);
if (result.Item1)
{
    // product was found
}
else
{
    // product not found
}

Now the consumer has a bit more to go on with regard to how the repository behaves, and could probably assume that the first item corresponds to whether a Product has been found. As mentioned in the last section, functional languages have pattern matching, which makes this a bit more explicit; here is an example of what C# could look like if it had pattern matching:

var result = productRepository.FindById(id);
switch (result)
{
    case (false, _):
        // product not found
    case (true, product):
        // product found
}

Query Result

The Tuple approach is a step in the right direction, however it's not very expressive. What if we were to create our own object that represents the result of a query? Ideally, as a consumer, we could then use it like this:

QueryResult<Product> result = productRepository.FindById(id);
if (result.HasResult)
{
    var product = result.Result;
    // product found
}
else
{
    // product not found
}

// or if you're feeling functional
result.Found(product => { /* product found */ });
result.Missing(() => { /* product not found */ });

I have shown 2 different APIs depending on the consumer's preference. We have taken the approach of returning a container, so the consumer must go through it in order to get to the Product; this makes it really explicit how you deal with the result, with no surprises.
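
A minimal sketch of what such a container might look like, matching the usage above (the ForResult/Empty factory names are assumptions; the linked repo has the real implementation):

public class QueryResult<T> where T : class
{
    private readonly T _result;

    private QueryResult(T result)
    {
        _result = result;
    }

    public static QueryResult<T> ForResult(T result)
    {
        return new QueryResult<T>(result);
    }

    public static QueryResult<T> Empty()
    {
        return new QueryResult<T>(null);
    }

    public bool HasResult
    {
        get { return _result != null; }
    }

    public T Result
    {
        get
        {
            // force the consumer to check HasResult rather than silently returning null
            if (!HasResult)
                throw new InvalidOperationException("No result, check HasResult first.");
            return _result;
        }
    }

    public void Found(Action<T> action)
    {
        if (HasResult) action(_result);
    }

    public void Missing(Action action)
    {
        if (!HasResult) action();
    }
}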

I have pushed up a working version of this object to https://github.com/stavinski/joasd-queryresult so feel free to take it and do what you want with it.

How to deploy a compiled iOS static library that uses other libraries

I recently ran into an issue that had me stumped for quite some time and wanted to capture the solution here, both to help anyone in a similar situation and for future reference (or so someone can provide a comment to let me know I'm doing it all wrong and point me in the right direction!).

What I wanted to achieve was to create a compiled static library that could then be published to CocoaPods for easy use. There are plenty of resources that describe how to create a static library that uses other libraries and deploy it as source (one of the better ones I was using was at sigmapoint), however none describe how to deploy it as a compiled library, hence this post.

After working my way through the posts above I had my library with a Podfile that listed my dependency:

pod 'AFNetworking', '~> 2.2'

And had set my MyLib.podspec to have the listed dependency:

s.dependency 'AFNetworking', '~> 2.2'

Then in my app I added MyLib to the Podfile:

pod 'MyLib', :path => '../pods'

And then I came across around 270 duplicate symbol errors, sigh.

My reasoning as to why this was happening: for MyLib, CocoaPods compiles the AFNetworking code into libPods.a, which is then used to produce MyLib.a. When the app is compiled, it too has the reference to AFNetworking and also tries to compile it into its own libPods.a; however, as the code is already in MyLib.a, it runs into duplicate symbol errors.

So in order to solve this I did the following:

  1. Changed MyLib so it did not use a Podfile, and instead brought down just the header files for AFNetworking for the version I'm targeting and added them to the header search path config
  2. Removed the binary link to libPods.a from MyLib
  3. Did a pod update against my app and hey presto, it built successfully!

What this does is allow MyLib to compile against the header files of AFNetworking without compiling any of the concrete AFNetworking code into MyLib; AFNetworking only gets compiled in at the app compilation stage.

The importance of performing spike solutions

Firstly, we should discuss what a spike solution is. According to The Art of Agile Development [Shore & Warden 2007]:

A spike solution, or spike, is a technical investigation. It’s a small experiment to research the answer to a problem.

The Extreme Programming site defines them as:

A spike solution is a very simple program to explore potential solutions. Build the spike to only address the problem under examination and ignore all other concerns.

Spikes

These two definitions line up perfectly and also go on to present one major point that must be made before continuing:

Spike/Spike solutions should never be committed into the main codebase

They should be treated as throwaway code; once you have your answer it has served its purpose (if you must have the code under source control, make sure it's completely separate from the main trunk!).

I also want to clarify how I see the difference between spike solutions and prototyping, which can appear very similar. Prototyping usually encompasses a much larger goal, such as putting some quick static front-end screens together to gauge the UX, whereas a spike solution helps to answer a specific technical question, such as whether EF will be able to map to our legacy Users table correctly. This means that spike solutions should require a lot less time to complete, and we should probably time-box how long we spend on a particular spike to ensure this.

Now that we have defined what a spike solution is, I want to go through a cut-down real-world example (with names changed) that demonstrates their effectiveness when certain situations arise.

XMHELL

Back Story

Fubar DIY Stores Ltd has an e-commerce site that lists all the products available to buy. When a product is displayed, its reviews are also shown and new ones can be submitted. This was previously managed in house, but they have now decided to use a popular global third-party service, Haveyoursay, who have assured them they can provide a like-for-like match for the data currently stored.

In order to bring in the reviews from Haveyoursay, twice a day the reviews will be exported as an XML file. We have been tasked with using this exported XML file to import the reviews into the Fubar DIY Stores database, so that they can then be retrieved along with the product data.

Making a start

For this example we will only be concentrating on the import of the XML file and ignoring the subsequent steps.

We have completed a rough architecture of all the moving parts: we roughly know which objects we need and how they need to collaborate with each other. Essentially there will be a coordinating object that knows which steps need to be performed in which order; this will be the ReviewImportCoordinator, which has a method that starts the import, PerformImport, taking in the XML file path.

We also know that we need an object for the ReviewImportCoordinator to collaborate with that reads the XML file and brings back a trusty XmlDocument, which we can then use to parse the data we need and save it to the DB.

So we start writing our unit tests first for ReviewImportCoordinator and stub out IXmlReviewFileReader, which has a method that takes the import file path and returns our XmlDocument object, and we continue with our unit testing of the import process.
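
Roughly the shape being described (PerformImport and IXmlReviewFileReader come from the prose; the Read method name is an assumption):

public interface IXmlReviewFileReader
{
    XmlDocument Read(string importFilePath);
}

public class ReviewImportCoordinator
{
    private readonly IXmlReviewFileReader _reader;

    public ReviewImportCoordinator(IXmlReviewFileReader reader)
    {
        _reader = reader;
    }

    public void PerformImport(string importFilePath)
    {
        XmlDocument reviewsDocument = _reader.Read(importFilePath);
        // parse the reviews out of the document and save them to the DB...
    }
}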

Setting ourselves up for a fall

It seems we're doing everything right: we have broken the responsibilities up into separate objects and are using TDD/BDD against our import process. However, we have jumped the gun here and are making some big assumptions about how we go about reading the XML file, which will have an impact on how our ReviewImportCoordinator does its work.

Just enough design

This is where people new to agile get it wrong and jump straight into the code rather than doing some design up front. Agile does not tell you to do no design; it tells us not to do big design up front, where we try to guess everything about the system before any code is written.

Our first task should be to get a copy of the XML file, as this will get rid of our assumptions about how to handle the XML import. So after chatting to the stakeholder we get a copy of the XML file, and good job we did, as we hit a potential hurdle: the file is around 450MB. This discovery should start us asking questions:

  • Is this a one off or are they all going to be around this size?
  • How much memory will be used if we load this into an XML DOM?
  • Will the machines that are running the import be impacted by the extra memory usage?

After asking the stakeholder for some other import files, we find they are also around 450MB, so this seems to be the expected size for each import. Now we can move onto our crucial question: how much memory will be used if we load this into an XML DOM? Until we get the answer we have no way of knowing whether there will be an impact on memory usage.

Time for a spike solution

This is the ideal time to write a spike to discover the answer. So we knock together a very quick console app with some hacked-together code in the Main method that simply loads an XmlDocument using one of our supplied import files as the input, followed by a Console.ReadLine() so that it waits for input, allowing us to open up Task Manager and discover how much memory the process is using (we just need a ballpark figure, otherwise we could use some profiling tools to get more insight).

static void Main(string[] args)
{
    var document = new XmlDocument();
    document.Load(@"c:\haveyoursay\imports\import.xml"); // Load is an instance method
    Console.ReadLine(); // keep the process alive so we can inspect its memory usage
}

Getting Feedback

After we run our spike solution we find that the process is using around 1GB of memory to load the import XML into a DOM. We now have a concrete number that we can take back to our stakeholder in order to find out what impact this will have on the machine running the import.

After discussing with the stakeholder, it turns out this machine is already being used to perform other jobs and will suffer badly from having that much memory taken away from them. So we have to look at streaming the XML file rather than loading it into a DOM, which means using XmlReader rather than XmlDocument; we can now start to unit test using this knowledge and head down the right path from the start.
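
A minimal sketch of the streaming approach (the review element name is illustrative):

using (var reader = XmlReader.Create(@"c:\haveyoursay\imports\import.xml"))
{
    while (reader.Read())
    {
        // handle one <review> element at a time instead of holding the whole DOM in memory
        if (reader.NodeType == XmlNodeType.Element && reader.Name == "review")
        {
            // pull out the fields we need and save the review to the DB...
        }
    }
}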

Summary

I hope this demonstrates how we can use spike solutions, with a little design up front, to help steer us in the right direction. This example was done at the time of implementation, but you can also use spike solutions as part of estimating when you have to use an unfamiliar technology, library, protocol etc. They give you a quick way of gauging how difficult certain tasks are, giving you a bit more confidence in your estimates.

So next time you're stuck on a technical question, a spike solution could be just what you're after!

Functional Programming & DSLs Part 3

Tidying up our DSL

In the last part we added filtering of results via the where function and showed how we can compose functions to enable expressive and re-usable criteria, but we were left with a somewhat awkward way of calling our functions together. What we need is a function to stitch them together.

Creating query

The 2 functions we have created so far (select & where), when executed, return another function that then accepts an array of objects (in our example, user objects). Essentially this allows us to configure how we want the functions to behave before sending them the actual users. This is good because it means we can treat each function as abiding by the same contract: it accepts an array of objects and returns an array of objects. Our query function therefore simply needs to call each function in turn and pass the results from one to the next[1].

//+ query :: ([a], [(fun [a] -> [b])]) -> [b]
var query = function (items /*, funs */) {
  var funs = _.rest(arguments);
  return _.reduce(funs, function (results, fun) {
    return fun(results);  
  }, items);
};

The first argument is our array of objects; after that come the functions that will be called with the array of objects. We are using the _.reduce function, like we did in the all function in the previous part, to call each function in turn and capture the results of every call to feed into the next. It can then be called:

query(users,
      where (all (female, overEighteen)),
      select (id, dob, gender));
//=> [Object]

This is a lot easier to consume and is very close to our ideal DSL. There is still some noise it would be great to get rid of, namely parentheses and the semi-colon at the end; let's see how we can improve this.

Enter CoffeeScript

CoffeeScript is a language that compiles to JS. It has a ton of features and also cuts down massively on the amount of noise compared to standard JS (at present[2]); we can write our expression above as:

query users, 
      where(all female, overEighteen) 
      select id, dob, gender
//=> [Object]

I could almost get rid of all the parentheses, but found that if I removed them from the where it short-circuited and did not execute the remaining functions. I think this is because it struggles to parse the all function, so you could rewrite it to this if you were really keen:

criteria = all female, overEighteen
query users, 
      where criteria
      select id, dob, gender
//=> [Object]

This is probably as close as we can get to our ideal DSL for querying and I think it’s pretty damn close!

var results = query users                
              where all female, overEighteen
              select id, dob, gender

Extending our DSL

Now we have our building blocks we can start to extend the language of our DSL. In this example I'm going to demonstrate how we can add ordering into our query, so that we can write the following:

query users, 
      where(all male, overEighteen)
      orderBy dob
      select id, dob, gender
//=> [Fri Aug 23 1957 00:00:00 GMT+0100 (GMT Daylight Time), Mon Apr 02 1979 00:00:00 GMT+0100 (GMT Daylight Time), Wed Feb 01 1984 00:00:00 GMT+0000 (GMT Standard Time)]

To implement orderBy we can use the _.sortBy function, which takes in an array of objects and can sort either on a provided function or on a string name for a property. We can use the second approach here and reuse the functions from the select function that return hardcoded strings:

var sortBy = reverseArgs(_.sortBy);

//+ orderBy :: (fun -> string) -> [a] -> [a]
var orderBy = function (prop) {
  return _.partial(sortBy, prop());      
};

We partially apply over the sortBy function and pass it the string returned from the call to the supplied prop function.
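
Used standalone it looks like this (assuming the dob property function from part 1; output elided):

var byDob = orderBy(dob);
byDob(users);
//=> the users array sorted by date of birth, oldest first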

Closing Thoughts

I hope this has been a helpful demonstration of how we can use FP with JS (and some CS to get a cleaner syntax) to enable the creation of DSLs. I know some may be thinking that this seems like quite a bit of work to get going, however I think the following need to be taken into account:

  • A lot of the functions we ended up writing are general purpose and will either be re-used in other areas or are already provided by other FP libraries; functions like reverseArgs would not even be needed if underscore had its arguments geared up for partial application, and prop is going to be useful in lots of other places.
  • Like I demonstrated in part 1, this query DSL is not specific to any particular object type and can be applied to any objects; this is in contrast to OOP, which tends to be built with specific types to work against
  • The actual code needed, when lumped together, is 84 lines, which is not a great amount[3] considering the nice maintainable DSL that you're left with

The final code can be found on jsfiddle


  1. You could also call this function pipeline but query fits closer to the querying domain
  2. If you look at what's in the pipeline for JS, a lot of the features CS offers will be added directly to JS; also, after spending time looking at FP with JS, some of the features offered by CS become less in demand (array comprehensions, for example, if you have map)
  3. I would imagine that using some other FP libraries, and with someone more experienced with FP than I am, you could probably halve this :)

Functional Programming & DSLs Part 2

Previously…

We managed to get a select function that enabled us to work towards our ideal querying DSL. We also found that although in this example we are using user objects, the select function can be used against any object type; this demonstrates that in FP a lot more emphasis is put on functions working against simple data structures, so objects are not associated with particular types but become simple datasets.

Moving on to where

The next step is to see how we can restrict results, if we re-acquaint ourselves with our ideal DSL for querying it looked like this:

var results = query users                
              where all male, overEighteen
              select id, dob, gender

To start with if we simplify the where function so that it deals with a single predicate[1] we end up with:

//+ where :: (fun a -> boolean) -> [a] -> [a]
var where = function (pred) {
  return function (items) {
    return _.filter(items, pred);      
  };
};

We utilise the _.filter function which, as you can guess, takes an array of items and a predicate and returns a new array of the items that pass the predicate supplied. As with the other underscore functions we are having to write more code because of the argument ordering, so I will use the reverseArgs function I created in part 1 to solve this:

var filter = reverseArgs(_.filter);

var where = function (pred) {
  return _.partial(filter, pred);
};

Our new where can partially apply over the predicate supplied and then we just need the items to be provided. This can now be called like so:

var male = where(function (u) { return u.gender === 'm'; });
_.first(male(users));
//=> {id: 1, username: "jbloggs", dob: Wed Feb 01 1984 00:00:00 GMT+0000 (GMT Standard Time), displayName: "Joe Bloggs", gender: "m"…}

This works well for a single predicate, however we want to be able to specify multiple predicates to filter on any number of criteria, and this is where all comes into play:

//+ all :: [(fun a -> boolean)] -> a -> boolean
var all = function (/* preds */) {
  var preds = _.toArray(arguments);
  return function (item) {
    return _.reduce(preds, function (result, next) {
      return result && next(item);    
    }, true);
  };
};
Breaking this down:

  1. It accepts a number of predicates that will be used to test an individual item against
  2. A closure is returned that accepts a single item
  3. We use the _.reduce function to enumerate our predicates; the first thing we check is that we are still returning true for the current item, and if so we check the item with the next predicate[2]
  4. The end result will either be true or false depending on whether the item met all the predicate conditions

I have a feeling this can be expressed in a more succinct way, and would be happy for anyone to demonstrate it in the comments :) Let's take it for a test drive:

var male = function (u) { return u.gender === 'm'; };
var firstnameIsJuan = function (u) { return (/^juan/i).test(u.displayName); };
var criteria = all (male, firstnameIsJuan);
_.map(users, criteria);
//=> [false, false, false, true, false]

Although a bit convoluted, this demonstrates that the 4th user in our array is correctly being identified as male and with a first name of 'Juan'. Because this conforms to a predicate, we can now plug it into our where function:

var results = where(all (male, firstnameIsJuan));
_.first(results(users));
//=> {id: 4, username: "jfranco", dob: Mon Apr 02 1979 00:00:00 GMT+0100 (GMT Daylight Time), displayName: "Juan Franco", gender: "m"…}

The results contain 1 item, which is, as expected, the 4th entry in the users array.

Composing Expressions

One of the nice aspects of this approach is how you can compose expressions to fit the problems we are trying to solve in a nice readable way, while also enabling re-use. Say for starters we have some functions already written to check values:

//+ isMale :: string -> boolean
var isMale = function (gender) { return gender === 'm'; };

//+ checkAgeAgainstDate :: (number, date) -> boolean
var checkAgeAgainstDate = function (age, date) {
  return new Date(
          new Date().setFullYear(
              new Date().getFullYear() - age)) > date; 
};

At the moment these functions take in primitive values, while we are using user objects that have these primitive values inside them. Our first try at re-using the functions above may look like this:

//+ male :: a -> boolean
var male = function (u) { 
  return isMale(u.gender); 
};

//+ overEighteen :: a -> boolean
var overEighteen = function (u) {
  return checkAgeAgainstDate(18, u.dob);
};

This works, but we're having to create brand new functions just to wrap up the call to the helper functions. Notice that in both cases all I'm doing is extracting the correct property from the user object; if I can wrap this action up in a function, I can then compose this new function against the corresponding helper function:

//+ prop :: (string, a) -> b
var prop = function (prop, obj) { return obj[prop]; };

It actually ends up being a really simple function thanks to the dynamic nature of JS. We can now create some functions that partially apply over the correct property:

//+ getDob :: a -> date
var getDob = _.partial(prop, "dob");

//+ getGender :: a -> string
var getGender = _.partial(prop, "gender");

I have used a get* prefix so as not to confuse these with the existing functions we use for selecting properties in the select function we saw in part 1[3]. Now we use these to compose against:

//+ male :: a -> boolean
var male = _.compose(isMale, getGender);

//+ overEighteen :: a -> boolean
var overEighteen = _.compose(_.partial(checkAgeAgainstDate, 18), getDob);

The _.compose function reads from right to left and makes a new function, as you'd expect, that passes the result of getGender into the isMale function (i.e. male(user) is equivalent to isMale(getGender(user)))[4]. The overEighteen function is a little different, as I'm using partial application again to bind the checkAgeAgainstDate function to use 18 for the age argument; I have inlined it here, but if you were going to use an over-18 check in other places you could assign it to a new function.

Another example of where composition comes in handy can be demonstrated if we want to return all the female users; typically this would be the approach:

//+ isFemale :: string -> boolean
var isFemale = function (gender) { return gender !== 'm'; };

However we can utilise the existing isMale function: an isFemale function is basically its opposite:

//+ not :: boolean -> boolean
var not = function (val) { return !val; };

//+ isFemale :: string -> boolean
var isFemale = _.compose(not, isMale);

First we define a very handy function, not, that returns the opposite of any value passed to it. Using this we can compose the isMale function against not, and we can now define and use our female function for querying against:

//+ female :: a -> boolean
var female = _.compose(isFemale, getGender);
var results = where(all (female));
results(users).length;
//=> 2

We now have our where and select functions available, but using them together at the moment is quite cumbersome, having to resort to _.compose:

var query = _.compose(select(id, dob), where(all (male)));

In the next part we will look at how to stitch them together and look at how we can get a nice looking DSL.


  1. A predicate in this context is a function that takes a value and returns true or false depending on whether it meets certain criteria
  2. This allows us to short circuit the predicate check using the && operator
  3. I could have also scoped them into an object:
    var userprops = {
      dob: _.partial(prop, "dob"),
      gender: _.partial(prop, "gender")
    }
    var male = _.compose(isMale, userprops.gender);
    
  4. If you want to find out more about compose I did a post on it and also its cousin sequence

Functional Programming & DSLs Part 1

Introduction

I explored DSLs (Domain Specific Languages) many moons ago; at that time I was using Boo (a Python-based .NET language) after reading Ayende's DSL book. It was a great choice, as the language offers a lot of powerful capabilities way beyond C#; in the end I put together a DSL for log4net configuration.

While working my way through FP, the topic of creating DSLs has stood out for me. When you go down the route of FP you essentially build up a DSL around the problems you're trying to solve, and it's surprising what expressiveness can be achieved using FP techniques. What I wanted to do was see how close I could get to a perfect DSL using FP.

Querying

If we take querying for data as our example[1], first we need some data:

var users = [
  { id: 1, username: 'jbloggs', dob: new Date(1984,1,1), displayName: 'Joe Bloggs', gender: 'm' },  
  { id: 2, username: 'zball', dob: new Date(1964,2,16), displayName: 'Zoe Ball', gender: 'f' },
  { id: 3, username: 'jsmith', dob: new Date(1957,7,23), displayName: 'John Smith', gender: 'm' },
  { id: 4, username: 'jfranco', dob: new Date(1979,3,2), displayName: 'Juan Franco', gender: 'm' },
  { id: 5, username: 'scastle', dob: new Date(1998,10,11), displayName: 'Susan Castle', gender: 'f' }
];

Now say we wanted to query from this data the users that are male and over eighteen, returning only the id, dob and gender; the following would be pretty close to a perfect way of expressing this requirement:

var results = query users                
              where all male, overEighteen
              select id, dob, gender             

If we were to express this in an OOP way we would probably be going down the route of a query object:

var results = new UserQuery()
                .gender('m')
                .ageGreaterThan(18)
                .query(users)
                .map(function (user) { return { id: user.id, dob: user.dob, gender: user.gender }; });

And of course you could tweak the Query object and add some objects to help make the querying more adaptable and less hardcoded:

var results = new Query()
                .where(new Equal('gender', 'm'))
                .where(new GreaterThan('dob', date.minusYears(18)))
                .restrict(['id', 'dob', 'gender'])
                .query(users);

But there's a lot of ceremony going on there, with objects getting created in a number of places, and this is a pretty simple query. Let's see how taking an FP approach pans out. Essentially what we have here is a pipeline of actions against a dataset, so let's break down each of the actions to see if we can get as close as possible to the ideal DSL, starting with select:

//+ select :: [string] -> [a] -> [b]
var select = function (/* props */) {
  var props = _.toArray(arguments);  
  return function (items) {
    return _.map(items, function (item) { 
      return _.pick(item, props); 
    });
  };
};

var selectResult = select('id', 'dob')(users);
console.log(selectResult);

To help with some of the functional lifting I'm using underscore. _.toArray and _.map need no explanation; the _.pick function takes an object and a series of strings naming the properties to retrieve from the passed object.

Ok, this isn't so bad. We know that select is going to be used to specify which properties of the dataset we want returned, and being part of the pipeline it is eventually going to be called with the items, so I have used currying to set this up. I think we can do better regarding the properties though: at the moment we are using string literals, and ideally it would be nice to not have to use strings. We could specify some variables to act like "constants":

var id = 'id';
var dob = 'dob';

This works but is incredibly fragile, as JS does not support true constants that cannot be mutated. What if we treated them not as strings but as functions that, when executed, always return the same value? We can use the _.partial and _.identity functions to achieve this:

var id = _.partial(_.identity, 'id');
var dob = _.partial(_.identity, 'dob');

id();
//=> id

_.identity will return the exact same value that is passed to it when called, so I partially apply this function using _.partial, specifying the value I always want to return.[2] We now need to change the select function, because at the moment it is expecting strings to be passed in, not functions:

//+ invoke :: (fun () -> a) -> a
var invoke = function (fun) { return fun(); };
var concatResult = _.map([id, dob], invoke);
concatResult;
//=> ["id", "dob"]

Here I have defined a simple function that takes a function, invokes it and returns the result. We can test this out by using _.map, and as we can see it correctly returns an array of the results of each function call. Now we can use this new function inside select:

//+ select :: [(fun () -> string)] -> [a] -> [b]
var select = function (/* props */) {
  var props = _.map(_.toArray(arguments), invoke);
  return function (items) {
    return _.map(items, function (item) { 
      return _.pick(item, props); 
    });
  };
};

You can see that props has now been changed so that it is a map of invoke over each argument passed in. The next thing we can do is get rid of the need for the anonymous function used in the map; the only reason we have it there now is to capture the item argument that is then passed to the _.pick function. What we can do is get rid of this middle step and have the map pass the item straight through. At present this is tricky, because we can't partially apply _.pick when it takes the item as the first argument[3], so let's first create a reusable function to allow this:

//+ pick :: [string] -> a -> b
var pick = function (props, item) {
  return _.pick(item, props);  
};

First we create a simple function that swaps the arguments and delegates to the underscore _.pick function.

//+ select :: [(fun () -> string)] -> [a] -> [b]
var select = function (/* props */) {
  var props = _.map(_.toArray(arguments), invoke)
    , picker = _.partial(pick, props);
  return function (items) { 
    return _.map(items, picker); 
  };
};

Next we partially apply over the new pick function, supplying the properties we want to pick; this means we can remove the anonymous function. One thing that might stand out is that we have a similar situation with the returned function: we are simply using the items argument to pass onto the _.map function, and if we could do the same thing we could return a partially applied map. This arguments issue is becoming quite a nuisance, so instead of having to create a new function from scratch, let's create a reverseArgs[4] function that will return the same function but with its arguments reversed when called:

//+ reverseArgs :: fun -> fun
var reverseArgs = function (fun) {
  return function (/* args */) {
    var args = _.toArray(arguments);
    args.reverse();
    return fun.apply(null, args);  
  };
};

//+ map :: (fun a -> b), [a] -> [b]
var map = reverseArgs(_.map);

map follows the same strategy as pick, in that it swaps the arguments over and delegates; this now opens the door for us to partially apply:

//+ select :: [(fun () -> string)] -> [a] -> [b]
var select = function (/* props */) {
  var props = _.map(_.toArray(arguments), invoke)
    , picker = _.partial(pick, props);
  return _.partial(map, picker);
};

We have managed to remove the anonymous functions completely and instead allow functions to call other functions without us having to step in between. It now becomes clear what select performs:

  • Takes a number of arguments that are expected to be functions that return string values that will be properties of an object
  • Returns a function that, when called with an array of items, will return an array of objects with only the specified properties populated

We can now start to use this select function:

var selectUsers = select(id, dob);
_.first(selectUsers(users));
//=> {id: 1, dob: Wed Feb 01 1984 00:00:00 GMT+0000 (GMT Standard Time)}

One thing that is really good about this is that it is in no way tied to a user object, so we can use it for any object type:

var cars = [
    { reg: 'GNS 453A', make: 'Mitsubishi', model: 'Colt Ralliart' },
    { reg: 'HJD 112E', make: 'Ford', model: 'Focus RS' },
    { reg: 'JFS 672F', make: 'Vauxhall', model: 'Corsa VXR' }
];
    
var reg = _.partial(_.identity, 'reg');
var make = _.partial(_.identity, 'make');
    
var selectCars = select(reg, make);
_.first(selectCars(cars));
//=> {reg: "GNS 453A", make: "Mitsubishi"}

As long as I abide by the contract of what select needs it works perfectly for any object. In the next part I will look at where and how we can use it to filter results.


  1. Note that we are using a simple array and an object literal (treated like a map) rather than a user object like we would use in OOP.
  2. I know this is also susceptible to someone changing the variable to another function returning a different value; I have not shown it in the code examples, but all of these should be inside their own scope, which should lessen the likelihood of accidental changes.
  3. The argument ordering is one major annoyance with the underscore.js library, as it seems to promote a _.chain-based OOP-style API rather than an FP style of partial application & composition; in fact there is a great video on YouTube by Brian Lonsdorf, which I highly recommend, that goes into this in a lot more detail.
  4. We will also be using this for later parts.